can't run inference with atacworks pretrained model within the container on A100

Hi! I'm trying to run this notebook https://github.com/NVIDIA-Genomics-Research/rapids-single-cell-examples/blob/master/notebooks/5k_pbmc_coverage_gpu.ipynb inside the container https://hub.docker.com/r/claraparabricks/single-cell-examples_rapids_cuda11.0 on an A100 GPU.

Everything works until I execute this line:

atacworks_results = coverage.atacworks_denoise(noisy_coverage, model, gpu, interval_size)

which gives this error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<timed exec> in <module>

~/run_singlecell_rapids/rapids-single-cell-examples/notebooks/coverage.py in atacworks_denoise(coverage, model, gpu, interval_size, pad)
    353         input_arr = torch.tensor(input_arr, dtype=float)
    354         input_arr = input_arr.unsqueeze(1)
--> 355         input_arr = input_arr.cuda(gpu, non_blocking=True).float()
    356         # Run model inference
    357         pred = model(input_arr)

RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

I tried to update the PyTorch installation in the container with

conda install pytorch torchvision torchaudio pytorch-cuda=11.6 -c pytorch -c nvidia

but conda hangs for a very long time at the "Solving environment" step. Any ideas? Thanks!
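
One way to narrow this down: "no kernel image is available" usually means the installed PyTorch build was not compiled for the GPU's architecture, and the A100 is compute capability 8.0 (sm_80). Below is a minimal sketch of a check that could be run inside the container. The helper names `parse_arch` and `build_supports_device` are hypothetical, not part of PyTorch or this repo, and the sketch assumes the container's PyTorch is recent enough to provide `torch.cuda.get_arch_list()`.

```python
def parse_arch(arch):
    """Split an arch string such as 'sm_80' or 'compute_70' into (kind, major, minor)."""
    kind, _, code = arch.partition("_")
    digits = "".join(ch for ch in code if ch.isdigit())
    return kind, int(digits[:-1]), int(digits[-1])


def build_supports_device(arch_list, capability):
    """Hypothetical helper: can a build with these archs run on this device?

    Assumes the usual CUDA compatibility rules: sm_XY binaries run on devices
    with the same major version and an equal or higher minor version, while
    compute_XY (PTX) can be JIT-compiled for any equal or newer device.
    """
    dev_major, dev_minor = capability
    for arch in arch_list:
        kind, major, minor = parse_arch(arch)
        if kind == "sm" and major == dev_major and minor <= dev_minor:
            return True
        if kind == "compute" and (major, minor) <= (dev_major, dev_minor):
            return True
    return False


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None

    if torch is None or not torch.cuda.is_available():
        print("PyTorch with a visible CUDA device is required for the live check.")
    else:
        # get_arch_list() may be missing on older builds, hence the getattr guard.
        get_arch_list = getattr(torch.cuda, "get_arch_list", None)
        archs = get_arch_list() if get_arch_list else []
        capability = torch.cuda.get_device_capability(0)
        print("torch version:", torch.__version__)
        print("built against CUDA:", torch.version.cuda)
        print("compiled archs:", archs)
        print("device capability:", capability)
        print("build supports device:", build_supports_device(archs, capability))
```

If the compiled arch list has no `sm_8x` or compatible `compute_` entry, the PyTorch build itself would be the likely culprit rather than the notebook code.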
