agentmorris / MegaDetector

MegaDetector is an AI model that helps conservation folks spend less time doing boring things with camera trap images.
MIT License

`CUDA error: CUBLAS_STATUS_NOT_SUPPORTED` on Kubuntu 23.04 x86_64 with NVIDIA GeForce RTX 3090 Ti #106

PetervanLunteren closed this issue 1 year ago

PetervanLunteren commented 1 year ago

@agentmorris An EcoAssist user has found a CUDA error while running inference on GPU. See https://github.com/PetervanLunteren/EcoAssist/issues/16 for more information.

I don't think this is an EcoAssist error, since the traceback originates from run_detector_batch.py. From what I've read, this error can occur when the dimensions of the input tensor don't match the dimensions of the nn.Linear module. Any idea what is going on?

The environment is created with the YAML file below.

channels:
  - conda-forge
  - pytorch

dependencies:
  - python=3.8
  # We pin Pillow to make it as likely as possible that images are loaded via a loader that's identical to the training environment
  - Pillow=9.1.0 
  - tqdm
  - jsonpickle
  - humanfriendly
  - numpy
  - matplotlib
  - opencv
  - requests

  # So we can run Jupyter notebooks in this environment
  - nb_conda_kernels
  - ipykernel

  # For running MegaDetector v5
  - pandas
  - seaborn>=0.11.0
  - PyYAML>=5.3.1
  - pytorch::pytorch=1.10.1
  - pytorch::torchvision=0.11.2
  - conda-forge::cudatoolkit=11.3
  - conda-forge::cudnn=8.1

  # For running MegaDetector v4
  # - tensorflow>=2.0  
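
As a quick sanity check on an environment built from this file, the Python sketch below compares the installed PyTorch and torchvision versions against the pins above and reports which CUDA and cuDNN builds PyTorch sees. The helper names (`check_pins`, `cuda_summary`) are illustrative and not part of MegaDetector or EcoAssist. A passing check does not rule out the error in this issue.

```python
import importlib

# Versions pinned in the YAML file above.
EXPECTED = {"torch": "1.10.1", "torchvision": "0.11.2"}


def check_pins(expected=EXPECTED):
    """Return {package: (expected, installed_or_None, matches)} for each pin."""
    report = {}
    for name, want in expected.items():
        try:
            module = importlib.import_module(name)
            have = getattr(module, "__version__", None)
        except (ImportError, OSError):
            # Package missing, or its shared libraries failed to load.
            have = None
        # Installed versions may carry a build suffix such as "+cu113".
        matches = have is not None and have.split("+")[0] == want
        report[name] = (want, have, matches)
    return report


def cuda_summary():
    """Return what PyTorch reports about CUDA, or None if torch can't be imported."""
    try:
        import torch
    except (ImportError, OSError):
        return None
    return {
        "cuda_build": torch.version.cuda,          # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
        "cudnn": torch.backends.cudnn.version(),   # None if cuDNN is unavailable
    }


if __name__ == "__main__":
    for name, (want, have, matches) in check_pins().items():
        print(f"{name}: expected {want}, found {have}, match={matches}")
    print("CUDA summary:", cuda_summary())
```

With the pins above, `cuda_build` would be expected to read `11.3` if the conda-provided toolkit is the one PyTorch was built against.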

agentmorris commented 1 year ago

Thanks for the heads-up. I'm going to close this issue and respond to the issue on the EcoAssist repo.

agentmorris commented 1 year ago

For posterity: this was resolved by running:

export LD_LIBRARY_PATH=''

...prior to starting EcoAssist.
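
The thread does not say why clearing the variable helped. One plausible reading, which is an assumption here, is that a directory on `LD_LIBRARY_PATH` supplied a CUDA library (for example cuBLAS) other than the one conda installed. The Python sketch below lists CUDA-related libraries found on `LD_LIBRARY_PATH` and starts a command with the variable cleared, which has the same effect as the `export` above for that one process. All function names are hypothetical. The actual EcoAssist launch command is not given in this thread, so the caller has to supply it.

```python
import os
import subprocess
from pathlib import Path

# Library name patterns to look for; an illustrative, non-exhaustive list.
CUDA_PATTERNS = ("libcublas*", "libcudart*", "libcudnn*")


def cuda_libs_on_ld_path(env=None):
    """List CUDA-related shared libraries found in LD_LIBRARY_PATH directories."""
    env = os.environ if env is None else env
    hits = []
    for entry in env.get("LD_LIBRARY_PATH", "").split(os.pathsep):
        if not entry:
            continue  # skip empty entries such as those from "a::b"
        directory = Path(entry)
        if not directory.is_dir():
            continue
        for pattern in CUDA_PATTERNS:
            hits.extend(sorted(str(p) for p in directory.glob(pattern)))
    return hits


def env_with_cleared_ld_path(env=None):
    """Return a copy of the environment with LD_LIBRARY_PATH set to ''."""
    cleaned = dict(os.environ if env is None else env)
    cleaned["LD_LIBRARY_PATH"] = ""
    return cleaned


def run_with_cleared_ld_path(command):
    """Run `command` (a list of arguments) with LD_LIBRARY_PATH cleared."""
    return subprocess.run(command, env=env_with_cleared_ld_path(), check=False)


if __name__ == "__main__":
    for lib in cuda_libs_on_ld_path():
        print("found on LD_LIBRARY_PATH:", lib)
```

If the listing shows CUDA libraries outside the conda environment, that would be consistent with the assumption above, though it does not prove it.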