facebookresearch / esm

Evolutionary Scale Modeling (esm): Pretrained language models for proteins
MIT License

CUDA runtime error #107

Closed hury07 closed 3 years ago

hury07 commented 3 years ago


Bug description
RuntimeError: CUDA error: CUBLAS_STATUS_INTERNAL_ERROR when calling cublasCreate(handle)

Environment: PyTorch 1.9.0 (py3.7_cuda10.2_cudnn7.6.5_0). This error did not occur with the previous ESM release (the one before ESM-1v was added).
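(Not part of the original report, but a minimal sanity check like the one below can help tell whether cuBLAS itself fails to initialize in this environment, independent of ESM; the tensor shape is arbitrary.)

```python
import torch

# Report the build versions and confirm the CUDA runtime is visible.
print(torch.__version__, torch.version.cuda, torch.backends.cudnn.version())
print(torch.cuda.is_available(), torch.cuda.get_device_name(0))

# A single matmul on the GPU is enough to force cublasCreate().
# If this line raises CUBLAS_STATUS_INTERNAL_ERROR, the problem lies in the
# environment (driver/toolkit mismatch, exhausted GPU memory, etc.),
# not in the ESM code.
x = torch.randn(8, 8, device="cuda")
print(x @ x)
```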


Logs (command line output):

```
Traceback (most recent call last):
  File "main.py", line 209, in <module>
    main()
  File "main.py", line 204, in main
    explorer.run(landscape, verbose=False)
  File "/home/hury/local/bio/fasthit/fasthit/explorer.py", line 171, in run
    encodings = self._encoder.encode(training_data["sequence"].to_list())
  File "/home/hury/local/bio/fasthit/fasthit/encoders/esm.py", line 99, in encode
    extracted_embeddings[i] = self._embed(temp_seqs, combo_batch)
  File "/home/hury/local/bio/fasthit/fasthit/encoders/esm.py", line 130, in _embed
    toks, repr_layers=[self._pretrained_model.num_layers], return_contacts=False
  File "/home/hury/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/hury/local/bio/fasthit/fasthit/encoders/esm/esm/model.py", line 160, in forward
    x, self_attn_padding_mask=padding_mask, need_head_weights=need_head_weights
  File "/home/hury/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/hury/local/bio/fasthit/fasthit/encoders/esm/esm/modules.py", line 130, in forward
    attn_mask=self_attn_mask,
  File "/home/hury/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/hury/local/bio/fasthit/fasthit/encoders/esm/esm/multihead_attention.py", line 223, in forward
    v_proj_weight=self.v_proj.weight,
  File "/home/hury/.local/lib/python3.7/site-packages/torch/nn/functional.py", line 4702, in multi_head_attention_forward
    q = linear(query, q_proj_weight_non_opt, in_proj_bias[0:embed_dim])
  File "/home/hury/.local/lib/python3.7/site-packages/torch/nn/functional.py", line 1753, in linear
    return torch._C._nn.linear(input, weight, bias)
RuntimeError: CUDA error: CUBLAS_STATUS_INTERNAL_ERROR when calling cublasCreate(handle)
```
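(Not from the original thread, but with asynchronous CUDA errors like this one the reported stack trace can point at the wrong operation; forcing synchronous kernel launches is a common way to get an accurate traceback. A sketch:)

```python
import os

# CUDA kernels launch asynchronously, so the Python traceback may blame a
# later op than the one that actually failed. Setting this variable before
# CUDA is initialized forces synchronous launches (slower, but the traceback
# then points at the real failing call).
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# ...then re-run the failing script in the same process, or export the
# variable in the shell before launching it.
```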



tomsercu commented 3 years ago

Sounds like either something got broken in your environment or, most likely, the wrapper library put things in an unexpected state. One guess: is the input not moved to the CUDA device while the model is, or vice versa? Can you confirm whether or not you can run pytest tests/test_readme.py? (It calls extract.py, which uses the GPU when available.)
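(A minimal sketch, not from the thread, of what that device check looks like with the ESM API from the README, using the README's example sequence; the key point is moving both the model and the token batch to the same device before the forward pass.)

```python
import torch
import esm

# Load a pretrained model and its alphabet (ESM-1b here as an example).
model, alphabet = esm.pretrained.esm1b_t33_650M_UR50S()
batch_converter = alphabet.get_batch_converter()
model.eval()

data = [("protein1", "MKTVRQERLKSIVRILERSKEPVSGAQLAEELSVSRQVIVQDIAYLRSLGYNIVATPRGYVLAGG")]
labels, strs, toks = batch_converter(data)

# A frequent cause of device errors: model on GPU, tokens still on CPU
# (or vice versa). Move both explicitly.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)
toks = toks.to(device)

with torch.no_grad():
    out = model(toks, repr_layers=[33], return_contacts=False)
print(out["representations"][33].shape)
```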

hury07 commented 3 years ago

Thanks a lot! It was indeed something broken in my environment, exactly as you guessed.