Closed: salvatoreloguercio closed this issue 6 months ago
It might be related to CUDA 11.7: see https://discord.com/channels/1125706816479821874/1125706817016696926/1128087480021823518
Were you able to solve this? I run into similar issues, randomly, while doing inference (I cannot access the Discord link, btw):
    return forward_call(*args, **kwargs)
  File "/tmp/.xdg_cache_gbenegas/huggingface/modules/transformers_modules/LongSafari/hyenadna-large-1m-seqlen-hf/8eb99a87c0bbaf0fec9346d72c60360c3a5b9e33/modeling_hyena.py", line 158, in forward
    y = fftconv(x, k, bias)
  File "/tmp/.xdg_cache_gbenegas/huggingface/modules/transformers_modules/LongSafari/hyenadna-large-1m-seqlen-hf/8eb99a87c0bbaf0fec9346d72c60360c3a5b9e33/modeling_hyena.py", line 26, in fftconv
    y = torch.fft.irfft(u_f * k_f, n=fft_size, norm='forward')[..., :seqlen]
RuntimeError: cuFFT error: CUFFT_INTERNAL_ERROR
No, I ended up using the model for fewer iterations, then reloading the image. Just a workaround.
Hello, I am using a slightly modified version of the huggingface.py script to generate embeddings from FASTA files. I am using the largest model (1Mb window size) and running it on an A100 80GB.
I just added a loop at the end of huggingface.py which loads FASTA files and gets embeddings:
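The loop itself was not included in the report; the following is a minimal sketch of what such a loop could look like. The helper names (`read_fasta`, `embed_all`) and the idea of passing the tokenizer+model pipeline in as a callable are assumptions for illustration, not the author's actual code.

```python
# Hypothetical sketch: stream sequences out of FASTA files and embed them
# one at a time, keeping as few references alive as possible per iteration.

def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file (stdlib only)."""
    header, chunks = None, []
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif line:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

def embed_all(paths, embed_fn):
    """Apply an embedding callable (e.g. tokenize + model forward pass +
    move result to CPU) to every sequence in every file."""
    results = {}
    for path in paths:
        for name, seq in read_fasta(path):
            # embed_fn would wrap the HyenaDNA forward pass; moving its
            # output to CPU here avoids holding GPU tensors across iterations.
            results[name] = embed_fn(seq)
    return results
```

In a real run, `embed_fn` would tokenize the sequence, call the model under `torch.inference_mode()`, and return `output.cpu()` so the GPU copy can be freed.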
However, after a few hundred iterations I get the CUFFT error shown above (CUFFT_INTERNAL_ERROR), which seems related to out-of-memory issues.
So I was wondering: is there a way to flush GPU memory between iterations, in order to prevent this kind of error? Thanks!
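One common pattern for this (a sketch, not a guaranteed fix for CUFFT_INTERNAL_ERROR, which can also be a CUDA-version issue rather than a pure OOM) is to drop Python references to the outputs after each iteration, run the garbage collector, and return PyTorch's cached GPU blocks to the driver with `torch.cuda.empty_cache()`:

```python
import gc

def flush_gpu_memory():
    """Best-effort GPU memory flush between inference iterations.

    gc.collect() frees Python-side references to tensors;
    torch.cuda.empty_cache() then returns the caching allocator's
    unused blocks to the driver. torch is imported lazily so this
    sketch also runs on CPU-only machines.
    """
    gc.collect()
    try:
        import torch
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
    except ImportError:
        pass  # no torch installed; nothing GPU-side to flush
```

You would `del` the per-iteration tensors (tokens, model output) before calling this, and run the forward pass under `torch.inference_mode()` so no autograd graph is retained. Note that `empty_cache()` does not free tensors that are still referenced; it only releases cached blocks, so dropping references first is essential.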