HazyResearch / hyena-dna

Official implementation for HyenaDNA, a long-range genomic foundation model built with Hyena
https://arxiv.org/abs/2306.15794
Apache License 2.0

CUDA out of memory when the training sequence length reaches 450k on an A100 #73

Open luoshengtangxiademao opened 4 months ago

luoshengtangxiademao commented 4 months ago

I get a CUDA out-of-memory error on an A100 (80 GB) when the training sequence length reaches 450k. I am using the Hugging Face version, hyenadna-medium-450k-seqlen-hf, for a species classification task. Is the Hugging Face version optimized? It does not appear to use flash-attn.

CHNFTQ commented 3 months ago

I have the same issue when training the hyenadna-medium-450k-seqlen model on DNA sequences of length 256k.