HazyResearch / hyena-dna

Official implementation for HyenaDNA, a long-range genomic foundation model built with Hyena
https://arxiv.org/abs/2306.15794
Apache License 2.0

Pretraining runtimes from the paper #46

Open sgalkina opened 5 months ago

sgalkina commented 5 months ago

Hi! Great work, and also a great YouTube presentation; thanks for making that public.

I have a question about the runtimes. Table A.2 says that pretraining took 80 min for the model with 1.6M parameters. When I pretrain a model with 3.3M parameters (input size 16k, 3 Hyena layers, embedding dim 256) on my own dataset, it takes around 16 hours for a dataset of only 21,000 samples. Is anything wrong with my setup? Could you please specify more explicitly what data went into Table A.2: how many samples, of what sequence length, with what batch size? And if it's possible to tell, what share of the nucleotides of the human genome did the pretrained model (e.g., the one with batch size 32k) end up seeing?

Thank you for the nice work!
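As a back-of-the-envelope sketch of the "share of the genome seen" question, using only the numbers quoted above (the epoch count is a placeholder assumption, not a value from the paper):

```python
# Rough estimate of how many nucleotides one pretraining run sees.
# All inputs are taken from the question above except `epochs`,
# which is a hypothetical assumption.
samples = 21_000          # sequences in the dataset
seq_len = 16_000          # nucleotides (tokens) per sequence
epochs = 1                # placeholder number of passes over the data

tokens_seen = samples * seq_len * epochs
human_genome_bp = 3.1e9   # approximate haploid human genome size

print(f"tokens seen:     {tokens_seen:,}")                      # 336,000,000
print(f"share of genome: {tokens_seen / human_genome_bp:.1%}")  # ~10.8%
```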

exnx commented 4 months ago

In Table A.2 we only report pretraining time for the tiny 2-layer, d_model=256, seq_len=1k model used for the Nucleotide Transformer datasets. In general, the bigger the model or the longer the sequence, the longer the training time. In Section A.1 of the appendix, you'll see that the 1M context length model (our biggest) was trained for 4 weeks.
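To make that scaling concrete, here is a minimal sketch of the relative per-sequence compute of the two settings discussed in this thread. It assumes cost grows linearly with layer count and roughly O(L log L) with sequence length (the FFT convolution in Hyena); d_model is 256 in both settings and cancels out. These scaling assumptions are illustrative, not numbers from the paper:

```python
import math

def relative_cost(n_layer: int, seq_len: int) -> float:
    """Relative training cost per sequence, assuming cost scales
    linearly in depth and O(L log L) in sequence length."""
    return n_layer * seq_len * math.log2(seq_len)

paper = relative_cost(n_layer=2, seq_len=1_000)   # Table A.2 tiny model
mine = relative_cost(n_layer=3, seq_len=16_000)   # setting from the question

print(f"~{mine / paper:.0f}x more compute per sequence")  # roughly 34x
```

This ignores batch size, hardware, and dataset size, which also differ between the two runs, but it shows why a 16k-context model is not comparable to the 1k-context timing in Table A.2.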