HazyResearch / hyena-dna

Official implementation for HyenaDNA, a long-range genomic foundation model built with Hyena
https://arxiv.org/abs/2306.15794
Apache License 2.0

Reproducing the HyenaDNA results on NT Benchmarks #59

Open leannmlindsey opened 6 months ago

leannmlindsey commented 6 months ago

Hello,

I have not been able to reproduce your NT benchmark results by running the tests in my own environment on the NT datasets. I would like to see the parameters that you used, but our CHPC system does not allow Docker; they recommend Apptainer instead. I can run HyenaDNA successfully on my own datasets and on the NT datasets, but I am not getting the results that you report in the paper.

Using the instructions in the Apptainer thread, I have gotten as far as opening a shell inside the container (see the session at the end of this post). However, I cannot find the file mentioned in the instructions: "This will land you inside the /wdr, which has a file named launch_commands_nucleotide_transformer with all the launch commands and (associated hyperparameters) for the 18 Nucleotide Transformer datasets."

Would it be possible for you to release this file (the launch commands and associated hyperparameters) outside of the Docker container, for those of us who have had trouble reproducing your results on HPC systems that do not allow Docker?

Thank you, LeAnn

(A100-3-hyena-dna) [u1323098@notch372:hyena-dna]$ apptainer exec --nv hyena-dna-nt6.sif /bin/bash
INFO:    gocryptfs not found, will not be able to use gocryptfs
INFO:    underlay of /etc/localtime required more than 50 (96) bind mounts
INFO:    underlay of /usr/bin/nvidia-smi required more than 50 (475) bind mounts
13:4: not a valid test operator: (
13:4: not a valid test operator: 550.54.14
Apptainer> ls -lrt
total 17203372
-rw-r--r--  1 u1323098 sundar      11357 Dec  5 09:52 LICENSE
-rw-r--r--  1 u1323098 sundar        407 Dec  5 09:52 Dockerfile
-rw-r--r--  1 u1323098 sundar      35655 Dec  5 09:52 README.md
drwxr-xr-x  2 u1323098 sundar         34 Dec  5 09:52 assets
drwxr-xr-x 13 u1323098 sundar        244 Dec  5 09:53 configs
drwxr-xr-x  3 u1323098 sundar         29 Dec  5 09:53 csrc
drwxr-xr-x  2 u1323098 sundar        155 Dec  5 09:53 evals
-rw-r--r--  1 u1323098 sundar       8622 Dec  5 09:53 huggingface.py
-rw-r--r--  1 u1323098 sundar        530 Dec  5 09:53 requirements.txt
drwxr-xr-x  8 u1323098 sundar        121 Dec  5 09:53 src
-rw-r--r--  1 u1323098 sundar      42633 Dec  5 09:53 standalone_hyenadna.py
-rw-r--r--  1 u1323098 sundar      27561 Dec  5 09:53 train.py
drwxr-xr-x  9 u1323098 sundar       4096 Dec  5 17:06 flash-attention
drwxr-xr-x  2 u1323098 sundar         42 Dec  5 17:07 __pycache__
drwxr-xr-x  4 u1323098 sundar         55 Dec  5 17:17 data
drwxr-xr-x  4 u1323098 sundar         54 Jan 13 13:54 outputs
drwxr-xr-x 10 u1323098 sundar       4096 Jan 14 08:52 wandb
-rwxr-xr-x  1 u1323098 sundar       4279 Mar 13 15:17 finetune_model_test.py
-rwxr-xr-x  1 u1323098 sundar 9645101056 Mar 25 07:28 hyena-dna-nt6.sif
-rwxr-xr-x  1 u1323098 sundar 7970955264 Mar 25 07:29 hyena-dna.sif
Apptainer>
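
A note on the listing above: Apptainer starts the shell in the host's current working directory by default, so this `ls` appears to show the host checkout (it contains the `.sif` images themselves) rather than the container's `/wdr`. One way to check whether the launch-commands file exists in the image is to search by name. The following is a minimal sketch, not something from the repository: the helper name `find_by_prefix` is hypothetical, and it assumes that Python is available inside the container and that the file name begins with `launch_commands_nucleotide_transformer` as quoted in the instructions.

```python
import os


def find_by_prefix(root, prefix):
    """Return sorted paths under `root` whose file name starts with `prefix`."""
    matches = []
    # os.walk skips unreadable directories by default and yields nothing
    # if `root` itself does not exist.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.startswith(prefix):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


if __name__ == "__main__":
    # "/wdr" and the file name are taken from the quoted instructions;
    # whether that directory exists in the Apptainer image is an assumption.
    for path in find_by_prefix("/wdr", "launch_commands_nucleotide_transformer"):
        print(path)
```

If the script is saved in the current directory under a name such as `find_launch_file.py` (also hypothetical), it could be run with `apptainer exec hyena-dna-nt6.sif python find_launch_file.py`. No output would mean the file was not found under `/wdr`; passing `/` as the root would widen the search at the cost of a longer scan.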