kuleshov-group / caduceus

Bi-Directional Equivariant Long-Range DNA Sequence Modeling
Apache License 2.0

1.9million parameters, 470K parameters Caduceus model query #8

Closed: MarcAmil30 closed this issue 6 months ago

MarcAmil30 commented 6 months ago

In the paper, you ran benchmark tests on the GenomicBenchmarks and Nucleotide Transformer datasets using Caduceus models with 470K and 1.9M parameters. However, only the 7M-parameter model is available on Hugging Face. How did you obtain the 470K and 1.9M parameter models, and are they readily available?

Thanks

yair-schiff commented 6 months ago

Hi @MarcAmil30, we only released the biggest models that we trained, i.e. the 7.7M param one, which was trained for 50k steps.

The two other ones that you mention have the following configurations:

Since these models are smaller and were trained for only 10k steps, we didn't make them public. For reference, on a single node with 8 A5000 GPUs, these training runs took around 70 minutes for the 470k model and around 200 minutes for the 1.9M model. If you are struggling to re-train these, please let me know and we can perhaps coordinate sharing the weights.
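For anyone budgeting a re-train, the figures quoted above (8 GPUs, 10k steps, roughly 70 and 200 minutes) can be turned into approximate GPU-hours and average time per step. This is only back-of-the-envelope arithmetic on those numbers. The helper name `training_cost` is made up for illustration and is not part of the Caduceus codebase, and the GPU-hours figure only carries over to other hardware setups if throughput scales roughly linearly with GPU count, which is an assumption.

```python
# Figures quoted in the reply above: one node with 8 GPUs, 10k steps per run.
NUM_GPUS = 8
TRAIN_STEPS = 10_000
WALL_CLOCK_MINUTES = {"470k": 70, "1.9M": 200}  # approximate wall-clock times


def training_cost(minutes: float, num_gpus: int = NUM_GPUS, steps: int = TRAIN_STEPS) -> dict:
    """Hypothetical helper: approximate GPU-hours and average seconds per step."""
    return {
        "gpu_hours": minutes * num_gpus / 60,     # wall-clock hours x GPU count
        "seconds_per_step": minutes * 60 / steps,  # average over the whole run
    }


for name, minutes in WALL_CLOCK_MINUTES.items():
    cost = training_cost(minutes)
    print(f"{name}: ~{cost['gpu_hours']:.1f} GPU-hours, ~{cost['seconds_per_step']:.2f} s/step")
```

On these numbers, the 470k run works out to a bit over 9 GPU-hours and the 1.9M run to a bit under 27, so both should be feasible to reproduce on a single multi-GPU node.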