kuleshov-group / caduceus

Bi-Directional Equivariant Long-Range DNA Sequence Modeling
Apache License 2.0
137 stars 14 forks

About pretrain GPUS #42

Closed wyhsleep closed 2 weeks ago

wyhsleep commented 3 weeks ago

Dear authors, thank you for your wonderful work. May I ask how many GPUs (A100) you used for pretraining?

Skylion007 commented 2 weeks ago

The most we ever used was 8; most experiments were done on A6000s or other consumer GPUs.

lucaskbobadilla commented 1 week ago

Similar question: how long did it take to pre-train the model using the 8 A100s?

yair-schiff commented 1 week ago

Roughly 7-8 hours