zjunlp / OntoProtein

[ICLR 2022] OntoProtein: Protein Pretraining With Gene Ontology Embedding

Question about computing resource and batch size #24

Closed. jasperhyp closed this issue 1 year ago

jasperhyp commented 1 year ago

Hi,

Thanks for sharing the code. I noticed in your run_pretrain.sh that the batch size for protein-GO and protein MLM is 8, while the batch size for GO-GO is 64. Meanwhile, the number of negative samples per positive sample is 128, or 256 for GO-GO.

(1) Does this mean that in each GO-GO pass, at most (64*2 + 64*256) samples of length at most 128 are fed into the GO encoder in one batch?

(2) How many V100s did you use for this pretraining?

Also, I noticed that you didn't permute proteins for protein-GO relations.

(3) Is this due to computing resource limits (i.e., 8*128 is simply too many proteins to encode)?

(4) Did you experiment with a smaller number of negative samples while also permuting proteins?

Thanks in advance!

Alexzhuan commented 1 year ago

Hi,

(1) In one batch, 64 positive samples and 64 * 128 * 2 negative samples are fed into the GO encoder.
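
To make the arithmetic explicit, here is a minimal sketch (plain Python, not code from the OntoProtein repository) of how the per-batch counts above come about, assuming 128 negatives per positive GO-GO triple with both head and tail corrupted:

```python
# Back-of-the-envelope count of triples per GO-GO batch, using the numbers
# stated in this thread (64 positives per batch, 128 negatives per positive,
# negatives drawn for both head and tail). Illustration only, not repo code.

batch_size = 64          # GO-GO batch size from run_pretrain.sh
neg_per_positive = 128   # negative samples per positive triple
corrupted_sides = 2      # head and tail are both corrupted

positives = batch_size                                        # 64
negatives = batch_size * neg_per_positive * corrupted_sides   # 64 * 128 * 2 = 16384

print(f"positive triples per batch: {positives}")
print(f"negative triples per batch: {negatives}")
print(f"total triples per batch:    {positives + negatives}")  # 16448
```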

(2) We used 4 V100s to pretrain the model.

(3) Due to the limitation of computing resources, we didn't permute proteins for protein-GO relations.

(4) The number of negative samples is an important hyperparameter, but we did not search for the optimal value in this work.

jasperhyp commented 1 year ago

Thank you for providing the information! Just one more question: with that GPU budget and that number of training steps (I saw in run_pretrain.sh that the max step is 60000), how long did it take to train the model? Please don't feel pressured to answer; I completely understand if you don't remember, as it was a long while ago. But if you do, a rough estimate would be very helpful!

Alexzhuan commented 1 year ago

It took about a week to pre-train the model.
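
For anyone else budgeting a run: combining the roughly one-week figure with the 60000-step maximum from run_pretrain.sh gives a rough per-step time. This is only back-of-the-envelope arithmetic based on the numbers in this thread, not a measured value:

```python
# Rough per-step time implied by the replies above: ~60,000 max steps
# (run_pretrain.sh) completed in roughly one week on 4 V100s. The "about a
# week" figure is approximate, so treat the result as a ballpark only.

max_steps = 60_000
seconds_per_week = 7 * 24 * 3600   # 604800

print(f"~{seconds_per_week / max_steps:.0f} s per optimization step")  # ~10 s/step
```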

jasperhyp commented 1 year ago

Thank you for your kind help!