Wondering about the training phase (total required training time, etc.)

HannesStark / EquiBind

EquiBind: geometric deep learning for fast predictions of the 3D structure in which a small molecule binds to a protein

MIT License


Wondering about the training phase (total required training time, etc.) #14

Closed by CiaoHe 2 years ago

CiaoHe commented 2 years ago

Very impressive work and clean code! I tried to retrain the whole model with the training settings you recommend (full train and val sets). However, each epoch takes at least 5 minutes (batch size 32, on an A100 GPU), and I was shocked to see that the total number of epochs is 100k. May I ask how long you trained the whole model and how many resources are needed?

Best,

HannesStark commented 2 years ago

Thanks for the kind words!

I got marginal improvements by training for a long time (~5 days), but the validation curves usually almost plateau after ~40 h of training (not that 40 h is a short time either).

The batch size I used for that was 8.

CiaoHe commented 2 years ago

Thanks for your reply. So 5 days of training means the total number of epochs is around 1k?

HannesStark commented 2 years ago

In one of my typical TensorBoard plots, 100 epochs, which is around 200,000 iterations, take 13 h 40 min (batch size 8). Sorry for the delayed reply!
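For anyone who wants to turn these figures into a rough epoch budget, here is a minimal sketch of the arithmetic. It uses only the numbers quoted in this thread and assumes that training time scales linearly with the number of epochs on the same (unspecified) hardware and that every batch is full; the constant and function names are made up for illustration and do not come from the EquiBind code.

```python
# Figures quoted in the thread (batch size 8).
EPOCHS_MEASURED = 100
ITERATIONS_MEASURED = 200_000      # "around 200,000", so treat as approximate
MINUTES_MEASURED = 13 * 60 + 40    # 13 h 40 min
BATCH_SIZE = 8


def epochs_in(hours: float) -> float:
    """Linearly extrapolate how many epochs fit into `hours` of training."""
    return hours * 60 * EPOCHS_MEASURED / MINUTES_MEASURED


minutes_per_epoch = MINUTES_MEASURED / EPOCHS_MEASURED
iterations_per_epoch = ITERATIONS_MEASURED // EPOCHS_MEASURED
# Assumption: every batch is full, so this is only an approximate sample count.
samples_per_epoch = iterations_per_epoch * BATCH_SIZE

print(f"minutes per epoch:    {minutes_per_epoch:.1f}")
print(f"iterations per epoch: {iterations_per_epoch}")
print(f"samples per epoch:    ~{samples_per_epoch}")
print(f"epochs in 40 h:       ~{epochs_in(40):.0f}")
print(f"epochs in 5 days:     ~{epochs_in(5 * 24):.0f}")
```

Under these assumptions, the ~40 h plateau corresponds to roughly 290 epochs and the ~5 day run to roughly 880 epochs, which is in line with the "around 1k" guess above and far below the configured 100k.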

MatthewMasters commented 1 year ago

@HannesStark Do you know how long the models in the paper were trained for? 100 epochs?