Closed AaranWang closed 1 month ago
Hi,
If you want to reproduce the training process of SaProt_35M_AF2, you just need to use the LMDB file of AF2_UniRef50. Can you explain more about the raw data?
I originally thought that the raw data of AF2_UniRef50 was used to train SaProt_650M_AF2. Isn't that the case?
We used the LMDB file of AF2_UniRef50 to train both SaProt_650M_AF2 and SaProt_35M_AF2.
How can I determine whether to train the SaProt_650M_AF2 or SaProt_35M_AF2 model? Additionally, is the cost of training the SaProt_35M_AF2 relatively lower compared to training the SaProt_650M_AF2? Thank you for your kind reply.
The model is loaded based on its config path. If you want to train SaProt_35M_AF2, you only have to switch the config path to point to SaProt_35M_AF2. Training SaProt_35M_AF2 is much less demanding than training the 650M model, but it still took two weeks on 8 A100 GPUs.
Is SaProt/config/pretrain/saprot.yaml the file I should modify, changing the SaProt_650M_AF2 entry to SaProt_35M_AF2? Are there other options I should modify, or can you provide a modified YAML file? Thank you very much.
Yes. You have to change 650M to 35M and keep the other settings at their defaults. Please make sure the LMDB paths (train_lmdb, etc.) are correct on your server.
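For anyone following along, here is a minimal sketch of that edit plus a sanity check on the LMDB paths. It works on the config as plain text, so it does not depend on the real layout of saprot.yaml. The excerpt in `EXAMPLE`, the `config_path` key, the `/data/...` paths, and the assumption that LMDB keys end in `_lmdb` are all hypothetical; only `train_lmdb` and the two model names come from this thread.

```python
import os
import re

OLD_NAME = "SaProt_650M_AF2"
NEW_NAME = "SaProt_35M_AF2"

# Matches lines such as "  train_lmdb: /some/path".
# Assumption: every LMDB entry uses a key that ends in "_lmdb".
LMDB_LINE = re.compile(r"^[ \t]*(\w+_lmdb)[ \t]*:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def switch_model(config_text, old=OLD_NAME, new=NEW_NAME):
    """Return the config text with every occurrence of `old` replaced by `new`."""
    return config_text.replace(old, new)


def lmdb_paths(config_text):
    """Collect {key: path} for every `*_lmdb: <path>` line in the text."""
    return {key: path.strip("'\"") for key, path in LMDB_LINE.findall(config_text)}


def missing_paths(paths):
    """Return the keys whose paths do not exist on this machine."""
    return sorted(key for key, path in paths.items() if not os.path.exists(path))


# Hypothetical excerpt; the real saprot.yaml may use different keys and nesting.
EXAMPLE = """\
model:
  config_path: weights/PLMs/SaProt_650M_AF2
dataset:
  train_lmdb: /data/AF2_UniRef50/train
  valid_lmdb: /data/AF2_UniRef50/valid
"""

if __name__ == "__main__":
    updated = switch_model(EXAMPLE)
    print(updated)
    print("Missing LMDB paths:", missing_paths(lmdb_paths(updated)))
```

To try it on the real file, read SaProt/config/pretrain/saprot.yaml into a string, pass it through `switch_model`, and review the result before writing it back. Any key reported by `missing_paths` needs its path fixed for your server.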
Thank you. Best wishes to u. (^.^)
You are welcome! 😃😃
As a newcomer in this area, I'm aiming to train the simplest DL model following the training process of SaProt_35M_AF2. I only have access to the source data (data.mdb) of AF2_UniRef50. Can you provide the raw data of SaProt_35M_AF2? Thx