Closed yufengwhy closed 3 months ago
Hi!
Sorry, we cannot provide such a script, because all of our data preprocessing is designed to incorporate AF2-predicted structures into our training dataset, whereas ESM-2 was pre-trained on sequence data only. We recommend referring to the original ESM-2 paper for more details, including both the training hyperparameters and the dataset construction.
Best regards, Jin
Could we use this to reproduce the pretraining of SaProt_650M_AF2?
python scripts/training.py -c config/pretrain/saprot.yaml
Got this error, any ideas?
huggingface_hub.utils._validators.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': 'weights/PLMs/SaProt_650M_AF2'. Use `repo_type` argument if needed.
I guess that's because the checkpoint is not in the right directory. You could move the SaProt checkpoint to weights/PLMs and try again.
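If it helps, here is a minimal sketch of a pre-flight check you could run before launching training. It assumes the config points at the relative path weights/PLMs/SaProt_650M_AF2 (taken from the error message above) and that you run it from the repository root. The helper name check_local_checkpoint is hypothetical and not part of the SaProt code. The idea is that when the local directory does not exist, the path string ends up being validated as a Hub repo id, which fails because it contains more than one "/".

```python
from pathlib import Path


def check_local_checkpoint(path: str) -> Path:
    """Return the checkpoint directory as a Path, or raise a clearer error.

    Hypothetical helper for illustration only; not part of SaProt.
    """
    ckpt = Path(path)
    if not ckpt.is_dir():
        # A missing directory is the likely reason the string gets treated
        # as a Hugging Face repo id instead of a local path.
        raise FileNotFoundError(
            f"Checkpoint directory not found: {ckpt.resolve()}. "
            "Place the downloaded checkpoint there, or fix the path in the config."
        )
    return ckpt


if __name__ == "__main__":
    # Path assumed from the error message; adjust to match your config.
    try:
        found = check_local_checkpoint("weights/PLMs/SaProt_650M_AF2")
        print(f"Found checkpoint directory: {found}")
    except FileNotFoundError as err:
        print(err)
```

Relative paths are resolved against the current working directory, so running the training script from somewhere other than the repository root can trigger the same error even when the checkpoint is in place.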
Could you kindly provide the config and data preprocessing script to reproduce the pretraining of esm2_t33_650M_UR50D?