zhenyuhe00 closed this issue 1 year ago
@ebetica That's a very interesting question. Can you answer it?
Hey @zhenyuhe00, we crop the sequence during training. Not cropping would be prohibitively expensive in many cases.
For inference, we do it offline, and never crop. We tried using the disordered residues (provided in FASTA but not PDB) for a small overall improvement (<1 LDDT).
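For anyone reading along, here is a minimal sketch of what "crop during training, never crop at inference" could look like. It is not the ESMFold code: the function name `crop_for_training` is hypothetical, and it assumes the crop is a uniformly sampled contiguous window, which this thread does not confirm. Only the crop size of 384 comes from the discussion.

```python
import random

CROP_SIZE = 384  # training crop size mentioned in this thread


def crop_for_training(sequence, crop_size=CROP_SIZE, rng=None):
    """Return (start, window) for a contiguous crop of at most crop_size residues.

    Assumption: the window start is sampled uniformly at random. The thread
    does not say how the crop position is actually chosen.
    """
    rng = rng or random.Random()
    if len(sequence) <= crop_size:
        # Short sequences are kept whole.
        return 0, sequence
    # randint is inclusive on both ends, so the last valid start is len - crop_size.
    start = rng.randint(0, len(sequence) - crop_size)
    return start, sequence[start:start + crop_size]


# Training: the model sees at most 384 residues of a long sequence.
start, window = crop_for_training("M" * 1000, rng=random.Random(0))
print(start, len(window))

# Inference: the full sequence is used as-is, with no cropping step.
full_sequence = "M" * 1000
print(len(full_sequence))
```

Returning the start offset alongside the window makes it possible to slice any per-residue targets (for example coordinates) with the same indices.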
Thanks!
Hi, congrats on your great series of work! The crop size of ESMFold is 384 during training. However, I wonder whether, when running inference to get embeddings from ESM-2, the sequence fed to ESM-2 is also cropped to 384 or is the full sequence. The former may degrade performance, since context information is cut off. I'm also curious: did you run ESM-2 inference offline and store the embeddings, or run it online while training ESMFold?
Thanks in advance!