facebookresearch / esm

Evolutionary Scale Modeling (esm): Pretrained language models for proteins
MIT License
2.97k stars, 586 forks

Inverse folding samples from all tokens instead of just amino acids #633

Open spenceforce opened 8 months ago

spenceforce commented 8 months ago

Bug description: `GVPTransformerModel.sample` samples from every token in the vocabulary, including special tokens such as `<eos>` and `<mask>`, instead of only the amino acid tokens.

Reproduction steps: Run the example inverse folding script for sequence sampling with a high temperature. I've seen these tokens appear at a temperature as low as 3, but try 10.

Expected behavior: Only amino acid tokens should be sampled, since only amino acids make sense as outputs for the residues of a structure.
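
To illustrate the expected behavior, here is a minimal, library-independent sketch of restricting sampling to amino acid tokens. It is not the esm implementation: the toy vocabulary `VOCAB`, the set `AMINO_ACIDS`, and the helpers `softmax`, `mask_special_tokens`, `special_mass`, and `sample_token` are all hypothetical names made up for this example. The idea is to set the logits of every non-amino-acid token to negative infinity before applying the temperature, so those tokens get zero probability no matter how high the temperature is.

```python
import math
import random

# Hypothetical toy vocabulary for illustration only; the real esm alphabet
# is larger and is not assumed to follow this order.
VOCAB = ["<cls>", "<pad>", "<eos>", "<unk>", "<mask>", "A", "C", "D", "E"]
AMINO_ACIDS = {"A", "C", "D", "E"}


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; -inf logits map to probability 0."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def mask_special_tokens(logits, vocab=VOCAB, allowed=AMINO_ACIDS):
    """Replace the logit of every token outside `allowed` with -inf."""
    return [x if tok in allowed else float("-inf") for x, tok in zip(logits, vocab)]


def special_mass(probs, vocab=VOCAB, allowed=AMINO_ACIDS):
    """Total probability assigned to tokens that are not amino acids."""
    return sum(p for p, tok in zip(probs, vocab) if tok not in allowed)


def sample_token(logits, temperature=1.0, rng=random):
    """Sample one token after masking out non-amino-acid tokens."""
    probs = softmax(mask_special_tokens(logits), temperature)
    return rng.choices(VOCAB, weights=probs, k=1)[0]


# Made-up logits: special tokens are unlikely but not impossible.
logits = [-5.0] * 5 + [2.0, 1.0, 0.0, -1.0]

print("special-token mass, T=1, unmasked: ", special_mass(softmax(logits, 1.0)))
print("special-token mass, T=10, unmasked:", special_mass(softmax(logits, 10.0)))
print("special-token mass, T=10, masked:  ",
      special_mass(softmax(mask_special_tokens(logits), 10.0)))
```

With these made-up logits, the unmasked special-token mass grows sharply as the temperature rises, which matches the symptom described above, while the masked distribution keeps it at zero at any temperature. Masking before the temperature scaling (rather than resampling when a special token comes up) keeps the sampler a single pass and leaves the relative probabilities of the amino acids unchanged.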