Open neuwirtter opened 3 months ago
Hi Tereza,
thanks a lot for your interest in our method. That is absolutely doable, and I have already had good results using this to avoid generating e.g. glycine or alanine. You can simply extend the token IDs passed via bad_words_ids here: https://github.com/mheinzinger/ProstT5/blob/main/scripts/translate.py#L196 (just add the amino acids you do not want generated, and the model should no longer produce them).
Best,
Michael
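For anyone landing here later, a minimal sketch of how the extra entries could be built. This is not code from translate.py: the helper name build_bad_words_ids is hypothetical, and it assumes a Hugging Face tokenizer for which a single uppercase amino-acid letter encodes to exactly one token ID (worth checking against the ProstT5 tokenizer before relying on it). The only library facts used are that tokenizers expose encode(text, add_special_tokens=False) and that generate() accepts bad_words_ids as a list of token-ID lists.

```python
from typing import Iterable, List


def build_bad_words_ids(tokenizer, banned_residues: Iterable[str]) -> List[List[int]]:
    """Build bad_words_ids entries (one token-ID list per banned residue).

    Hypothetical helper, not part of ProstT5. `tokenizer` is assumed to be a
    Hugging Face tokenizer (or anything with a compatible `encode` method).
    """
    bad_words_ids = []
    for aa in banned_residues:
        # Require single uppercase one-letter codes, e.g. "G" for glycine.
        if len(aa) != 1 or not aa.isupper():
            raise ValueError(f"Expected an uppercase one-letter code, got {aa!r}")
        ids = list(tokenizer.encode(aa, add_special_tokens=False))
        # Assumption: each amino acid maps to exactly one token.
        if len(ids) != 1:
            raise ValueError(f"{aa!r} encoded to {len(ids)} tokens, expected 1")
        bad_words_ids.append(ids)
    return bad_words_ids
```

The result would then be appended to whatever list the script already passes, for example bad_words_ids = existing_ids + build_bad_words_ids(tokenizer, ["G"]), where existing_ids stands for the list currently defined at the linked line. Raising an error when a residue does not map to a single token is a deliberate choice here, since a multi-token entry would ban a token sequence rather than the residue itself.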
Hi,
I would like to use your tool to generate versions of existing proteins that lack one amino acid in their sequence (i.e. using an alphabet of 19 amino acids). Do you think it is possible to restrict the alphabet somehow when generating a sequence from a 3Di representation?
Thank you in advance,
Tereza