dauparas / ProteinMPNN

Code for the ProteinMPNN paper
MIT License

What's the meaning of "/" in the output .fasta file in ProteinMPNN? #115

Open LinnjiaWen opened 1 month ago

LinnjiaWen commented 1 month ago

There is a "/" in the sequences of the output .fasta file. Does anyone know why this happens? For example:

>test, score=1.7249, global_score=1.7249, fixed_chains=[], designed_chains=['A', 'B'], model_name=v_48_020, git_hash=8907e6671bfbfc92303b5f79c4b5e6ce47cdef57, seed=3
GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG/AKGADVNASDYKGTTPLHVAAWNGHLEIVDVLLARGADINASDSYGDTPLHLAANYGHLEIVDLLLRWGADVNASDSSGKTPLHLAAQDGHLEIVDVLLAHGADVNAQDKFGKTPFDLAIDNGNEDIAEVLQK
>T=0.1, sample=1, score=0.7861, global_score=0.7861, seq_recovery=0.4078
SLEELLRREACERADAEALARAAELVAAARAEAEAARAARAAAEAAAAAAAAAAAAAAAAAAAAALAAALAAA/CDGTDVNAAAPSGKTPLHIAAEKGDLEAVVRLLARGADVNAKDKEGNTPLHLAARNGHLDIVVVLLKAGADVNAKDKEGKTPLHWAAENGHLDIAKVILKAGGDVNAKDKEGKTPIDWAKENGHEDIAELLEK
>T=0.1, sample=2, score=0.8428, global_score=0.8428, seq_recovery=0.4078
SAAELARRAAAAARRAEALARAAELVAAAREAARKRREERERRREEERRREEELRRRLEELRKLLEELRKLLE/ADGRDPNARDPTGRTPLHEAAREGDLERVIELLEAGADVNARDKTGRTPLHLAAENGHLEIVVVLLAAGADVNAKDKTGRTPLHLAAENGHLDIAEALLAAGADVNAKDKTGETPIDLAKKAGHEEIAELLEK
>T=0.1, sample=3, score=0.8401, global_score=0.8401, seq_recovery=0.3981
SLAEALRRAACAAAAAAALAEAEELIRRTREEAEEERRRRAAERARRAAEAAAAAAKAAAAAAAEAAAAAAAA/CDGTDVNAADESGNTPLHEAARSGDVERVVLLLAAGADVNAKDKNGNTPLHWAARNGHLEIVELLLKAGADVNAKDKNGNTPLHWAARNGHLEIVKLILKAGGDVNAKDKNGKTPLDWAKEAGHEEIAELLEK
>T=0.1, sample=4, score=0.7619, global_score=0.7619, seq_recovery=0.4126
GPAAAARRAAAAAAAAAALAAAEKVVAAARAKAAAAEAERAARRAAEAAAAAAAAAALAAAAAAAAAAALAAA/ADGTDVNAADESGWTPLHEAAKKGDLEEVVRLLAAGADVNAKDKEGWTPLHLAAKNGHLEIVVVLLEAGADVNAKDKEGKTPLHLAAENGHLEIAKAILEAGADVDAKDKEGKTPIDLAKEAGHEEIAKLLEE
>T=0.1, sample=5, score=0.8370, global_score=0.8370, seq_recovery=0.3883
SAEEARRRAEAAARAAAALAEAAAVVAEAEAEAAERRARREAERAERARAAEAAAAAAAAAAEAAAAAAAAAA/ADGTDVNARDASGNTPLHYAAEEGDLERVVELLAAGADVNARNKTGNTPLHLAAENGHLDIVRVLLAAGADVNARNKTGNTPLHLAAKNGHLDIVEVILAAGGDVNAKNKTGETPLDLAKKAGHEEIAKLLEE

roccomoretti commented 1 month ago

The "/" in a fasta usually indicates a chainbreak. It's hard to tell for sure without details about your running parameters, but I'm guessing it indicates the separation between chain A and chain B in your designs.

LinnjiaWen commented 1 month ago

Thank you, here are the parameters I used:

python protein_mpnn_run.py --pdb_path test.pdb  --out_folder 02.MPNN/test --num_seq_per_target 10 --sampling_temp 0.1 --seed 003

roccomoretti commented 1 month ago

If your chain A is 73 aa long and your chain B is 133 aa long, then I'm pretty confident in saying that the "/" represents a chainbreak.
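
One way to check this is to split each sequence line on "/" and compare the segment lengths with the chain lengths in the input PDB. Below is a minimal Python sketch. The helper name `split_chains` is hypothetical (it is not part of ProteinMPNN), and pairing the segments with chains A and B in that order is an assumption based on the `designed_chains=['A', 'B']` field in the header.

```python
def split_chains(sequence_line):
    """Split one sequence line from the output .fasta on "/" (assumed chain separator)."""
    return sequence_line.strip().split("/")


# Sequence of sample 1, copied from the output posted above.
sample_1 = (
    "SLEELLRREACERADAEALARAAELVAAARAEAEAARAARAAAEAAAAAAAAAAAAAAAAAAAAALAAALAAA"
    "/"
    "CDGTDVNAAAPSGKTPLHIAAEKGDLEAVVRLLARGADVNAKDKEGNTPLHLAARNGHLDIVVVLLKAGADVNAKDKEGKTPLHWAAENGHLDIAKVILKAGGDVNAKDKEGKTPIDWAKENGHEDIAELLEK"
)

chains = split_chains(sample_1)

# Assumption: segments appear in the same order as the designed chains (A, then B).
for chain_id, chain_seq in zip(["A", "B"], chains):
    print(f"chain {chain_id}: {len(chain_seq)} residues")
```

If the printed lengths match the residue counts of chains A and B in test.pdb, the "/" is most likely just the chain separator, and each segment can be treated as its own sequence.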