dauparas / ProteinMPNN

Code for the ProteinMPNN paper
MIT License

Abnormal results #25

Closed zdf1122 closed 1 year ago

zdf1122 commented 1 year ago

Hello! I found that my result includes some abnormal amino acids, shown as X. I know that X means an undefined amino acid, but how did this happen? When I run the sample, the result is normal. (My protein is 1366 aa long.) [image]

dauparas commented 1 year ago

Hello! Does the top sequence in the .fasta output file also have those Xs? Could it be that your PDB file has residue jumps in those places?
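One way to check for such residue jumps is to scan the PDB file for gaps in residue numbering. The following is a minimal sketch, not part of ProteinMPNN: the function name `find_residue_jumps` is hypothetical, and it assumes a standard fixed-column PDB file in which each residue has a CA atom in an `ATOM` record.

```python
def find_residue_jumps(pdb_lines):
    """Return (chain, last_resnum, next_resnum) for each gap in residue numbering."""
    jumps = []
    last_seen = {}  # chain ID -> last residue number seen on that chain
    for line in pdb_lines:
        # Look only at CA atoms of ATOM records (one per standard residue).
        if not line.startswith("ATOM") or line[12:16].strip() != "CA":
            continue
        chain = line[21]           # PDB column 22: chain identifier
        resnum = int(line[22:26])  # PDB columns 23-26: residue sequence number
        prev = last_seen.get(chain)
        if prev is not None and resnum - prev > 1:
            jumps.append((chain, prev, resnum))
        last_seen[chain] = resnum
    return jumps
```

The function accepts any iterable of lines, so an open file handle for your own PDB file can be passed directly. Each returned tuple marks a place where numbering skips ahead on a chain, which is a candidate location for the Xs in the output.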

zdf1122 commented 1 year ago

Thank you very much!

zdf1122 commented 1 year ago

Sorry to bother you again. For score, global_score, and seq_recovery, does a higher value mean a better result? [image]

dauparas commented 1 year ago

Lower score is better; the score (negative log probability) represents the model's uncertainty about its predictions.
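To illustrate why a lower negative log probability means a more confident prediction, here is a small sketch. It is not ProteinMPNN's own code: the helper `mean_negative_log_prob` is hypothetical, the probabilities are made up, and averaging over positions is an assumption made for the illustration.

```python
import math

def mean_negative_log_prob(probs):
    """Average of -log(p) over the probabilities assigned to the chosen residues."""
    return sum(-math.log(p) for p in probs) / len(probs)

# Made-up per-position probabilities for the chosen amino acids.
confident = [0.9, 0.8, 0.95, 0.85]  # model is fairly sure at each position
uncertain = [0.3, 0.2, 0.25, 0.4]   # model is unsure at each position

# Higher probabilities give a lower (better) score.
assert mean_negative_log_prob(confident) < mean_negative_log_prob(uncertain)
```

A probability of 1 at every position would give a score of 0, the lowest possible value; the score grows as the probabilities shrink.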

tony-res commented 1 year ago

I've got a similar output. From what I can tell, the original PDB did have residue jumps, which were labeled as "-" in PyMOL and appear to come out as "X" in the ProteinMPNN output.

If I wanted to put this sequence into something like AlphaFold or ESM, would I simply remove the "X"s? Or is there more to it?

e.g. "SEQVXXXKIXM" becomes "SEQVKIM"
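The removal described in that example can be sketched as below. This only shows the string operation; the thread does not say whether removal is the right choice for AlphaFold or ESM. Note that stripping the Xs joins the residues on either side of a gap as if they were adjacent. The helper names `strip_unknown` and `unknown_runs` are hypothetical.

```python
import re

def strip_unknown(seq, unknown="X"):
    """Remove every occurrence of the unknown-residue symbol."""
    return seq.replace(unknown, "")

def unknown_runs(seq, unknown="X"):
    """Return (start_index, length) for each run of unknown residues."""
    pattern = re.escape(unknown) + "+"
    return [(m.start(), len(m.group())) for m in re.finditer(pattern, seq)]

seq = "SEQVXXXKIXM"
print(strip_unknown(seq))  # SEQVKIM
print(unknown_runs(seq))   # [(4, 3), (9, 1)]
```

Recording the run positions before stripping keeps track of where the gaps were, in case they need to be handled differently later.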