realbigws / Predict_Property

Predict protein local properties using sequence or profile information.
GNU General Public License v3.0

Disordered region prediction using this package is different from the prediction using the RaptorX web server #1

Status: Closed (DinghaiZheng closed this issue 3 years ago)

DinghaiZheng commented 3 years ago

I tried to predict disordered regions in the following input protein sequence (saved in my_seq.fasta):

>my_seq
DIQMTQSPSSLSASVGDRVTITCRASQDVNTAVAWYQQKPGKAPKLLIYSASFLYSGVPSRFSGSRSGTDFTLTISSLQPEDFATYYCQQHYTTPPTFGQGTKVEIKRTVAAPSVFIFPPSDEQLKSGTASVVCLLNNFYPREAKVQWKVDNALQSGNSQESVTEQDSKDSTYSLSSTLTLSKADYEKHKVYACEVTHQGLSSPVTKSFNRGEC

The command I used was: ./Predict_Property.sh -i my_seq.fasta

In one of the output files (my_seq.diso), positions 156-160 are predicted to be disordered with very high probability:

 156 S * 0.993 
 157 G * 0.998 
 158 N * 0.999 
 159 S * 0.999 
 160 Q * 0.999 
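For readers who want to inspect such output programmatically, here is a minimal sketch that parses rows of the shape shown above (position, residue, flag, probability) and checks each residue letter against the FASTA sequence at the same 1-based position. The column layout is inferred only from the excerpt in this thread; the names `DisoRow`, `parse_diso`, and `residue_mismatches` are hypothetical and not part of Predict_Property. Lines that do not fit the four-column pattern are skipped, on the assumption that a real .diso file may contain other lines.

```python
import re
from typing import List, NamedTuple


class DisoRow(NamedTuple):
    position: int       # 1-based residue index
    residue: str        # one-letter amino acid code
    flag: str           # '*' or '.', as printed in the excerpt
    probability: float  # predicted disorder probability


# Four whitespace-separated columns, as in the excerpt from this thread.
ROW = re.compile(r"^\s*(\d+)\s+([A-Z])\s+([*.])\s+(\d+(?:\.\d+)?)\s*$")


def parse_diso(text: str) -> List[DisoRow]:
    """Return one DisoRow per line that matches the four-column pattern."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append(DisoRow(int(m.group(1)), m.group(2), m.group(3), float(m.group(4))))
    return rows


def residue_mismatches(rows: List[DisoRow], sequence: str) -> List[DisoRow]:
    """Rows whose residue letter disagrees with the sequence at that position."""
    return [
        r for r in rows
        if not (1 <= r.position <= len(sequence)) or sequence[r.position - 1] != r.residue
    ]


# Sequence and rows copied from this thread.
MY_SEQ = (
    "DIQMTQSPSSLSASVGDRVTITCRASQDVNTAVAWYQQKPGKAPKLLIYSASFLYSGVPSRFSGSRSGTD"
    "FTLTISSLQPEDFATYYCQQHYTTPPTFGQGTKVEIKRTVAAPSVFIFPPSDEQLKSGTASVVCLLNNFY"
    "PREAKVQWKVDNALQSGNSQESVTEQDSKDSTYSLSSTLTLSKADYEKHKVYACEVTHQGLSSPVTKSFN"
    "RGEC"
)

EXCERPT = """
 156 S * 0.993
 157 G * 0.998
 158 N * 0.999
 159 S * 0.999
 160 Q * 0.999
"""

rows = parse_diso(EXCERPT)
bad = residue_mismatches(rows, MY_SEQ)
print(len(rows), "rows parsed;", len(bad), "residue mismatches")
```

A check like this helps rule out an off-by-one or wrong-sequence mix-up before comparing two prediction runs.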

However, if I run the same input sequence through the RaptorX web server (http://raptorx.uchicago.edu/StructurePropertyPred/predict/), the predicted disorder probabilities are much lower:

 156 S . 0.456 
 157 G * 0.511 
 158 N . 0.500 
 159 S . 0.421 
 160 Q . 0.307 
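To put a number on the gap, the two sets of values can be compared position by position. The sketch below hard-codes the probabilities quoted in this thread; the dictionary names and the `probability_gap` helper are illustrative only and are not part of either tool.

```python
from typing import Dict

# Disorder probabilities quoted in this thread, keyed by 1-based position.
command_line: Dict[int, float] = {156: 0.993, 157: 0.998, 158: 0.999, 159: 0.999, 160: 0.999}
web_server: Dict[int, float] = {156: 0.456, 157: 0.511, 158: 0.500, 159: 0.421, 160: 0.307}


def probability_gap(a: Dict[int, float], b: Dict[int, float]) -> Dict[int, float]:
    """Per-position difference a - b over the positions present in both runs."""
    shared = sorted(a.keys() & b.keys())
    return {pos: round(a[pos] - b[pos], 3) for pos in shared}


gaps = probability_gap(command_line, web_server)
for pos, gap in gaps.items():
    print(f"{pos}: command line {command_line[pos]:.3f}, web server {web_server[pos]:.3f}, gap {gap:+.3f}")
```

For these five positions the command-line probabilities exceed the web server's by roughly 0.49 to 0.69.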

Based on what we know about my_seq, the prediction from the RaptorX web server is more likely to be correct. Could you check why the predictions are different?

Thank you very much!

realbigws commented 3 years ago

Hi, Dinghai.

I guess that you're using the "single sequence" version of Predict_Property to predict the order/disorder regions.

If you switch to the "profile" version (i.e., first search for an MSA and then convert it to a TGT file), you should obtain a result similar to the one from the RaptorX web server.

Best, -Sheng

(Sent by email in reply to DinghaiZheng's message of Sat, Jul 24, 2021, 4:48 AM.)

DinghaiZheng commented 3 years ago

Sheng,

After following your suggestion, the disordered region prediction using the command line is identical to the result from the webserver.

Thank you very much for the software and suggestion!

Best, Dinghai