ml4bio / e2efold

PyTorch implementation of "RNA Secondary Structure Prediction By Learning Unrolled Algorithms"
MIT License

Prediction for user's input sequence #2

Closed: jaswindersingh2 closed this issue 4 years ago

jaswindersingh2 commented 4 years ago

Hi,

Could you please make this algorithm predict on a user's input sequence? At the moment, it only works on the datasets that were used in the e2efold paper. It would be great if it could write its output in a standard RNA secondary structure format (such as .dbn or .ct) for the user's input sequence.

Thank you
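
For readers unfamiliar with the two formats requested above, here is a minimal sketch of how a list of predicted base pairs could be written as dot-bracket (.dbn) and connectivity table (.ct) text. This is not the e2efold code: the function names `pairs_to_dot_bracket` and `pairs_to_ct` are hypothetical, pairs are assumed to be 0-based index tuples, and the dot-bracket writer assumes nested (pseudoknot-free) pairs, since a single bracket type cannot represent crossing pairs.

```python
def pairs_to_dot_bracket(length, pairs):
    """Render 0-based base pairs as a dot-bracket string (nested pairs assumed)."""
    chars = ["."] * length
    for i, j in pairs:
        lo, hi = min(i, j), max(i, j)
        chars[lo] = "("
        chars[hi] = ")"
    return "".join(chars)


def pairs_to_ct(seq, pairs, title="seq"):
    """Render 0-based base pairs as CT text.

    Columns per row: index, base, index - 1, index + 1 (0 on the last row),
    1-based pairing partner (0 if unpaired), index.
    """
    n = len(seq)
    partner = [0] * n
    for i, j in pairs:
        partner[i] = j + 1
        partner[j] = i + 1
    lines = [f"{n}\t{title}"]
    for k, base in enumerate(seq):
        nxt = k + 2 if k + 1 < n else 0
        lines.append(f"{k + 1}\t{base}\t{k}\t{nxt}\t{partner[k]}\t{k + 1}")
    return "\n".join(lines)


if __name__ == "__main__":
    example_pairs = [(0, 5), (1, 4)]
    print(pairs_to_dot_bracket(6, example_pairs))  # ((..))
    print(pairs_to_ct("GCAAGC", example_pairs))
```

A .ct file in which every row has 0 in the partner column therefore corresponds to a structure with no base pairs at all.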

liyu95 commented 4 years ago

Hi, thank you very much for your suggestion! Yes, we will implement and upload that part soon.

liyu95 commented 4 years ago

For your information, we have uploaded part of the production code. The code for sequences longer than 600 will be uploaded by the end of this month. We have been overwhelmed with deadlines, meetings, and interviews recently.

jaswindersingh2 commented 4 years ago

Hi

Thanks for your efforts to make it work on a user's input sequence. However, when I run it on the example sequence B01865.seq, the output .ct file does not show any base pairs. I have attached the prediction file for the B01865.seq input sequence (B01865.seq.ct.txt). After some debugging, it seems that on line 129 the maximum predicted base-pair probability was 0.2223, so comparing against the threshold (> 0.5) detects no base pairs. Can you please fix this issue?

Thank you
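
To illustrate the symptom described above, here is a minimal sketch of thresholding a pairing-probability matrix. It is not the code on line 129 of the repository: the function name `call_pairs` and the plain nested-list matrix are assumptions made for illustration. It shows why a maximum score of 0.2223 yields an empty structure when the cutoff is 0.5.

```python
def call_pairs(prob, threshold=0.5):
    """Return (pairs, max_score) for the upper triangle of a square matrix.

    A pair (i, j) with i < j is called when prob[i][j] > threshold.
    """
    n = len(prob)
    upper = [(i, j) for i in range(n) for j in range(i + 1, n)]
    pairs = [(i, j) for i, j in upper if prob[i][j] > threshold]
    max_score = max((prob[i][j] for i, j in upper), default=0.0)
    return pairs, max_score


if __name__ == "__main__":
    # Toy 3x3 matrix whose largest entry is below the cutoff.
    weak = [
        [0.0, 0.10, 0.2223],
        [0.10, 0.0, 0.05],
        [0.2223, 0.05, 0.0],
    ]
    pairs, max_score = call_pairs(weak)
    print(pairs, max_score)
```

Printing the maximum score alongside the called pairs makes it easier to tell a genuinely unpaired prediction from one where every score is uniformly low.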

liyu95 commented 4 years ago

Hi,

Could you double-check whether the model is loaded correctly? You need to download the model files first and put them in the model folder. The shell output should include a line showing: 'Loading e2e model...'
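
A quick way to rule out the problem suggested above is to confirm that the downloaded files are actually present and non-empty before running prediction. The sketch below is a generic check, not part of e2efold: the helper name `missing_model_files`, the folder path, and the file name in the usage example are all hypothetical placeholders, so substitute the names given in the repository's instructions.

```python
from pathlib import Path


def missing_model_files(model_dir, expected_names):
    """Return the expected file names that are absent or empty in model_dir."""
    folder = Path(model_dir)
    missing = []
    for name in expected_names:
        path = folder / name
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Hypothetical folder and file name, for illustration only.
    problems = missing_model_files("models", ["example_model.pt"])
    if problems:
        print("Missing or empty model files:", problems)
    else:
        print("All expected model files found.")
```

An interrupted or partial download can leave an empty or truncated file behind, which is why the check looks at file size as well as existence.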

jaswindersingh2 commented 4 years ago

Hi

Thanks for the quick reply. Yes, it was an error with model loading. Downloading the model files again solved the problem. It is working perfectly now. Again, thank you very much.