nerve-bio / NERVE

NERVE is a user-friendly software environment for the in silico identification of the best vaccine candidates from whole proteomes of bacterial pathogens. The purpose of this project is to update it by implementing new modules with machine-learning-based methods and improving the performance of the existing ones.
MIT License

Adjust psortb parser #16

Open FranceCosta opened 2 years ago

FranceCosta commented 2 years ago

Check that the psortb parser reports the correct prediction. "Unknown" should be output for values below about 7.5.

Ayman344 commented 11 months ago

May I know which psortb database we are extracting the output from? Is it https://www.psort.org/psortb/? I mean, if I submit a FASTA file of protein sequences to the psort.org web server, and then run the same FASTA file through NERVE as a sequence database (i.e. a set of protein sequences that mimics a proteome), won't I get the same result?

francescopatane96 commented 5 months ago

Yes, you should get an identical result.

FranceCosta commented 5 months ago

Well, more or less. We parse the output to simplify it. We report only the localization with the highest score, provided it is above 3. For proteins with scores below this threshold, we just report "Unknown" as the localization. Note that the localization is considered by the select module only for scores above 7.5.
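
A minimal sketch of the rule described above, not the actual NERVE parser. It assumes the per-localization scores for one protein have already been read into a dict. The names `simplify_prediction`, `used_by_select`, `REPORT_THRESHOLD` and `SELECT_THRESHOLD` are hypothetical, and the example scores are made up for illustration.

```python
from typing import Dict, Tuple

# Thresholds as stated in the comment above; the constant names are hypothetical.
REPORT_THRESHOLD = 3.0   # best score must be above this to be reported
SELECT_THRESHOLD = 7.5   # select module only considers scores above this


def simplify_prediction(scores: Dict[str, float]) -> Tuple[str, float]:
    """Keep only the highest-scoring localization, or "Unknown" if it is too low."""
    if not scores:
        return "Unknown", 0.0
    best_loc, best_score = max(scores.items(), key=lambda item: item[1])
    if best_score > REPORT_THRESHOLD:
        return best_loc, best_score
    return "Unknown", best_score


def used_by_select(localization: str, score: float) -> bool:
    """True if the select module would take this localization into account."""
    return localization != "Unknown" and score > SELECT_THRESHOLD


# Made-up scores for a single protein (illustration only).
example = {"Cytoplasmic": 0.5, "OuterMembrane": 9.0, "Extracellular": 0.5}
localization, score = simplify_prediction(example)
print(localization, score, used_by_select(localization, score))
```

With these two thresholds, a protein whose best score falls between 3 and 7.5 is still reported with a named localization, but that localization is not used by the select module.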