cultivarium / GenomeSPOT

Predict oxygen, temperature, salinity, and pH preferences of bacteria and archaea from a genome
https://cultivarium.org/
MIT License

How to interpret "not intolerant" for oxygen #6

Open ilnamkang opened 3 months ago

ilnamkang commented 3 months ago

Hi,

Thank you for a great tool.

I gave it a try with several genomes.

One of the genomes was predicted to be "not intolerant" for oxygen, but I cannot find how to interpret "not intolerant" for oxygen on this site or in the preprint.

How is "not intolerant" for oxygen supposed to be interpreted?

tylerbarnum commented 3 months ago

Thanks for raising this issue. I'm adding this description to the README. Does it make sense to you, or would you like more explanation?

For oxygen, the predicted value is a classification of tolerance as "tolerant" or "not tolerant". In model training, an organism was defined as "tolerant" if it was an aerotolerant anaerobe, microaerophile, facultative aerobe, facultative anaerobe, or obligate aerobe, and as "not tolerant" if it appeared to be an obligate anaerobe, i.e. described only as an anaerobe or obligate anaerobe.
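To make the labeling rule above concrete, here is a minimal sketch of how a textual oxygen description could be mapped to the two training classes. The function and constant names are hypothetical and are not part of GenomeSPOT; only the category lists come from the description above.

```python
# Illustrative only: these names are assumptions, not GenomeSPOT's API.
# The category lists follow the description given in this thread.
TOLERANT_DESCRIPTIONS = {
    "aerotolerant anaerobe",
    "microaerophile",
    "facultative aerobe",
    "facultative anaerobe",
    "obligate aerobe",
}

# "Described only as an anaerobe or obligate anaerobe."
NOT_TOLERANT_DESCRIPTIONS = {
    "anaerobe",
    "obligate anaerobe",
}


def oxygen_training_label(description: str) -> str:
    """Map an oxygen-requirement description to a training class label."""
    key = description.strip().lower()
    if key in TOLERANT_DESCRIPTIONS:
        return "tolerant"
    if key in NOT_TOLERANT_DESCRIPTIONS:
        return "not tolerant"
    raise ValueError(f"Unrecognized oxygen description: {description!r}")


if __name__ == "__main__":
    for text in ("Microaerophile", "obligate anaerobe"):
        print(text, "->", oxygen_training_label(text))
```

Note that an aerotolerant anaerobe falls in the "tolerant" class: the label describes tolerance of oxygen, not a requirement for it.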

ilnamkang commented 3 months ago

Thank you for your quick reply.

However, please note that the genome was predicted to be "not intolerant" (neither "tolerant" nor "not tolerant"), which confuses me. I have attached a screenshot of the result file below.

[Screenshot: prediction]

tylerbarnum commented 3 months ago

Oh, that's a typo. Thank you for following up and clarifying the mistake on our part. This issue will be resolved by PR #7. In the meantime, be aware that "not intolerant" should have read "not tolerant."

Separately, you may be interested to know that a probability of 0.75 is relatively low for oxygen classifications. If you're interested in seeing the distribution of probabilities, I recommend supplementary Figure 9C. ~80% of anaerobes are assigned a probability of being not tolerant >0.90 (corresponding to <0.10 probability of being tolerant in the plot).
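For anyone post-processing results from the affected version, the following is a rough sketch of how the label and probability could be handled. It assumes the reported probability refers to the predicted class, and the function names are hypothetical. The 0.90 value is only the reference point mentioned above, not an official cutoff.

```python
# Illustrative helpers; the names and the 0.90 default are assumptions,
# not part of GenomeSPOT.

def normalize_oxygen_label(label: str) -> str:
    """Correct the "not intolerant" typo to "not tolerant"."""
    cleaned = label.strip().lower()
    return "not tolerant" if cleaned == "not intolerant" else cleaned


def probability_tolerant(label: str, probability: float) -> float:
    """Convert the predicted-class probability into P(tolerant).

    Assumes `probability` is the probability of the predicted label.
    """
    normalized = normalize_oxygen_label(label)
    if normalized == "tolerant":
        return probability
    if normalized == "not tolerant":
        return 1.0 - probability
    raise ValueError(f"Unexpected oxygen label: {label!r}")


def is_high_confidence(probability: float, threshold: float = 0.90) -> bool:
    """Check whether the predicted-class probability exceeds the threshold."""
    return probability > threshold


if __name__ == "__main__":
    label, prob = "not intolerant", 0.75
    print(normalize_oxygen_label(label))
    print(probability_tolerant(label, prob))
    print(is_high_confidence(prob))
```

Under these assumptions, a "not intolerant" call with probability 0.75 is a "not tolerant" prediction that falls below the range in which most anaerobes are classified.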

ilnamkang commented 3 months ago

Thank you for your advice regarding the interpretation of oxygen classification.

tylerbarnum commented 3 months ago

I am leaving this open so users will discover this issue and pull the latest version of GenomeSPOT (release >=v1.0.1).