tanghaibao / goatools

Python library to handle Gene Ontology (GO) terms
BSD 2-Clause "Simplified" License

Keeping terms not in DAG #291

Open Prunoideae opened 2 months ago

Prunoideae commented 2 months ago

Thank you for your great tool!

I'm using EggNOG mapper to annotate non-model organisms for subsequent enrichment analysis. For each gene, the mapper outputs GO terms as well as other annotations such as KEGG KO and KEGG Pathway IDs. These non-GO IDs are not in the OBO file, but it would be extremely useful if the enrichment analysis could be applied to them too.

However, the current mappers skip these terms because they're not in the OBO. Would it be possible in the future for the tool to analyze custom terms without needing a reference file, or should I create a dummy OBO file containing all these IDs?
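For illustration, a dummy OBO file of the kind mentioned above could be generated along these lines. This is a minimal sketch using only the Python standard library: `make_flat_obo`, the `custom` namespace, and the example ID and name are hypothetical, and whether GOATOOLS accepts non-GO IDs from such a file has not been verified here.

```python
def make_flat_obo(term_names, namespace="custom"):
    """Build OBO 1.2 text with one parentless [Term] stanza per ID.

    term_names: dict mapping a term ID to a human-readable name.
    namespace:  hypothetical namespace label shared by all terms.
    """
    lines = ["format-version: 1.2", ""]
    for term_id, name in sorted(term_names.items()):
        lines += [
            "[Term]",
            f"id: {term_id}",
            f"name: {name}",
            f"namespace: {namespace}",
            "",  # blank line separates stanzas
        ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder ID and name, for illustration only.
    text = make_flat_obo({"K00001": "placeholder KO term"})
    with open("dummy_terms.obo", "w") as handle:
        handle.write(text)
```

Because no `is_a` lines are written, every term is an isolated root, so the file carries no topology.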

tanghaibao commented 2 months ago

@Prunoideae

Thanks for providing this feedback.

The unclear part is how these terms should be treated. If they aren't in the OBO, then GOATOOLS doesn't know how to interpret them (which relationships they have to other terms, etc.). How would we then perform the statistical test on them?

Prunoideae commented 2 months ago

I think providing basic over-/underrepresentation statistics would be fine as a starting point. But if a topology is needed, then creating an OBO for the KEGG terms seems like the more appropriate approach.
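As a rough illustration of what over-/underrepresentation statistics without a topology could look like, the sketch below runs a one-sided hypergeometric test per term, treating every term as independent. It is not GOATOOLS code: `flat_enrichment` and its arguments are hypothetical names, it uses only the standard library, and it applies no multiple-testing correction.

```python
from math import comb


def flat_enrichment(study, population, term2genes):
    """Per-term hypergeometric test that ignores any term hierarchy.

    study:      iterable of gene IDs of interest
    population: iterable of all background gene IDs
    term2genes: dict mapping a term ID to the genes annotated with it
    Returns {term: (study_count, pop_count, p_over, p_under)}.
    """
    population = set(population)
    study = set(study) & population
    pop_size, study_size = len(population), len(study)
    total = comb(pop_size, study_size)

    results = {}
    for term, genes in term2genes.items():
        annotated = set(genes) & population
        pop_count = len(annotated)
        study_count = len(annotated & study)

        def pmf(i):
            # P(exactly i annotated genes in a random study-sized draw)
            return (comb(pop_count, i)
                    * comb(pop_size - pop_count, study_size - i) / total)

        # Range of overlap counts that are actually possible.
        low = max(0, study_size - (pop_size - pop_count))
        high = min(pop_count, study_size)
        p_over = sum(pmf(i) for i in range(study_count, high + 1))
        p_under = sum(pmf(i) for i in range(low, study_count + 1))
        results[term] = (study_count, pop_count, p_over, p_under)
    return results
```

With many terms, the resulting p-values would still need a correction such as Bonferroni or Benjamini-Hochberg before being interpreted.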