tanghaibao / goatools

Python library to handle Gene Ontology (GO) terms
BSD 2-Clause "Simplified" License

What's the point of removing genes with identical identifiers? #271

Open shelkmike opened 1 year ago

shelkmike commented 1 year ago

I am running the following analysis. I have the complete gene list of a modern genome and the complete gene list of a computationally reconstructed ancestor of this species that lived a million years ago. I run a GO term enrichment analysis between these two gene sets using the "--compare" option. GOATOOLS removes gene identifiers that appear in both sets from the analysis ("removed 19350 overlapping items"). Why does GOATOOLS do this? Doesn't it make the results incorrect?
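
For illustration, the behavior in question can be sketched with plain Python sets. This is not GOATOOLS code: the function name `split_overlap` and the gene identifiers are hypothetical, and the sketch assumes that "overlapping items" means identifiers present in both input lists.

```python
def split_overlap(study, population):
    """Return (study-only IDs, population-only IDs, shared IDs)."""
    # Identifiers that appear in both gene lists.
    common = study & population
    # Drop the shared identifiers from each side so the groups are disjoint.
    return study - common, population - common, common


# Hypothetical gene identifiers, for illustration only.
modern = {"geneA", "geneB", "geneC", "geneD"}
ancestor = {"geneB", "geneC", "geneE"}

study_only, pop_only, common = split_overlap(modern, ancestor)
print(f"removed {len(common)} overlapping items")  # removed 2 overlapping items
print(sorted(study_only))  # ['geneA', 'geneD']
print(sorted(pop_only))    # ['geneE']
```

If the two lists share most of their identifiers, as in the 19350-item case above, only the genes unique to each list would remain in such a comparison.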