sba1 / ontologizer

Ontologizer is a tool for identifying enriched Gene Ontology terms in lists of names of genes or gene products.
http://ontologizer.de

[Suggestion] add ignore obsolete functionality #8

Open johntiger1 opened 7 years ago

johntiger1 commented 7 years ago

I am not sure whether the code already supports this, but I think others might appreciate an option to ignore obsolete terms. In my research, for example, I ignore obsolete terms because I do not need them. (My current approach is hacky and inefficient: I loop through the term map and manually remove every term that is obsolete.)
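A minimal sketch of that kind of post-hoc cleanup, for illustration only. `PlainTerm` and `ObsoleteCleanup` are hypothetical stand-ins, not Ontologizer classes, and the sketch assumes the term map is keyed by term ID.

```java
import java.util.Map;

// Hypothetical stand-in for a parsed term; not the Ontologizer term class.
final class PlainTerm {
    final String id;
    final boolean obsolete;

    PlainTerm(String id, boolean obsolete) {
        this.id = id;
        this.obsolete = obsolete;
    }
}

final class ObsoleteCleanup {
    // Removes all obsolete terms from the map in place and
    // returns the number of entries that were dropped.
    static int removeObsolete(Map<String, PlainTerm> termMap) {
        int before = termMap.size();
        // values() is a live view, so removing from it removes from the map.
        termMap.values().removeIf(term -> term.obsolete);
        return before - termMap.size();
    }
}
```

This still builds every term first and throws some away afterwards, which is the inefficiency described above.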

This is only a suggestion, but I think obsolete terms could also be skipped at a lower level (i.e., while reading the file stream), which would save some time.
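To make the stream-level idea concrete, here is a rough sketch that skips obsolete stanzas while reading OBO text. It assumes the standard `[Term]` stanza layout with `id:` and `is_obsolete: true` tags. `OboTermIdReader` is a hypothetical name, not the Ontologizer parser, and it only collects term IDs rather than building full terms. It also ignores trailing `!` comments on tag lines.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class OboTermIdReader {
    private static final String OBSOLETE_TAG = "is_obsolete:";

    // Reads OBO text and returns the IDs of all [Term] stanzas.
    // If skipObsolete is true, stanzas tagged "is_obsolete: true" are dropped.
    static List<String> readTermIds(Reader in, boolean skipObsolete) {
        List<String> ids = new ArrayList<>();
        boolean inTerm = false;
        boolean obsolete = false;
        String id = null;
        try {
            BufferedReader reader = new BufferedReader(in);
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (line.startsWith("[")) {
                    // A new stanza starts: decide the fate of the previous one.
                    flush(ids, inTerm, id, obsolete, skipObsolete);
                    inTerm = line.equals("[Term]");
                    obsolete = false;
                    id = null;
                } else if (inTerm && line.startsWith("id:")) {
                    id = line.substring("id:".length()).trim();
                } else if (inTerm && line.startsWith(OBSOLETE_TAG)) {
                    obsolete = line.substring(OBSOLETE_TAG.length()).trim().equals("true");
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        flush(ids, inTerm, id, obsolete, skipObsolete); // last stanza in the file
        return ids;
    }

    private static void flush(List<String> ids, boolean inTerm, String id,
                              boolean obsolete, boolean skipObsolete) {
        if (inTerm && id != null && !(skipObsolete && obsolete)) {
            ids.add(id);
        }
    }
}
```

Every line is still read, so any time saved would come from not constructing and later removing term objects for obsolete stanzas.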

sba1 commented 7 years ago

Thanks a lot for your patch. Please note that the core of the Ontologizer (including the I/O code) has been moved to a dedicated project, which you can find at https://github.com/ontologizer/ontologizerlib. While the original Ontologizer does not take advantage of it yet, it certainly will at some point in the future. I therefore suggest writing your patch against that project.

I also suggest creating a new flag similar to the other settings of the parser (e.g., PARSE_OBSOLETES or something like this) instead of relying on inheritance.
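A flag-based design could look roughly like the sketch below. It assumes the parser settings are integer bit flags. `SketchParser`, the placeholder `PARSE_DEFINITIONS` constant, and the bit values are hypothetical and do not reflect the actual ontologizerlib API; only the name `PARSE_OBSOLETES` comes from the suggestion above.

```java
// Hypothetical parser skeleton; not the ontologizerlib parser.
final class SketchParser {
    // Placeholder for an existing-style setting (value is an assumption).
    static final int PARSE_DEFINITIONS = 1 << 0;
    // Flag name as suggested in this thread (value is an assumption).
    static final int PARSE_OBSOLETES = 1 << 1;

    private final int options;

    SketchParser(int options) {
        this.options = options;
    }

    // Decides whether a term read from the file should be kept.
    // Non-obsolete terms are always kept; obsolete ones only if the flag is set.
    boolean shouldKeep(boolean termIsObsolete) {
        return !termIsObsolete || (options & PARSE_OBSOLETES) != 0;
    }
}
```

A caller would opt in with something like `new SketchParser(SketchParser.PARSE_DEFINITIONS | SketchParser.PARSE_OBSOLETES)`. Whether obsolete terms should be parsed by default is a separate design decision that this sketch does not settle.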

As a side note, it may be important to know about obsolete terms as well, because an older annotation file may still refer to a now-obsolete term. While such annotations cannot be used, this information can be used to produce a specific warning, e.g., one suggesting that the user try a newer annotation file. The feature should therefore be optional (as you have implemented it).
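The warning idea could be sketched as follows, assuming the set of obsolete term IDs was retained during parsing and each annotation is reduced to a (gene, term ID) pair. `ObsoleteAnnotationCheck` and the message wording are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class ObsoleteAnnotationCheck {
    // Each annotation is a two-element array: {geneName, termId}.
    // Returns one warning per annotation that refers to an obsolete term.
    static List<String> findObsoleteReferences(List<String[]> annotations,
                                               Set<String> obsoleteIds) {
        List<String> warnings = new ArrayList<>();
        for (String[] annotation : annotations) {
            String gene = annotation[0];
            String termId = annotation[1];
            if (obsoleteIds.contains(termId)) {
                warnings.add("Annotation of " + gene + " refers to obsolete term "
                        + termId + "; consider trying a newer annotation file.");
            }
        }
        return warnings;
    }
}
```

This only works if the obsolete term IDs are still available after parsing, which is one reason to keep skipping them optional.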