obophenotype / ncbitaxon

Build for NCBITaxon
BSD 3-Clause "New" or "Revised" License

Enhance Python code for subset extraction #34

Open jamesaoverton opened 4 years ago

jamesaoverton commented 4 years ago

The current code in #30 can be configured to generate classes for only a given set of tax_ids. With one more pass through nodes.dmp, we could find all the ancestors of those tax_ids and then generate a subset from the combined set. I expect that would be more efficient than extracting from the .obo file, as the subsets Makefile does here: https://github.com/obophenotype/ncbitaxon/blob/master/subsets/Makefile#L8
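A minimal sketch of the ancestor lookup, not the code from #30: it assumes the standard NCBI taxdump layout for nodes.dmp, where fields are separated by `\t|\t`, the first field is the tax_id, the second is the parent tax_id, and the root (tax_id 1) is its own parent. The function names `read_parents` and `with_ancestors` are hypothetical, and the sample lines are a toy hierarchy, not real taxonomy data.

```python
import io


def read_parents(lines):
    """Map each tax_id to its parent tax_id from nodes.dmp-style lines."""
    parents = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t|\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        tax_id = fields[0].strip()
        # The last field on a line carries a trailing "\t|" terminator.
        parent = fields[1].strip().rstrip("|").strip()
        parents[tax_id] = parent
    return parents


def with_ancestors(parents, tax_ids):
    """Return the requested tax_ids plus every ancestor up to the root."""
    keep = set()
    for tax_id in tax_ids:
        current = tax_id
        # Stop at unknown ids, at the self-parented root, or as soon as
        # we reach a lineage that has already been collected.
        while current in parents and current not in keep:
            keep.add(current)
            current = parents[current]
    return keep


# Toy hierarchy (assumption, for illustration only): 1 is the root,
# 10 and 20 are its children, 11 is under 10, 21 is under 20.
SAMPLE = (
    "1\t|\t1\t|\tno rank\t|\n"
    "10\t|\t1\t|\tfamily\t|\n"
    "11\t|\t10\t|\tspecies\t|\n"
    "20\t|\t1\t|\tfamily\t|\n"
    "21\t|\t20\t|\tspecies\t|\n"
)

parents = read_parents(io.StringIO(SAMPLE))
subset = with_ancestors(parents, ["21"])
print(sorted(subset, key=int))
```

The resulting set could then be passed to the existing tax_id filter, so only the requested taxa and their lineages are generated. Holding the full parent map in memory is a design choice of this sketch; whether it fits the code in #30 is an open question.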