monarch-initiative / monarch-ingest

Data ingest application for Monarch Initiative knowledge graph using Koza
https://monarchinitiative.org

Add tsv file to enumerate the species we will include in graph #212

Closed kevinschaper closed 1 year ago

kevinschaper commented 2 years ago

At a minimum, the file should include the taxon ID and species name for each species.

Separately, we'll want to add support in Koza for filtering rows based on a column in a file, and maybe also downloader support for injecting values into a parameterized URL, so that we automatically get the files we need for each species in the file.

The initial list of taxon IDs includes: 9606, 10116, 10090, 7955, 8355, 7227, 6239, 4932, 9615, 9913, 9823, 9031, 44689, 162425, 4896

https://docs.google.com/spreadsheets/d/1xIguK6fYZiMPpkf7dF4TwudnkQoeQsGWAvbtKq7BZ7U/edit#gid=0 includes at least a common name for each species, which might be enough for now.
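
To make the idea concrete, here is a rough sketch in plain Python of how a species TSV could drive both row filtering and URL expansion. None of this is existing Koza or downloader functionality: the column names (`taxon_id`, `species_name`), the helper functions, the sample records, and the `example.org` URL template are all hypothetical, and only three of the taxon IDs above are shown.

```python
import csv
import io

# Hypothetical file contents; the column names are an assumption.
SPECIES_TSV = (
    "taxon_id\tspecies_name\n"
    "9606\tHomo sapiens\n"
    "10090\tMus musculus\n"
    "7955\tDanio rerio\n"
)


def load_species(handle):
    """Read the species TSV into a {taxon_id: species_name} dict."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["taxon_id"]: row["species_name"] for row in reader}


def filter_rows(rows, column, allowed):
    """Keep only the rows whose value in `column` is an allowed taxon ID."""
    return [row for row in rows if row[column] in allowed]


def expand_urls(template, taxon_ids):
    """Fill a parameterized URL once per taxon ID."""
    return [template.format(taxon_id=taxon_id) for taxon_id in taxon_ids]


species = load_species(io.StringIO(SPECIES_TSV))

# Made-up source records, just to exercise the filter.
records = [
    {"gene": "gene-a", "taxon": "9606"},
    {"gene": "gene-b", "taxon": "12345"},
]
kept = filter_rows(records, "taxon", species)

# Placeholder URL pattern, not a real data source.
urls = expand_urls("https://example.org/data/{taxon_id}.tsv", species)
```

Here `kept` ends up holding only the record for taxon 9606, and `urls` has one entry per species in the file. In practice the filter would belong in the ingest configuration and the URL expansion in the download configuration.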

kevinschaper commented 1 year ago

For now, I think the species list is going to remain distributed across the download and ingest configuration files. It's hard to see a practical way to make it reusable.