Knowledge-Graph-Hub / kg-microbe

https://knowledge-graph-hub.github.io/kg-microbe/index.html
BSD 3-Clause "New" or "Revised" License

Ingest gutMEGA #33

Open realmarcin opened 3 years ago

realmarcin commented 3 years ago

The gutMEGA resource provides a few useful files for KG-Microbe. http://gutmega.omicsbio.info/download.php

The first set of ingests covers different microbial taxonomy resources, including NCBITaxonomy, which is already present in KG-Microbe. Looking at these text files, some alignment between the taxonomies may be necessary, ideally as an NER task. There will be some disagreements in taxonomy structure, and we may even need to pick clique leaders. Note that the three non-NCBI taxonomies are all specific to microbes (unlike NCBI, which is why we had to trim it).

| File | Description |
| --- | --- |
| NCBI taxonomy | Reformatted NCBI taxonomy information, including different ranks of NCBI taxa |
| Greengenes taxonomy | Reformatted Greengenes taxonomy information, including different ranks of Greengenes taxa |
| RDP taxonomy | Reformatted RDP taxonomy information, including different ranks of RDP taxa |
| SILVA taxonomy | Reformatted SILVA taxonomy information, including different ranks of SILVA taxa |
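As a rough illustration of the alignment and clique-leader idea, here is a minimal Python sketch. It uses plain exact matching on normalized names rather than NER, so it is only a baseline. The record layout, the `SOURCE_PRIORITY` order, and all identifiers in the sample data are assumptions made for the example and are not taken from the gutMEGA files.

```python
import re
from collections import defaultdict

# Assumed preference order for choosing a clique leader (not project policy).
SOURCE_PRIORITY = ["NCBI", "SILVA", "RDP", "Greengenes"]


def normalize(name):
    """Lowercase a taxon name and strip a Greengenes-style rank prefix such as 'g__'."""
    name = re.sub(r"^[a-z]__", "", name.strip())
    return re.sub(r"\s+", " ", name).lower()


def build_cliques(records):
    """Group (source, taxon_id, name) records whose normalized names match exactly."""
    cliques = defaultdict(list)
    for source, taxon_id, name in records:
        cliques[normalize(name)].append((source, taxon_id))
    return dict(cliques)


def pick_leader(members):
    """Return the clique member that comes from the most preferred source."""
    return min(members, key=lambda member: SOURCE_PRIORITY.index(member[0]))


# Placeholder records; the identifiers are illustrative only.
records = [
    ("NCBI", "NCBI:placeholder-1", "Bacteroides"),
    ("Greengenes", "GG:placeholder-2", "g__Bacteroides"),
    ("SILVA", "SILVA:placeholder-3", "bacteroides"),
]

cliques = build_cliques(records)
leaders = {name: pick_leader(members) for name, members in cliques.items()}
```

Exact matching will miss synonyms and reclassified taxa, which is where an NER or mapping-based approach would have to take over.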

Quantitative data table ingest: this will be valuable, and to start it can be treated as an NER task (but we need to identify a reference ontology set). The edges will be taxon -> condition -> relative abundance, where the 'condition' is a short free-text description, like a sample title.

| File | Description |
| --- | --- |
| gutMEGA data table | All quantification events provided in gutMEGA |
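To make the taxon -> condition -> relative abundance shape concrete, here is a minimal Python sketch that reads a tab-separated table into event records. The column names (`taxon`, `condition`, `relative_abundance`) and the sample rows are invented for illustration; the actual gutMEGA data table layout has not been checked here and may differ.

```python
import csv
import io

# Made-up rows in an assumed three-column layout.
SAMPLE_TSV = (
    "taxon\tcondition\trelative_abundance\n"
    "Bacteroides\thealthy adult stool\t0.21\n"
    "Prevotella\thealthy adult stool\t0.05\n"
)


def read_events(handle):
    """Parse tab-separated rows into taxon/condition/relative-abundance records."""
    events = []
    for row in csv.DictReader(handle, delimiter="\t"):
        events.append(
            {
                "taxon": row["taxon"].strip(),
                # Free-text condition, kept verbatim for later NER/ontology mapping.
                "condition": row["condition"].strip(),
                "relative_abundance": float(row["relative_abundance"]),
            }
        )
    return events


events = read_events(io.StringIO(SAMPLE_TSV))
```

Keeping the condition string verbatim at this stage leaves room to map it to a reference ontology once one is chosen.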

This dataset provides the literature provenance for the quantitative data in the data table (above):

| File | Description |
| --- | --- |
| Literature information summary | Related information about the curated literature |