monarch-initiative / monarch-ingest

Data ingest application for Monarch Initiative knowledge graph using Koza
https://monarchinitiative.org

QC Review for Monarch Graph #313

Closed: putmantime closed this issue 1 year ago

putmantime commented 2 years ago

Background:

KGX outputs merged_graph_stats.yaml, which is a report of counts by model and source.

Cat-Merge has its own QC report.

We initially used the KGX report; when we transitioned to Cat-Merge, we moved to our own ad hoc reporting code.

We can take the merged KGX files from Cat-Merge and run the KGX report command on them to create the KGX report. We should use that report's schema as the base for standardizing the dashboard code.
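To make "counts by model and source" concrete, here is a minimal sketch of that kind of tally over a KGX-style nodes TSV. It is not the KGX or Cat-Merge implementation. The `category` and `provided_by` column names and the `|` list delimiter are assumed from the usual KGX TSV layout, and the function name, output keys, and sample rows are hypothetical.

```python
import csv
import io
from collections import Counter


def count_nodes(tsv_file, list_delimiter="|"):
    """Count nodes per category and per source in a KGX-style nodes TSV.

    Assumes 'category' and 'provided_by' columns whose multi-valued
    cells are joined with list_delimiter (an assumption, not verified
    against any particular release).
    """
    by_category = Counter()
    by_source = Counter()
    total = 0
    for row in csv.DictReader(tsv_file, delimiter="\t"):
        total += 1
        # Empty or missing cells contribute nothing to the counts.
        for category in filter(None, (row.get("category") or "").split(list_delimiter)):
            by_category[category] += 1
        for source in filter(None, (row.get("provided_by") or "").split(list_delimiter)):
            by_source[source] += 1
    return {
        "total_nodes": total,
        "count_by_category": dict(by_category),
        "count_by_source": dict(by_source),
    }


# Made-up sample rows, for illustration only.
sample = io.StringIO(
    "id\tcategory\tprovided_by\n"
    "EX:1\tbiolink:Gene\tsource_a\n"
    "EX:2\tbiolink:Gene|biolink:NamedThing\tsource_a\n"
    "EX:3\tbiolink:Disease\t\n"
)
report = count_nodes(sample)
print(report)
```

A report shaped like this nested dictionary can be dumped to YAML or JSON, which is what makes a shared schema useful for dashboard code.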

Next steps:

  1. Review the KGX report, then review the Cat-Merge report and the code that generates it.
  2. Design an extension for the KGX report that includes the stats from the Cat-Merge report that are not in KGX but are needed for Monarch.

Helpful Links:

  1. KGX QC Report
  2. Cat-Merge Code
  3. Example Dashboard from KG-COVID-19
  4. Cat-Merge QC Report
  5. SOLR Dashboard

putmantime commented 2 years ago

More relevant links:

putmantime commented 2 years ago

Workflow:

  1. Cat-Merge will import stats.py from KG OBO and generate high-level topology stats.
  2. Cat-Merge will import summarize_graph.py from KGX and generate the summary report.
  3. Merge the reports from steps 1 and 2 into a single report.
  4. Compare our own Cat-Merge graph report with the merged report from step 3.
  5. Add any features still needed from the Cat-Merge report code.
  6. Write a Cat-Merge script to diff two reports (from two different releases).
  7. Build an initial dashboard to present diffs between graph entities.
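
Steps 3 and 6 of the workflow above could look roughly like the following sketch, assuming both reports have already been loaded into nested dictionaries with string keys (for example from YAML). The function names `merge_reports` and `diff_reports`, the report keys, and the numbers are all hypothetical and do not come from Cat-Merge, KGX, or KG OBO.

```python
def merge_reports(*reports):
    """Merge report dicts; nested dicts are merged recursively and,
    on a conflicting leaf, the later report wins."""
    merged = {}
    for report in reports:
        for key, value in report.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                merged[key] = merge_reports(merged[key], value)
            else:
                merged[key] = value
    return merged


def diff_reports(old, new, path=""):
    """Return {dotted_key: (old_value, new_value)} for every entry that differs."""
    changes = {}
    for key in sorted(set(old) | set(new)):
        label = f"{path}.{key}" if path else str(key)
        a, b = old.get(key), new.get(key)
        if isinstance(a, dict) and isinstance(b, dict):
            changes.update(diff_reports(a, b, label))
        elif a != b:
            # Keys present in only one report show up with None on the other side.
            changes[label] = (a, b)
    return changes


# Hypothetical reports for two releases (made-up keys and numbers).
release_a = merge_reports(
    {"topology": {"node_count": 3, "edge_count": 2}},
    {"node_stats": {"count_by_category": {"biolink:Gene": 2, "biolink:Disease": 1}}},
)
release_b = merge_reports(
    {"topology": {"node_count": 4, "edge_count": 2}},
    {"node_stats": {"count_by_category": {"biolink:Gene": 3, "biolink:Disease": 1}}},
)
changes = diff_reports(release_a, release_b)
for key, (before, after) in changes.items():
    print(f"{key}: {before} -> {after}")
```

Keeping the diff as a flat mapping of dotted keys to (old, new) pairs is one possible design choice; it makes the result easy to render as a table in a dashboard.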

amc-corey-cox commented 1 year ago

Imported the portions of KG OBO and KGX that are relevant to the Monarch KG.

KG OBO/grape removed with: https://github.com/monarch-initiative/cat-merge/pull/38