monarch-initiative / monarch-ingest

Data ingest application for Monarch Initiative knowledge graph using Koza
https://monarchinitiative.org

QC Stats on merged kgx files #211

Closed putmantime closed 2 years ago

putmantime commented 2 years ago

We would like to report the following metrics:

Two scenarios:

  1. Before we merge in ontologies

  2. After we merge in ontologies

We need a Jupyter notebook that pulls in the node and edge files and reports the above metrics. It needs to run on the most recent dated directory. This data lives in the monarch-ingest Google bucket.
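One way to pick the most recent dated directory is sketched below. The directory naming is not stated in this thread, so the `YYYY-MM-DD` format is an assumption, and `latest_dated_dir` is a hypothetical helper name. It takes a plain list of directory names (however they were listed from the bucket) and ignores anything that does not parse as a date.

```python
from datetime import datetime


def latest_dated_dir(names, fmt="%Y-%m-%d"):
    """Return the most recent directory name that parses as a date.

    Assumption: dated directories are named like 2022-03-15. Names that
    do not match `fmt` (for example "latest") are skipped.
    """
    dated = []
    for name in names:
        cleaned = name.strip("/")  # tolerate trailing slashes from listings
        try:
            dated.append((datetime.strptime(cleaned, fmt), cleaned))
        except ValueError:
            continue  # not a dated directory
    if not dated:
        raise ValueError("no dated directories found")
    # Tuples compare by the parsed date first, so max() gives the newest.
    return max(dated)[1]
```

The notebook could call this on the bucket listing and then build the paths to the node and edge files under the returned directory.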

Let's look at using a Google Colab notebook to develop this.

kevinschaper commented 2 years ago

As a first pass, let's make a file with these columns

edge_file_name, count of total edges, count of edges with missing subject or object
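A minimal sketch of how one row of that file could be computed, assuming the edge files are tab-separated with `subject` and `object` columns. The function name `edge_qc_row` and the output keys are hypothetical. "Missing" is read here as an empty subject or object; if a set of node IDs is passed in, an edge whose subject or object is not in that set is also counted, which is a second possible reading rather than something this thread settles.

```python
import csv
import io


def edge_qc_row(edge_file_name, tsv_text, node_ids=None):
    """Build one report row for a single edge file.

    tsv_text is the file content as a string. node_ids is an optional
    set of known node IDs (an assumption, not required by the thread).
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    total = 0
    missing = 0
    for row in reader:
        total += 1
        # Short rows yield None for absent columns, so normalise to "".
        subject = (row.get("subject") or "").strip()
        obj = (row.get("object") or "").strip()
        if not subject or not obj:
            missing += 1
        elif node_ids is not None and (
            subject not in node_ids or obj not in node_ids
        ):
            missing += 1
    return {
        "edge_file_name": edge_file_name,
        "total_edges": total,
        "edges_missing_subject_or_object": missing,
    }
```

Calling this once per edge file and writing the resulting dictionaries with `csv.DictWriter` would give the three-column file described above.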

kevinschaper commented 2 years ago

I added myself to this. I'm working in the PR to connect the report to the rest of the code.

victoriasoesanto commented 2 years ago

@kevinschaper I have filled out the functions that we wrote last week, but I have not updated the PR yet. Would you like me to update it before you connect the rest of the code?