monarch-initiative / mondo-ingest

Coordinating the mondo-ingest with external sources
https://monarch-initiative.github.io/mondo-ingest/

TEST Build from main branch with code updates to icd11 component goal and problematic exclusion script #637

Closed: twhetzel closed this 3 months ago

twhetzel commented 3 months ago

Resolves #ISSUE(s).

Overview

This PR includes code updates to the icd11 component goal and the problematic exclusion script.

This PR also includes data updates from a full build of the mondo-ingest pipeline, run as: sh run.sh make build-mondo-ingest -B 2>&1 | tee logs.txt
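One caveat with the command above: in plain sh, a pipeline returns the exit status of its last command, so piping through tee reports tee's status rather than make's. A minimal sketch of a log check follows. The helper name log_has_make_errors is hypothetical, and the pattern assumes GNU make's usual "make: *** ..." failure lines appear in logs.txt.

```shell
#!/bin/sh
# Hypothetical helper (not part of mondo-ingest): succeed (exit 0) when the
# given log file contains a GNU make failure line such as
#   make: *** [Makefile:10: target] Error 1
#   make[1]: *** [target] Error 2
# Assumption: the build output captured by tee keeps make's standard format.
log_has_make_errors() {
    grep -q -E '^make(\[[0-9]+\])?: \*\*\* ' "$1"
}

# Example use after a build (left commented out so that sourcing this file
# has no side effects):
#   if log_has_make_errors logs.txt; then
#       echo "build reported errors; see logs.txt" >&2
#   fi
```

This only catches failures that make itself reports; errors that a tool prints while still exiting 0 would need a separate check.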

NOTE: This is the first data build since icd11.foundation was included in "mondo.sssom.tsv". Xrefs to icd11.foundation were added for the August Mondo release, and the mondo repo was subsequently updated to make sure those xrefs appear in the "mondo.sssom.tsv" file.

PROPOSAL: This is a special situation: it is Week 2 (Data) of the Mondo Release Cycle, when lexical alignments are reviewed and new builds are generally done, and this hotfix code change is needed to fix the build on main. I therefore suggest that, if this PR is approved and Sabrina is OK with it, both the code and the data files be merged into main. @matentzn and @twhetzel will discuss Wednesday.

Pre-merge checklist

Documentation

Was the documentation added/updated under docs/?

QC

Was the full pipeline run before submitting this PR using sh run.sh make build-mondo-ingest on this branch (after docker pull obolibrary/odkfull:dev), and no errors occurred?

New Packages

Were any new Python packages added?

Were any other non-Python packages added?

PR Review and Conversations Resolved

Has the PR been sufficiently reviewed by at least 1 team member of the Mondo Technical team and all threads resolved?

twhetzel commented 3 months ago

Nico and I discussed the change and decided to go with the fix included in this PR. We refactored the SPARQL query. As a test, I removed the icd11.foundation reports (icd11foundation_excluded_terms_in_mondo_xrefs.tsv and icd11foundation_excluded_terms_in_mondo_xrefs_summary.tsv) and re-ran that specific goal as: sh run.sh make reports/icd11foundation_excluded_terms_in_mondo_xrefs.tsv -B. The reports were regenerated without any changes, so another full build is not needed.
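The regenerate-and-compare check described above can be sketched in shell. This is only an illustration under assumptions, not the procedure actually used in the PR: the helper names snapshot_report and files_unchanged are hypothetical, and the comparison is a plain byte-for-byte cmp.

```shell
#!/bin/sh
# Hypothetical helpers (not part of mondo-ingest) for checking that a
# regenerated report is byte-identical to the previous version.

# Copy a report to "<name>.before" so it can be compared after the rebuild.
snapshot_report() {
    cp "$1" "$1.before"
}

# Exit 0 when the two files have identical content, non-zero otherwise.
files_unchanged() {
    cmp -s "$1" "$2"
}

# Example use (left commented out so that sourcing this file has no side
# effects). The report path and make goal are the ones named in the comment:
#   report=reports/icd11foundation_excluded_terms_in_mondo_xrefs.tsv
#   snapshot_report "$report"
#   rm "$report"
#   sh run.sh make "$report" -B
#   files_unchanged "$report.before" "$report" && echo "report unchanged"
```

Inside a git checkout, running git diff --exit-code on the report paths gives the same answer without a snapshot copy.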

We also decided that, in this special case, it is OK to merge both this code hotfix and any data changes into main, since it is Week 2 - Build Week of the Mondo Release Cycle.