onc-healthit / onc-certification-g10-test-kit

ONC Certification (g)(10) Standardized API Tests
Apache License 2.0

terminology:check_built_terminology error (FI-1510) #61

Closed by alexrothenberg 2 years ago

alexrothenberg commented 2 years ago

After creating the terminology validators using the Docker instructions at https://github.com/onc-healthit/onc-certification-g10-test-kit#terminology-support, I get 2 red errors in the UI, and when I check with the rake task I also see the same 2 errors:

$ bundle exec rake terminology:check_built_terminology
/Users/alexrothenberg/.rbenv/versions/2.7.3/lib/ruby/gems/2.7.0/gems/pry-byebug-3.8.0/lib/pry-byebug/control_d_handler.rb:5: warning: control_d_handler's arity of 2 parameters was deprecated (eval_string, pry_instance). Now it gets passed just 1 parameter (pry_instance)
Terminology build results different than expected.
http://hl7.org/fhir/us/core/ValueSet/simple-language: Expected codes: 8212 Actual codes: 8239
urn:ietf:bcp:47: Expected codes: 8533 Actual codes: 8560

I can make the errors go away by changing 2 of the counts in the expected_manifest.yml file:

diff --git a/lib/inferno/terminology/expected_manifest.yml b/lib/inferno/terminology/expected_manifest.yml
index b8ad34d..83afff8 100644
--- a/lib/inferno/terminology/expected_manifest.yml
+++ b/lib/inferno/terminology/expected_manifest.yml
@@ -556,7 +556,7 @@
   - http://terminology.hl7.org/CodeSystem/v2-0131
 - :url: http://hl7.org/fhir/us/core/ValueSet/simple-language
   :file: hl7_org_fhir_us_core_ValueSet_simple-language.msgpack
-  :count: 8212
+  :count: 8239
   :type: bloom
   :code_systems:
   - urn:ietf:bcp:47
@@ -668,7 +668,7 @@
   - http://hl7.org/fhir/provenance-entity-role
 - :url: urn:ietf:bcp:47
   :file: urn_ietf_bcp_47.msgpack
-  :count: 8533
+  :count: 8560
   :type: bloom
   :code_systems: urn:ietf:bcp:47
 - :url: http://terminology.hl7.org/CodeSystem/allergyintolerance-clinical
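
For readers following along, the kind of comparison behind the output above can be sketched roughly as follows. This is only an illustration, not the test kit's actual rake task: `count_mismatches` and `load_manifest` are hypothetical helper names, and the sketch assumes each manifest is a top-level YAML list of entries with `:url` and `:count` keys, as the diff suggests. The counts are the ones reported in this issue.

```ruby
require 'yaml'

# Hypothetical helper, not part of the test kit: compares two manifests
# (arrays of hashes with :url and :count keys) and returns one hash per
# URL whose counts differ.
def count_mismatches(expected, actual)
  actual_counts = actual.to_h { |entry| [entry[:url], entry[:count]] }

  expected.each_with_object([]) do |entry, mismatches|
    actual_count = actual_counts[entry[:url]]
    next if actual_count == entry[:count]

    mismatches << { url: entry[:url], expected: entry[:count], actual: actual_count }
  end
end

# The manifest entries use symbol keys (":url:"), so Symbol has to be
# permitted explicitly when loading with safe_load.
def load_manifest(yaml_text)
  YAML.safe_load(yaml_text, permitted_classes: [Symbol])
end

# Assumed shape: a top-level list of entries, trimmed to the two in question.
expected = load_manifest(<<~MANIFEST)
  - :url: http://hl7.org/fhir/us/core/ValueSet/simple-language
    :count: 8212
  - :url: urn:ietf:bcp:47
    :count: 8533
MANIFEST

actual = load_manifest(<<~MANIFEST)
  - :url: http://hl7.org/fhir/us/core/ValueSet/simple-language
    :count: 8239
  - :url: urn:ietf:bcp:47
    :count: 8560
MANIFEST

count_mismatches(expected, actual).each do |m|
  puts "#{m[:url]}: Expected codes: #{m[:expected]} Actual codes: #{m[:actual]}"
end
```

A URL that is missing from the actual manifest shows up with a `nil` actual count in this sketch, which is one way to surface a validator that was never built.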

I'm not sure whether the UMLS data at the NIH has changed, in which case the expectations should be updated, or whether the data I've downloaded into the msgpack files is incorrect.

I'd appreciate any suggestions or ideas.

Thanks

Jammjammjamm commented 2 years ago

These are sets of codes that live outside of UMLS, and as such can change unexpectedly. As a result, we need to relax how we check their validity.

So, this does not indicate a problem with your terminology build, and I am currently working on a fix which will be included in the next release.
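
One way such a relaxed check could look is sketched below. This is purely an assumption about the general idea, not the fix that shipped: `count_acceptable?` is a hypothetical name, and "relaxed" here simply means that a listed code system only needs a non-empty validator instead of an exact count match.

```ruby
# Hypothetical sketch, not the test kit's implementation.
# For URLs in relaxed_urls (code systems maintained outside UMLS), any
# non-empty validator is accepted; all other URLs need an exact match.
def count_acceptable?(url, expected_count, actual_count, relaxed_urls)
  if relaxed_urls.include?(url)
    !actual_count.nil? && actual_count.positive?
  else
    actual_count == expected_count
  end
end

# The two URLs from the error output in this issue.
relaxed_urls = [
  'http://hl7.org/fhir/us/core/ValueSet/simple-language',
  'urn:ietf:bcp:47'
]

# Counts reported in this issue: accepted under the relaxed rule.
puts count_acceptable?('urn:ietf:bcp:47', 8533, 8560, relaxed_urls)

# Placeholder URL and counts (not real data): still checked strictly.
puts count_acceptable?('http://example.org/strict-value-set', 10, 11, relaxed_urls)
```

Other relaxations are equally plausible, for example allowing a percentage tolerance around the expected count.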

alexrothenberg commented 2 years ago

Thanks for the update & good luck with the fix you're working on.

Out of curiosity, is there documentation of what terminologies are and how they're used? The instructions for generating them in the readme are very clear, but I couldn't find a description of what they are or which tests rely on them. I don't think this existed in the 1.9 version of Inferno, so it's new to us.

Jammjammjamm commented 2 years ago

It's basically all of the terminologies used by US Core. Terminology validation is performed as part of the resource validation tests (tests with titles like ___ resources returned during previous tests conform to the ___ profile).
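
As a rough illustration of what terminology validation means here: a validator answers whether a given system/code pair belongs to a value set. The sketch below swaps in a plain Ruby `Set` for the built validators (which the manifest describes as msgpack files, some of `:type: bloom`), so it is not how the test kit stores or queries them. `build_validator` and `valid_coding?` are hypothetical names, and the two language codes are a tiny illustrative subset.

```ruby
require 'set'

# Hypothetical stand-in for a built validator: a Set of "system|code" keys.
def build_validator(codings)
  Set.new(codings.map { |coding| "#{coding[:system]}|#{coding[:code]}" })
end

# True when the system/code pair is a member of the value set.
def valid_coding?(validator, system, code)
  validator.include?("#{system}|#{code}")
end

# Illustrative subset only; the real value set holds thousands of codes.
simple_language = build_validator([
  { system: 'urn:ietf:bcp:47', code: 'en' },
  { system: 'urn:ietf:bcp:47', code: 'es' }
])

puts valid_coding?(simple_language, 'urn:ietf:bcp:47', 'en')
puts valid_coding?(simple_language, 'urn:ietf:bcp:47', 'not-a-language')
```

A resource validation test would run a check like this for each coded element that a profile binds to a value set.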

The expected manifest file that you found lists them all. These were used in 1.9, as well. If you go to the hosted version of 1.9 and click on Version 1.9.0 in the bottom right, you will see a similar list of terminologies and code counts. We are working to add a similar display to the new version.

alexrothenberg commented 2 years ago

Makes sense. Thanks again for the info & help.