timsbiomed / issues

TIMS issue tracker.
https://github.com/orgs/timsbiomed/projects/9/views/1

Requirement: Content: CodeSystems #4

Open joeflack4 opened 2 years ago

joeflack4 commented 2 years ago

Sub-issues

Sub-tasks

Task Details

1. Get files to import that are not from OMOP, and upload to GoogleDrive.

HAPI supports the native .zip formats of ICD10CM, SNOMED-CT, and LOINC, I believe.

For all other code systems, we will need to construct a .zip file with 2-3 particular CSVs. The docs should have more information about that, but Shahim is also somewhat familiar.
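A minimal sketch of building such a .zip in Python. The file names `concepts.csv` and `hierarchy.csv` and their column headers (`CODE`, `DISPLAY`, `PARENT`, `CHILD`) are my recollection of HAPI's custom terminology format, so treat them as assumptions and check them against the docs. The function name is hypothetical.

```python
import csv
import io
import zipfile


def build_custom_terminology_zip(zip_path, concepts, hierarchy=()):
    """Write a .zip holding concepts.csv and, if given, hierarchy.csv.

    concepts:  iterable of (code, display) pairs
    hierarchy: iterable of (parent_code, child_code) pairs
    File and column names are assumptions about HAPI's custom format.
    """
    def to_csv(header, rows):
        # Render rows to CSV text in memory with a header line.
        buf = io.StringIO()
        writer = csv.writer(buf, lineterminator="\n")
        writer.writerow(header)
        writer.writerows(rows)
        return buf.getvalue()

    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("concepts.csv", to_csv(["CODE", "DISPLAY"], concepts))
        if hierarchy:
            zf.writestr("hierarchy.csv", to_csv(["PARENT", "CHILD"], hierarchy))
    return zip_path
```

The resulting file could then be passed to `hapi-fhir-cli upload-terminology -d <zip> ...` with the code system's own URL after `-u`.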

2a. Evaluate means of upload

Useful resources

3. Additional configuration

Shahim 2022/02/17:

Other relevant things we'll probably need to deal with at some point to really have a FHIR terminology server that works like tx.fhir.org: https://hapifhir.io/hapi-fhir/docs/server_plain/customizing_the_capabilitystatement.html and http://hl7.org/fhir/r4/terminologycapabilities.html

tx.fhir.org is not HAPI but I'm tracing how to reproduce what tx.fhir.org does. See this as an example of what we need to reproduce: http://tx.fhir.org/r4/metadata?_format=json&mode=terminology

This is what my default HAPI shows: http://20.119.216.32:8888/pheno1r4/fhir/metadata?mode=terminology&_format=json

Loaded code systems don't show up by default in the TerminologyCapabilities statement. I'm guessing we need to customize HAPI with interceptors, as the doc page describes.

Joe 2022/02/17: Some differences I noticed:

http://tx.fhir.org/r4/metadata?_format=json&mode=terminology

{
  "resourceType": "TerminologyCapabilities",
  "id": "FhirServer",
  "url": "http://fhir.healthintersections.com.au/open/metadata",
  "version": "1.0.0",
  "name": "FHIR Reference Server Teminology Capability Statement",
  "description": "Standard Teminology Capability Statement for the open source Reference FHIR Server provided by Health Intersections",
  "codeSystem": [
    {
      "uri": "http://hl7.org/fhir/sid/icd-10-cm"
    },

http://20.119.216.32:8888/pheno1r4/fhir/metadata?mode=terminology&_format=json

{
  "resourceType": "CapabilityStatement",
  "name": "RestServer",
  "software": {
    "name": "HAPI FHIR Server",
    "version": "5.6.0"
  },
  "implementation": {
    "description": "HAPI FHIR R4 Server",
    "url": "http://20.119.216.32:8888/pheno1r4/fhir"
  },
  "fhirVersion": "4.0.1",
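A small sketch for comparing the two responses programmatically. It takes an already-parsed metadata document (for example, the result of `json.load` on either response) and reports the resource type plus any `codeSystem` URIs. The function name and the shape of the returned summary are made up for illustration.

```python
def summarize_terminology_metadata(doc):
    """Summarize a parsed /metadata?mode=terminology response."""
    resource_type = doc.get("resourceType")
    # TerminologyCapabilities lists code systems under "codeSystem";
    # a plain CapabilityStatement has no such element.
    uris = [cs["uri"] for cs in doc.get("codeSystem", []) if "uri" in cs]
    return {
        "resourceType": resource_type,
        "is_terminology_capabilities": resource_type == "TerminologyCapabilities",
        "code_system_uris": uris,
    }


# Abbreviated versions of the two excerpts above.
tx_doc = {
    "resourceType": "TerminologyCapabilities",
    "codeSystem": [{"uri": "http://hl7.org/fhir/sid/icd-10-cm"}],
}
hapi_doc = {"resourceType": "CapabilityStatement", "name": "RestServer"}

print(summarize_terminology_metadata(tx_doc))
print(summarize_terminology_metadata(hapi_doc))
```

Run against the excerpts, this makes the main difference explicit: the HAPI response is a CapabilityStatement and carries no `codeSystem` list at all.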

Notes, by CodeSystem

HPO

See: #1

CPT4

Instructions taken from: https://github.com/National-COVID-Cohort-Collaborative/Data-Ingestion-and-Harmonization/wiki/OMOP-Vocabulary-Updates

The CPT-4 vocabulary is reconstituted to the concept table manually and updated. See below for details.

  1. Unpack the zip files.
  2. After unpacking, open a command line in the directory you unpacked all the files into and run "java -Dumls-user=xxx -Dumls-password=xxx -jar cpt4.jar 5", replacing "xxx" with your UMLS username and password. (Note: this step updates the CONCEPT.csv file by merging in the CPT4 codes. It is good practice to create a backup copy of CONCEPT.csv before running this utility.)
  3. Back up the OMOP vocabulary tables.
  4. Truncate the OMOP vocabulary tables.
  5. Load the newly downloaded/updated *.csv files into the OMOP vocabulary tables. Important: all vocabularies are fully represented in the downloaded files with the exception of CPT-4. OHDSI does not have a distribution license to ship CPT-4 codes together with the descriptions. Therefore, CPT4 vocabularies are updated manually: log in to the UMLS site, download the CPT4 codes directly, and merge the update.
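The backup recommended before running the CPT4 utility could be sketched like this. The helper name and the timestamped naming scheme are my own choices, not part of the N3C instructions.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_file(path, backup_dir=None):
    """Copy a file to a timestamped backup and return the backup path."""
    src = Path(path)
    if not src.is_file():
        raise FileNotFoundError(src)
    # Default to the same directory as the source file.
    dest_dir = Path(backup_dir) if backup_dir else src.parent
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%dT%H%M%S")
    dest = dest_dir / f"{src.stem}.{stamp}.bak{src.suffix}"
    shutil.copy2(src, dest)  # copy2 also preserves file timestamps
    return dest
```

For example, `backup_file("CONCEPT.csv")` would be called in the unpacked directory before step 2.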

Resources

Davera wrote:

Reference terminology sources

I'll leave some notes / links here re: terminology "sources of truth." Maybe there will be other reference terminology resources we can add to this issue. But I wanted to anchor some comments I've been making in some endpoints for you folks to take a look at.

Related

joeflack4 commented 2 years ago

Davera cautioned me against using Athena OHDSI OMOP CSVs as our source vocabularies because they are lossy: OHDSI made changes, and not all information from the source vocabularies is there.

Siggie mentioned it might be a good idea to upload 2 versions of code systems, just for experimentation / knowledge gaining purposes: an OHDSI version, and the original version. Then we can see the differences and might learn something valuable.

joeflack4 commented 2 years ago

For reference, I was able to use the hapi-fhir-cli to upload ICD10CM, but not LOINC or SNOMED.

I posted this issue in the hapi-fhir repo, but no response yet: https://github.com/hapifhir/hapi-fhir/issues/3276

SNOMED

make upload-snomed
hapi-fhir-cli upload-terminology -d ./data/terminologies/SNOMED/SnomedCT_USEditionRF2_PRODUCTION_20210901T120000Z.zip -v r4 -t http://20.119.216.32:8080/fhir -u http://snomed.info/sct
------------------------------------------------------------
🔥  HAPI FHIR 5.6.0 - Command Line Tool
------------------------------------------------------------
Process ID                      : 85779@EB-IM-R0KHMD6R.local
Max configured JVM memory (Xmx) : 8.0GB
Detected Java version           : 17.0.1
------------------------------------------------------------
2022-02-16 18:56:47.343 [main] INFO  ca.uhn.fhir.cli.App Logging configuration set from file logback-cli-on.xml
2022-02-16 18:56:48.67 [main] INFO  c.u.f.c.UploadTerminologyCommand Adding ZIP file: ./data/terminologies/SNOMED/SnomedCT_USEditionRF2_PRODUCTION_20210901T120000Z.zip
2022-02-16 18:56:49.552 [main] INFO  c.u.f.c.UploadTerminologyCommand File size is greater than 10 MB - Going to use a local file reference instead of a direct HTTP transfer. Note that this will only work when executing this command on the same server as the FHIR server itself.
2022-02-16 18:56:51.599 [main] INFO  c.u.f.c.UploadTerminologyCommand Beginning upload - This may take a while...
2022-02-16 18:56:52.388 [main] ERROR c.u.f.c.UploadTerminologyCommand Received the following response:
{
  "resourceType": "OperationOutcome",
  "text": {
    "status": "generated",
    "div": "<div xmlns=\"http://www.w3.org/1999/xhtml\"><h1>Operation Outcome</h1><table border=\"0\"><tr><td style=\"font-weight: bold;\">ERROR</td><td>[]</td><td><pre>Unknown file: hapi-fhir-cli3131347798514327148.zip</pre></td>\n\t\t\t</tr>\n\t\t</table>\n\t</div>"
  },
  "issue": [ {
    "severity": "error",
    "code": "processing",
    "diagnostics": "Unknown file: hapi-fhir-cli3131347798514327148.zip"
  } ]
}
2022-02-16 18:56:52.391 [main] ERROR ca.uhn.fhir.cli.App Error during execution:
ca.uhn.fhir.rest.server.exceptions.InvalidRequestException: HTTP 400 : Unknown file: hapi-fhir-cli3131347798514327148.zip
    at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499)
    at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:480)
    at ca.uhn.fhir.rest.server.exceptions.BaseServerResponseException.newInstance(BaseServerResponseException.java:305)
    at ca.uhn.fhir.rest.client.impl.BaseClient.invokeClient(BaseClient.java:351)
    at ca.uhn.fhir.rest.client.impl.GenericClient$BaseClientExecutable.invoke(GenericClient.java:540)
    at ca.uhn.fhir.rest.client.impl.GenericClient$OperationInternal.execute(GenericClient.java:1319)
    at ca.uhn.fhir.cli.UploadTerminologyCommand.invokeOperation(UploadTerminologyCommand.java:216)
    at ca.uhn.fhir.cli.UploadTerminologyCommand.run(UploadTerminologyCommand.java:123)
    at ca.uhn.fhir.cli.BaseApp.run(BaseApp.java:253)
    at ca.uhn.fhir.cli.App.main(App.java:43)
2022-02-16 18:56:52.392 [Thread-0] INFO  ca.uhn.fhir.cli.App HAPI FHIR is shutting down...
make: *** [upload-snomed] Error 1

LOINC

make upload-loinc
hapi-fhir-cli upload-terminology -d ./data/terminologies/LOINC/Loinc_2.71_MultiAxialHierarchy_3.5.zip -d /Users/joeflack4/projects/hapi-fhir-jpaserver-starter/data/terminologies/LOINC/_archive/_Full/Loinc_2.71.zip -v r4 -t http://20.119.216.32:8080/fhir -u http://loinc.org
------------------------------------------------------------
🔥  HAPI FHIR 5.6.0 - Command Line Tool
------------------------------------------------------------
Process ID                      : 85841@EB-IM-R0KHMD6R.local
Max configured JVM memory (Xmx) : 8.0GB
Detected Java version           : 17.0.1
------------------------------------------------------------
2022-02-16 18:57:08.12 [main] INFO  ca.uhn.fhir.cli.App Logging configuration set from file logback-cli-on.xml
2022-02-16 18:57:08.740 [main] INFO  c.u.f.c.UploadTerminologyCommand Adding ZIP file: ./data/terminologies/LOINC/Loinc_2.71_MultiAxialHierarchy_3.5.zip
2022-02-16 18:57:08.751 [main] INFO  c.u.f.c.UploadTerminologyCommand Adding ZIP file: /Users/joeflack4/projects/hapi-fhir-jpaserver-starter/data/terminologies/LOINC/_archive/_Full/Loinc_2.71.zip
2022-02-16 18:57:10.320 [main] INFO  c.u.f.c.UploadTerminologyCommand File size is greater than 10 MB - Going to use a local file reference instead of a direct HTTP transfer. Note that this will only work when executing this command on the same server as the FHIR server itself.
2022-02-16 18:57:11.642 [main] INFO  c.u.f.c.UploadTerminologyCommand Beginning upload - This may take a while...
2022-02-16 18:57:13.612 [main] ERROR c.u.f.c.UploadTerminologyCommand Received the following response:
{
  "resourceType": "OperationOutcome",
  "text": {
    "status": "generated",
    "div": "<div xmlns=\"http://www.w3.org/1999/xhtml\"><h1>Operation Outcome</h1><table border=\"0\"><tr><td style=\"font-weight: bold;\">ERROR</td><td>[]</td><td><pre>Unknown file: hapi-fhir-cli10749622764325811546.zip</pre></td>\n\t\t\t</tr>\n\t\t</table>\n\t</div>"
  },
  "issue": [ {
    "severity": "error",
    "code": "processing",
    "diagnostics": "Unknown file: hapi-fhir-cli10749622764325811546.zip"
  } ]
}
2022-02-16 18:57:13.613 [main] ERROR ca.uhn.fhir.cli.App Error during execution:
ca.uhn.fhir.rest.server.exceptions.InvalidRequestException: HTTP 400 : Unknown file: hapi-fhir-cli10749622764325811546.zip
    at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499)
    at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:480)
    at ca.uhn.fhir.rest.server.exceptions.BaseServerResponseException.newInstance(BaseServerResponseException.java:305)
    at ca.uhn.fhir.rest.client.impl.BaseClient.invokeClient(BaseClient.java:351)
    at ca.uhn.fhir.rest.client.impl.GenericClient$BaseClientExecutable.invoke(GenericClient.java:540)
    at ca.uhn.fhir.rest.client.impl.GenericClient$OperationInternal.execute(GenericClient.java:1319)
    at ca.uhn.fhir.cli.UploadTerminologyCommand.invokeOperation(UploadTerminologyCommand.java:216)
    at ca.uhn.fhir.cli.UploadTerminologyCommand.run(UploadTerminologyCommand.java:123)
    at ca.uhn.fhir.cli.BaseApp.run(BaseApp.java:253)
    at ca.uhn.fhir.cli.App.main(App.java:43)
2022-02-16 18:57:13.614 [Thread-0] INFO  ca.uhn.fhir.cli.App HAPI FHIR is shutting down...
make: *** [upload-loinc] Error 1
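
One possible reading of both logs: the CLI notes that files over 10 MB are sent as a local file reference, which "will only work when executing this command on the same server as the FHIR server itself," and here the command ran against a remote address. The sketch below is a pre-flight check built on that reading only. Whether the limit applies to the combined size or to each file, and whether it is exactly 10 * 1024 * 1024 bytes, are assumptions, as are the function names.

```python
import os
from urllib.parse import urlparse

# Assumption: "10 MB" in the log means 10 * 1024 * 1024 bytes.
TRANSFER_LIMIT_BYTES = 10 * 1024 * 1024
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def is_local_target(target_url):
    """True if the FHIR base URL points at this machine (simple hostname check)."""
    return urlparse(target_url).hostname in LOCAL_HOSTS


def upload_likely_to_fail(zip_paths, target_url, limit=TRANSFER_LIMIT_BYTES):
    """Guess whether the CLI would fall back to a local file reference
    that a remote server cannot resolve."""
    # Assumption: the limit is compared against the combined size.
    total = sum(os.path.getsize(p) for p in zip_paths)
    return total > limit and not is_local_target(target_url)
```

If this returns True for the SNOMED or LOINC zips with `http://20.119.216.32:8080/fhir`, running the CLI on the server host itself would be one thing to try.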

joeflack4 commented 2 years ago

Apparently, the ICD10CM upload didn't include concepts, for some reason: http://20.119.216.32:8080/fhir/CodeSystem/1?_format=json

{
  "resourceType": "CodeSystem",
  "id": "1",
  "meta": {
    "versionId": "1",
    "lastUpdated": "2022-02-16T23:56:06.161+00:00",
    "source": "#hoo7Oyh75NdyGFOP"
  },
  "url": "http://hl7.org/fhir/sid/icd-10-cm",
  "version": "2021",
  "name": "ICD-10-CM",
  "status": "active",
  "content": "not-present"
}
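
A quick way to double-check this, sketched in Python. The first helper only looks for inline `concept` entries in the resource. As far as I recall (unverified here), HAPI can keep uploaded concepts in its own terminology tables and still report `"content": "not-present"`, so the second helper builds a standard FHIR `$lookup` URL to test a single code against the server. The helper names and the sample code `A00.0` are illustrative.

```python
from urllib.parse import urlencode


def has_inline_concepts(code_system):
    """True if the CodeSystem resource itself carries a non-empty concept list."""
    return bool(code_system.get("concept"))


def build_lookup_url(base_url, system, code):
    """Build a CodeSystem/$lookup URL for one code in one system."""
    query = urlencode({"system": system, "code": code})
    return f"{base_url.rstrip('/')}/CodeSystem/$lookup?{query}"


print(build_lookup_url(
    "http://20.119.216.32:8080/fhir",
    "http://hl7.org/fhir/sid/icd-10-cm",
    "A00.0",
))
```

If the `$lookup` request returns a Parameters resource with a display name, the concepts are loaded even though the resource shows none inline.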

joeflack4 commented 1 year ago

http://fhir.org/guides/stats/ - Terminology IG / server index. Good source of IGs and content that we might want to load into TIMS server. May include some of what we could use for this issue.

richard1933 commented 1 year ago

These are HTML pages; I wonder if there is a way to parse them out. E.g., http://fhir.org/guides/stats/valueset-hl7.fhir.uv.saner-allcovid19riskfactors.html seems like a good fit for a concept set.

joeflack4 commented 1 year ago

Yeah, definitely parseable using some screen scraping. But I imagine that this ValueSet is being fetched from some FHIR endpoint.
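
If the ValueSet can be obtained as FHIR JSON instead of HTML, pulling out its codes is straightforward. A sketch, assuming a parsed ValueSet dict that lists codes under `compose.include[].concept` and/or `expansion.contains`; the function name and the sample data are made up.

```python
def codes_from_valueset(value_set):
    """Return unique (system, code) pairs from a parsed ValueSet, in order."""
    pairs = []
    # Codes enumerated explicitly in the definition.
    for include in value_set.get("compose", {}).get("include", []):
        system = include.get("system")
        for concept in include.get("concept", []):
            pairs.append((system, concept.get("code")))
    # Codes listed in an expansion, if the server provided one.
    for item in value_set.get("expansion", {}).get("contains", []):
        pairs.append((item.get("system"), item.get("code")))
    # De-duplicate while preserving first-seen order.
    return list(dict.fromkeys(pairs))


sample = {
    "resourceType": "ValueSet",
    "compose": {
        "include": [
            {
                "system": "http://example.org/cs",
                "concept": [{"code": "a"}, {"code": "b"}],
            }
        ]
    },
}
print(codes_from_valueset(sample))
```

Includes that select codes by filter rather than by enumeration yield nothing here; those would need a server-side `$expand`.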