arkhn / jpaltime

Apache License 2.0

Improve modeling and text search of DocumentReference text content #20

Closed simonvadee closed 2 years ago

simonvadee commented 3 years ago

Problem

We currently use a custom extension, documentreference-raw-text, to index the text content of a document, but we want to use DocumentReference.attachment.data instead. Relying on a custom extension is an interoperability issue.

Description

As described in arkhn/Cohort360#97, we want to change the FHIR attribute where the text content of a document is stored. With the custom extension, the text is currently stored in DocumentReference.extension[0].valueString, which is of type string. That allows it to be indexed in ES through myContentText of the ResourceTable dataclass (see parseContentTextIntoWords in hapi-fhir-jpaserver-base/src/main/java/ca/uhn/fhir/jpa/dao/BaseHapiFhirDao.java). We want to use DocumentReference.attachment.data instead, which is the appropriate place according to the FHIR model. However, that field is of type base64Binary, which is not indexed in ES, so we cannot run full-text searches on it.
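To illustrate the gap, here is a minimal sketch of what any solution would have to do before indexing: decode the base64Binary payload back into text and split it into words. It is not HAPI FHIR code. The class AttachmentTextSketch and its methods decodeAttachmentText and toWords are hypothetical names, the tokenizer is a naive stand-in for whatever parseContentTextIntoWords actually does, and it assumes the attachment holds UTF-8 plain text with no line breaks in the base64 string.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical helper, not part of HAPI FHIR.
final class AttachmentTextSketch {

    // Decode a base64Binary value, assuming the bytes are UTF-8 plain text.
    // The basic decoder rejects whitespace and line breaks in the input.
    static String decodeAttachmentText(String base64Data) {
        byte[] raw = Base64.getDecoder().decode(base64Data);
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Naive tokenizer: lowercase, then split on anything that is not a letter or digit.
    static List<String> toWords(String text) {
        return Arrays.stream(text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+"))
                .filter(word -> !word.isEmpty())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Simulate what a client would put in the attachment data field.
        String encoded = Base64.getEncoder()
                .encodeToString("Patient has fever.".getBytes(StandardCharsets.UTF_8));

        String text = decodeAttachmentText(encoded);
        System.out.println(toWords(text)); // prints [patient, has, fever]
    }
}
```

The open question is where such a decoding step could be hooked in so that the decoded text reaches the full-text index without forking.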

Alternatives

I don't know yet. I hope we can avoid forking, but I don't see how yet.

Additional context

Implementation

simonvadee commented 2 years ago

We don't have a use case for this anymore. This issue may be reopened if we decide to re-investigate, but we'll probably implement this some other way.