beda-software / FHIRPathMappingLanguage

MIT License

FHIRPathMappingLanguage

Motivation

Data mapping is a high-demand topic. There are many products that try to address it.
Even FHIR provides a specification called FHIR Mapping Language that should cover this gap. Unfortunately, open-source implementations of the FHIR Mapping Language are scarce. Furthermore, it is a complicated tool, and its mappings are hard to create, debug, and maintain in the long term. Please check real-life examples.

A mapping issue was encountered while implementing an extraction operation for FHIR SDC.
Instead of using the FHIR Mapping Language, an alternative was sought and found in JUTE. JUTE is a powerful engine that offers a pleasant experience in creating mappers. Its data DSL nature is a significant advantage: a mapper is an FHIR resource with some values replaced by JUTE expressions/directives. Please have a look at this mapper. It is pretty easy to understand what is going on there, especially if you compare it with the FHIR Mapping Language version.

Unfortunately, JUTE provides its own syntax and approach for path expressions, while FHIRPath is more convenient for querying data from FHIR resources, especially QuestionnaireResponse. JUTE provides an API for adding custom functions to the engine, so a fhirpath function was embedded. As a result, almost every JUTE expression calls the fhirpath function (see jute.yaml). This approach appears to be an overhead, which prompted the decision to replace the JUTE path engine with FHIRPath and make the language FHIRPath-native.

A similar approach in the FHIR world is called fhir-xquery, inspired by the Liquid template language. fhir-xquery uses {{ and }} to build dynamic query strings. This approach was adopted instead of the $ sign used in JUTE to identify an expression.

Finally, a data DSL should be LLM-friendly, and there should be an easy way to generate a mapper from a text description. ChatGPT works pretty well with JSON and FHIRPath, so you can copy and paste the specification into ChatGPT and try to generate mappers.

Specification

The FHIRPath mapping language is a data DSL designed to convert data from QuestionnaireResponse (and other sources) to any FHIR resource.

Here is how it works.

Suppose there is a QuestionnaireResponse describing a patient:

{
    "resourceType": "QuestionnaireResponse",
    "status": "completed",
    "item": [
        {
            "text": "Name",
            "linkId": "1",
            "answer": [
                {
                    "valueString": "Ilya"
                }
            ]
        },
        {
            "text": "Birth date",
            "linkId": "2",
            "answer": [
                {
                    "valueDate": "2023-05-03"
                }
            ]
        },
        {
            "text": "gender",
            "linkId": "4.1",
            "answer": [
                {
                    "valueCoding": {
                        "code": "male",
                        "display": "Male",
                        "system": "http://hl7.org/fhir/administrative-gender"
                    }
                }
            ]
        },
        {
            "text": "Phone",
            "linkId": "phone",
            "answer": [
                {
                    "valueString": "+232319898"
                }
            ]
        },
        {
            "text": "email",
            "linkId": "email",
            "answer": [
                {
                    "valueString": "foo@yahoo.com"
                }
            ]
        },
        {
            "text": "country",
            "linkId": "country",
            "answer": [
                {
                    "valueString": "US"
                }
            ]
        }
    ]
}

To map it to a Patient FHIR resource, define the structure of the resource.

This mapper:

{
    "resourceType": "Patient"
}

is a valid mapper that returns exactly the same structure:

{
    "resourceType": "Patient"
}

All strings are treated as constant values unless they start with {{ and end with }}. The text inside {{ and }} is a FHIRPath expression.
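
As a rough illustration of this rule, the sketch below classifies a template string as a constant or as an expression. The helper name `classify_template_string` is hypothetical and not taken from the reference implementation; it covers only the whole-string {{ ... }} form and ignores the other delimiters described later in this document.

```python
from typing import Tuple


def classify_template_string(value: str) -> Tuple[str, str]:
    """Classify a template string as a constant or a FHIRPath expression.

    Hypothetical helper for illustration: only the whole-string rule is handled.
    """
    if value.startswith("{{") and value.endswith("}}"):
        # Strip the two-character delimiters and the padding around the expression.
        return ("expression", value[2:-2].strip())
    return ("constant", value)


print(classify_template_string("Patient"))
print(classify_template_string("{{ QuestionnaireResponse.id }}"))
```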

To extract the patient's birthDate, use:

{
    "resourceType": "Patient",
    "birthDate": "{{ QuestionnaireResponse.repeat(item).where(linkId='2').answer.value }}"
}

The result will be:

{
    "resourceType": "Patient",
    "birthDate": "2023-05-03"
}

To extract the name, phone number, and email fields:

{
    "resourceType": "Patient",
    "birthDate": "{{ QuestionnaireResponse.repeat(item).where(linkId='2').answer.value }}",
    "name": [
        {
            "given": [
                "{{ QuestionnaireResponse.repeat(item).where(linkId='1').answer.value }}"
            ]
        }
    ],
    "telecom": [
        {
            "value": "{{ QuestionnaireResponse.repeat(item).where(linkId='phone').answer.value }}",
            "system": "phone"
        },
        {
            "value": "{{ QuestionnaireResponse.repeat(item).where(linkId='email').answer.value }}",
            "system": "email"
        }
    ]
}

To extract gender, a more complex expression is needed:

QuestionnaireResponse.repeat(item).where(linkId='4.1').answer.value.code

because the patient's gender is a token while the question item type is Coding.

The final mapper will look like this:

{
    "resourceType": "Patient",
    "birthDate": "{{ QuestionnaireResponse.repeat(item).where(linkId='2').answer.value }}",
    "name": [
        {
            "given": [
                "{{ QuestionnaireResponse.repeat(item).where(linkId='1').answer.value }}"
            ]
        }
    ],
    "telecom": [
        {
            "value": "{{ QuestionnaireResponse.repeat(item).where(linkId='phone').answer.value }}",
            "system": "phone"
        },
        {
            "value": "{{ QuestionnaireResponse.repeat(item).where(linkId='email').answer.value }}",
            "system": "email"
        }
    ],
    "gender": "{{ QuestionnaireResponse.repeat(item).where(linkId='4.1').answer.value.code }}"
}

Null key removal

If an expression resolves to an empty set {}, the key will be removed from the object.

For example, if the gender field is missing in the QuestionnaireResponse from the example above:

{
    "resourceType": "Patient",
    "gender": "{{ QuestionnaireResponse.repeat(item).where(linkId='4.1').answer.value.code }}"
}

this template will be mapped into:

{
    "resourceType": "Patient"
}

Null key retention

Note: This feature is not mature enough and might change in the future.

To preserve the null value in the final result, use {{+ and +}} instead of {{ and }}:

{
    "resourceType": "Patient",
    "gender": "{{+ QuestionnaireResponse.repeat(item).where(linkId='4.1').answer.value.code +}}"
}

The result will be:

{
    "resourceType": "Patient",
    "gender": null
}
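
The difference between key removal and key retention can be sketched as follows. This is a minimal illustration, not the reference implementation: `evaluate` is a stand-in for a real FHIRPath engine, the names `DROP`, `resolve_value`, and `resolve_object` are hypothetical, and unwrapping a result to its first element is an assumption based on the examples above.

```python
from typing import Any, Callable, Dict, List

# Sentinel that distinguishes "drop this key" from a retained null (None).
DROP = object()


def resolve_value(value: Any, evaluate: Callable[[str], List[Any]]) -> Any:
    """Resolve one template value; `evaluate` stands in for a FHIRPath engine."""
    if not isinstance(value, str):
        return value
    if value.startswith("{{+") and value.endswith("+}}"):
        matches = evaluate(value[3:-3].strip())
        return matches[0] if matches else None  # empty result: keep the key as null
    if value.startswith("{{") and value.endswith("}}"):
        matches = evaluate(value[2:-2].strip())
        return matches[0] if matches else DROP  # empty result: remove the key
    return value


def resolve_object(template: Dict[str, Any], evaluate: Callable[[str], List[Any]]) -> Dict[str, Any]:
    """Resolve every value of a flat object and drop keys marked with DROP."""
    resolved = {}
    for key, value in template.items():
        result = resolve_value(value, evaluate)
        if result is not DROP:
            resolved[key] = result
    return resolved


def always_empty(expression: str) -> List[Any]:
    """Stand-in evaluator: every expression resolves to an empty collection."""
    return []


removed = resolve_object({"resourceType": "Patient", "gender": "{{ gender }}"}, always_empty)
retained = resolve_object({"resourceType": "Patient", "gender": "{{+ gender +}}"}, always_empty)
print(removed)   # {'resourceType': 'Patient'}
print(retained)  # {'resourceType': 'Patient', 'gender': None}
```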

Automatic array flattening and null removal

In FHIR resources, arrays of arrays and arrays of nulls are invalid constructions. To simplify writing mappers, there is automatic array flattening.

For example:

{
    "list": [
        [
            1, 2, null, 3
        ],
        null,
        [
            4, 5, 6, null
        ]  
    ]
}

will be mapped into:

{
    "list": [
        1, 2, 3, 4, 5, 6
    ]
}

This is especially useful when conditional and iteration logic is used.
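
A minimal sketch of the flattening step is shown below. The function name `flatten` is hypothetical, and flattening nested arrays at every depth is an assumption; the example above only demonstrates one level of nesting.

```python
from typing import Any, List


def flatten(values: List[Any]) -> List[Any]:
    """Flatten nested lists and drop nulls (None), preserving order."""
    flat: List[Any] = []
    for value in values:
        if value is None:
            continue  # nulls are removed
        if isinstance(value, list):
            flat.extend(flatten(value))  # nested arrays are spliced in place
        else:
            flat.append(value)
    return flat


print(flatten([[1, 2, None, 3], None, [4, 5, 6, None]]))  # [1, 2, 3, 4, 5, 6]
```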

String concatenation

String concatenation can be done with the FHIRPath + operator, e.g.

{
    "url": "{{ 'Condition?patient=' + %patientId }}"
}

or using Liquid-style interpolation:

{
    "url": "Condition?patient={{ %patientId }}"
}

Caveats

Please note that string concatenation is executed according to FHIRPath rules. If one of the variables resolves to an empty result, the entire expression resolves to an empty result.

For empty %patientId:

{
    "url": "Condition?patient={{ %patientId }}"
}

will be transformed into:

{}

and using null key retention syntax:

{
    "url": "Condition?patient={{+ %patientId +}}"
}

will be transformed into:

{
    "url": null
}
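
The caveat can be sketched as follows for the plain {{ }} form. This is an illustration under stated assumptions, not the reference implementation: the `variables` dictionary stands in for FHIRPath evaluation, and the names `interpolate` and `EMPTY` are hypothetical. The caller is expected to drop the key (or emit null with the retention syntax) when `EMPTY` is returned.

```python
import re
from typing import Any, Dict

EMBEDDED = re.compile(r"\{\{\s*(.*?)\s*\}\}")
EMPTY = object()  # sentinel: the whole string resolved to an empty result


def interpolate(template: str, variables: Dict[str, Any]) -> Any:
    """Substitute embedded expressions; one empty value empties the whole string."""
    parts = []
    last_end = 0
    for match in EMBEDDED.finditer(template):
        value = variables.get(match.group(1))  # stand-in for FHIRPath evaluation
        if value is None:
            return EMPTY  # a single empty expression empties the entire result
        parts.append(template[last_end:match.start()])
        parts.append(str(value))
        last_end = match.end()
    parts.append(template[last_end:])
    return "".join(parts)


print(interpolate("Condition?patient={{ %patientId }}", {"%patientId": "pt-1"}))
print(interpolate("Condition?patient={{ %patientId }}", {}) is EMPTY)
```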

Scoped constant variables

A special construction allows defining custom constant variables for the FHIRPath context of underlying expressions:

{
    "{% assign %}": [
        {
            "varA": 1
        },
        {
            "varB": "{{ %varA + 1 }}"
        }
    ]
}

Note that %varA is accessed using the percent sign. It means that %varA is from the context. The order in the array is important. The context variables can be accessed only in the underlying expressions, including nested arrays/objects. For example:

{
    "{% assign %}": [
        {
            "birthDate": "{{ QuestionnaireResponse.repeat(item).where(linkId='2').answer.value }}" 
        }
    ],
    "resourceType": "Bundle",
    "entry": [
        {
            "resource": {
                "resourceType": "Patient",
                "birthDate": "{{ %birthDate }}"
            }
        }
    ]
}

will be transformed into:

{
    "resourceType": "Bundle",
    "entry": [
        {
            "resource": {
                "resourceType": "Patient",
                "birthDate": "2023-05-03"
            }
        }
    ]
}
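
The ordering and scoping rules can be sketched like this. The names `build_scope` and `toy_evaluate` are hypothetical, and `toy_evaluate` is a deliberately tiny stand-in for a FHIRPath engine that understands a single expression.

```python
from typing import Any, Callable, Dict, List


def build_scope(
    assign_entries: List[Dict[str, Any]],
    parent_scope: Dict[str, Any],
    evaluate: Callable[[str, Dict[str, Any]], Any],
) -> Dict[str, Any]:
    """Return a child scope extended with the variables from an assign block."""
    scope = dict(parent_scope)  # copy, so the parent scope stays untouched
    for entry in assign_entries:  # array order matters: later entries see earlier ones
        for name, value in entry.items():
            if isinstance(value, str) and value.startswith("{{") and value.endswith("}}"):
                value = evaluate(value[2:-2].strip(), scope)
            scope[name] = value
    return scope


def toy_evaluate(expression: str, scope: Dict[str, Any]) -> Any:
    """Stand-in for a FHIRPath engine: understands only '%varA + 1'."""
    if expression == "%varA + 1":
        return scope["varA"] + 1
    raise NotImplementedError(expression)


parent: Dict[str, Any] = {}
child = build_scope([{"varA": 1}, {"varB": "{{ %varA + 1 }}"}], parent, toy_evaluate)
print(child)   # {'varA': 1, 'varB': 2}
print(parent)  # {}
```

Copying the parent scope is one way to keep the variables visible only to the object that declares them and to its nested values.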

Conditional logic

FHIRPath provides conditional logic for primitive values like booleans, strings, and numbers using the iif function. However, there are scenarios where conditional logic needs to be applied to map values to complex structures, such as JSON objects.

For these cases, a special construction is available in the FHIRPath mapping language:

{
    "{% if expression %}": {
        "key": "value true"
    },
    "{% else %}": {
        "key": "value false"
    }
}

where expression is a FHIRPath expression that is evaluated in the same way as the first argument of the iif function.

For example:

{
    "resourceType": "Patient",
    "address": {
        "{% if QuestionnaireResponse.repeat(item).where(linkId='country').answer.exists() %}": {
            "type": "physical",
            "country": "{{ QuestionnaireResponse.repeat(item).where(linkId='country').answer.value }}"
        }
    }
}

will be mapped into:

{
    "resourceType": "Patient",
    "address": {
        "type": "physical",
        "country": "US"
    }
}

Implicit merge

When if/else blocks return JSON objects, the result is implicitly merged into the enclosing object, for example:

{
    "resourceType": "Patient",
    "address": {
        "type": "physical",
        "{% if QuestionnaireResponse.repeat(item).where(linkId='country').answer.exists() %}": {
            "country": "{{ QuestionnaireResponse.repeat(item).where(linkId='country').answer.value }}"
        },
        "{% else %}": {
            "text": "Unknown"
        }
    }
}

The final result will be either

{
    "resourceType": "Patient",
    "address": {
        "type": "physical",
        "country": "US"
    }
}

or

{
    "resourceType": "Patient",
    "address": {
        "type": "physical",
        "text": "Unknown"
    }
}

In this example, the Patient address contains the original {"type": "physical"} object, and country or text is implicitly merged in based on the condition.
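
A minimal sketch of the if/else handling with implicit merge follows. It is not the reference implementation: `resolve_conditional` is a hypothetical name, `is_true` stands in for evaluating the FHIRPath condition, the branch bodies are assumed to be already-resolved JSON objects, and the if key is assumed to appear before the else key.

```python
from typing import Any, Callable, Dict, Optional

IF_PREFIX = "{% if "
BLOCK_SUFFIX = "%}"


def resolve_conditional(node: Dict[str, Any], is_true: Callable[[str], bool]) -> Dict[str, Any]:
    """Merge the selected if/else branch into the remaining keys of the object."""
    result: Dict[str, Any] = {}
    condition_met: Optional[bool] = None
    for key, value in node.items():
        if key.startswith(IF_PREFIX) and key.endswith(BLOCK_SUFFIX):
            expression = key[len(IF_PREFIX):-len(BLOCK_SUFFIX)].strip()
            condition_met = is_true(expression)
            if condition_met:
                result.update(value)  # implicit merge of the "if" branch
        elif key.strip() == "{% else %}":
            if condition_met is False:
                result.update(value)  # implicit merge of the "else" branch
        else:
            result[key] = value  # ordinary keys are kept as they are
    return result


address = {
    "type": "physical",
    "{% if has_country %}": {"country": "US"},
    "{% else %}": {"text": "Unknown"},
}
print(resolve_conditional(address, lambda expression: True))
print(resolve_conditional(address, lambda expression: False))
```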

Iteration logic

To iterate over an array of values, there is a special construction:

{
    "{% for item in QuestionnaireResponse.item %}": {
        "linkId": "{{ %item.linkId }}"
    }
}

that will be transformed into:

[
    { "linkId": "1" },
    { "linkId": "2" },
    { "linkId": "4.1" },
    { "linkId": "phone" },
    { "linkId": "email" },
    { "linkId": "country" }
]

Using index

{
    "{% for index, item in QuestionnaireResponse.item %}": {
        "index": "{{ %index }}",
        "linkId": "{{ %item.linkId }}"
    }
}

that will be transformed into:

[
    { "index": 0, "linkId": "1" },
    { "index": 1, "linkId": "2" },
    { "index": 2, "linkId": "4.1" },
    { "index": 3, "linkId": "phone" },
    { "index": 4, "linkId": "email" },
    { "index": 5, "linkId": "country" }
]
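
The iteration construct, including the optional index, can be sketched as below. The regular expression and the names `FOR_KEY`, `expand_for`, `toy_evaluate`, and `toy_render` are hypothetical; `toy_evaluate` and `toy_render` are stand-ins for FHIRPath evaluation and for resolving the loop body against a scope.

```python
import re
from typing import Any, Callable, Dict, List

# Matches "{% for item in <expr> %}" and "{% for index, item in <expr> %}".
FOR_KEY = re.compile(r"^\{%\s*for\s+(?:(\w+)\s*,\s*)?(\w+)\s+in\s+(.+?)\s*%\}$")


def expand_for(
    key: str,
    body: Dict[str, Any],
    evaluate: Callable[[str], List[Any]],
    render: Callable[[Dict[str, Any], Dict[str, Any]], Any],
) -> List[Any]:
    """Render `body` once per element, exposing the item (and index) as variables."""
    match = FOR_KEY.match(key)
    if match is None:
        raise ValueError(f"not a for block: {key!r}")
    index_name, item_name, expression = match.groups()
    results = []
    for index, item in enumerate(evaluate(expression)):  # indexes start at 0
        scope = {item_name: item}
        if index_name is not None:
            scope[index_name] = index
        results.append(render(body, scope))
    return results


items = [{"linkId": "1"}, {"linkId": "2"}]


def toy_evaluate(expression: str) -> List[Any]:
    """Stand-in for FHIRPath: always returns the sample items."""
    return items


def toy_render(body: Dict[str, Any], scope: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for template resolution: knows only the two expressions used here."""
    lookup = {"{{ %index }}": scope.get("index"), "{{ %item.linkId }}": scope["item"]["linkId"]}
    return {name: lookup[value] for name, value in body.items()}


rows = expand_for(
    "{% for index, item in QuestionnaireResponse.item %}",
    {"index": "{{ %index }}", "linkId": "{{ %item.linkId }}"},
    toy_evaluate,
    toy_render,
)
print(rows)  # [{'index': 0, 'linkId': '1'}, {'index': 1, 'linkId': '2'}]
```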

Merge logic

To merge two or more objects, there is a special construction:

{
    "{% merge %}": [
        {
            "a": 1
        },
        {
            "b": 2
        } 
    ]
}

that will be transformed into:

{
    "a": 1,
    "b": 2
}
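
A minimal sketch of the merge step is shown below. The name `merge_objects` is hypothetical. Two behaviors are assumptions that the text above does not specify: the merge is shallow, and on a key conflict the later object wins.

```python
from typing import Any, Dict, List


def merge_objects(objects: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Shallow-merge objects left to right (assumption: later keys override earlier ones)."""
    merged: Dict[str, Any] = {}
    for obj in objects:
        merged.update(obj)
    return merged


print(merge_objects([{"a": 1}, {"b": 2}]))  # {'a': 1, 'b': 2}
```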

Examples

See real-life examples of mappers for FHIR and Aidbox, and other usage in the unit tests.

Reference implementation

A TypeScript implementation that supports the full specification is available in this repository. It is also packaged as a Docker image so that it can be used as a microservice.

Usage

POST /r4/parse-template

{
    "context": {
        "QuestionnaireResponse": {
            "resourceType": "QuestionnaireResponse",
            "id": "foo",
            "authored": "2024-01-01T10:00:00Z"
        }
    },
    "template": { 
        "id": "{{ id }}",
        "authored": "{{ authored }}",
        "status": "completed"
    }
}

Strict mode

FHIRPath allows accessing resource variables without the percent sign. This can lead to issues caused by typos in variable names.

There is a runtime flag called strict, which is set to false by default. If it is set to true, any access to a variable without the percent sign is rejected and an exception is thrown.

With strict mode enabled, the previous example should be rewritten as

POST /r4/parse-template

{
    "context": {
        "QuestionnaireResponse": {
            "resourceType": "QuestionnaireResponse",
            "id": "foo",
            "authored": "2024-01-01T10:00:00Z"
        }
    },
    "template": { 
        "id": "{{ %QuestionnaireResponse.id }}",
        "authored": "{{ %QuestionnaireResponse.authored }}",
        "status": "completed"
    },
    "strict": true
}
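
For illustration, the request above could be prepared from Python with the standard library as sketched below. The base URL `http://localhost:8080` and the `Content-Type` header are assumptions, since the address of the service depends on how the Docker image is deployed. The helper name `build_parse_template_request` is hypothetical, and the request is only constructed here, not sent.

```python
import json
import urllib.request
from typing import Any, Dict


def build_parse_template_request(
    base_url: str,
    context: Dict[str, Any],
    template: Dict[str, Any],
    strict: bool = False,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for /r4/parse-template."""
    payload = {"context": context, "template": template, "strict": strict}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/r4/parse-template",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed content type
        method="POST",
    )


request = build_parse_template_request(
    "http://localhost:8080",  # hypothetical address of the running service
    context={
        "QuestionnaireResponse": {
            "resourceType": "QuestionnaireResponse",
            "id": "foo",
            "authored": "2024-01-01T10:00:00Z",
        }
    },
    template={
        "id": "{{ %QuestionnaireResponse.id }}",
        "authored": "{{ %QuestionnaireResponse.authored }}",
        "status": "completed",
    },
    strict=True,
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```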