aehrc / pathling

Tools that make it easier to use FHIR® and clinical terminology within data analytics, built on Apache Spark.
https://pathling.csiro.au
Apache License 2.0

Preferred language configuration parameter #1191

Closed: johngrimes closed this issue 1 year ago

johngrimes commented 1 year ago

This change adds a configuration value (pathling.terminology.client.acceptLanguage) that controls the language preference communicated to the server using the Accept-Language header when making terminology requests.

It will also add an optional language parameter to the display and property FHIRPath functions, and the display and property_of UDFs. This will allow the user to override the configured value within the query, possibly querying multiple different languages within the same query.

See Multi-language support in FHIR.
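The override behaviour described above can be sketched in plain Python. This is a minimal illustration, not Pathling's implementation: `CONFIGURED_ACCEPT_LANGUAGE`, `resolve_accept_language` and `request_headers` are hypothetical names, and the sketch assumes that a language passed to a function call simply takes precedence over the configured value.

```python
from typing import Dict, Optional

# Hypothetical stand-in for the value configured through
# pathling.terminology.client.acceptLanguage.
CONFIGURED_ACCEPT_LANGUAGE: Optional[str] = "de"


def resolve_accept_language(
    override: Optional[str] = None,
    configured: Optional[str] = CONFIGURED_ACCEPT_LANGUAGE,
) -> Optional[str]:
    """Pick the language for one terminology request.

    Assumption: a per-call override wins over the configured default.
    """
    return override if override is not None else configured


def request_headers(
    override: Optional[str] = None,
    configured: Optional[str] = CONFIGURED_ACCEPT_LANGUAGE,
) -> Dict[str, str]:
    """Build the headers for one request, omitting Accept-Language if unset."""
    language = resolve_accept_language(override, configured)
    return {"Accept-Language": language} if language else {}


# Several languages within the same "query": one header set per invocation.
headers_per_call = [request_headers(lang) for lang in (None, "fr", "ja")]
```

Under these assumptions, a call without an override falls back to the configured language, and each call that names a language gets its own header value.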

Markopolo141 commented 1 year ago

Pushed to branch 'issues/1191'.

Change: The accept_language option was added to the Python context.py and also to HttpClientConfiguration (which I felt was the more appropriate place). The code was tested in Python and Java, and the Accept-Language headers of the HTTP requests were inspected using 'HTTPtools'.

Justification: Once the Apache HttpClient is created, it is passed into hapi-fhir's TerminologyClient, where it is used entirely through that client's own methods. Accessing it afterwards (and therefore presumably changing it) is potentially dangerous, as the only accessor is:

IRestfulClient.java in the HAPI FHIR repository has:

// Do not call this method in client code. It is a part of the internal HAPI API and is subject to change!
IHttpClient getHttpClient();

In this way, it makes sense to have accept_language only as a startup config option, not as something that can be changed dynamically.
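As a rough analogy for the reasoning above, the following Python sketch builds an HTTP client whose default headers are decided once, at creation, and then hands it to a consumer that keeps it private. It uses the standard library's `urllib.request` rather than Apache HttpClient or HAPI FHIR. `build_http_client` and `TerminologyClientSketch` are hypothetical names, and the sketch assumes the language is applied as a default header when the client is built.

```python
import urllib.request
from typing import Dict, Optional


def build_http_client(
    accept_language: Optional[str] = None,
) -> urllib.request.OpenerDirector:
    """Build an opener whose default headers are fixed at creation time."""
    opener = urllib.request.build_opener()
    if accept_language:
        # Default headers are sent with every request made through this opener.
        opener.addheaders = [*opener.addheaders, ("Accept-Language", accept_language)]
    return opener


class TerminologyClientSketch:
    """Hypothetical consumer that receives a ready-made HTTP client."""

    def __init__(self, http_client: urllib.request.OpenerDirector) -> None:
        # Kept private: callers are not meant to reach in and reconfigure it,
        # mirroring the warning on getHttpClient() quoted above.
        self._http_client = http_client

    def default_headers(self) -> Dict[str, str]:
        """Expose a read-only copy of the headers, for inspection only."""
        return dict(self._http_client.addheaders)


client = TerminologyClientSketch(build_http_client("de"))
```

Because the consumer offers no supported way to swap or reconfigure the underlying client, the language chosen at construction stays in effect for the lifetime of the client, which is why a startup-only option is the natural fit here.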

Other changes

Potential future changes

Java testing code

import au.csiro.pathling.sql.Terminology;
import static au.csiro.pathling.library.TerminologyHelpers.*;
import au.csiro.pathling.library.PathlingContext;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.hl7.fhir.r4.model.Coding;
import au.csiro.pathling.config.TerminologyConfiguration;
import au.csiro.pathling.config.HttpClientConfiguration;
import au.csiro.pathling.config.TerminologyAuthConfiguration;

class MyApp {
    public static void main(String[] args) {
        SparkSession spark = SparkSession
          .builder()
          .config("spark.master", "local")
          .getOrCreate();

        TerminologyConfiguration configuration = TerminologyConfiguration.builder()
            .client(HttpClientConfiguration.builder()
                .acceptLanguage("de")
                .build()
            ).build();

        PathlingContext pc = PathlingContext.create(spark, configuration);
        Dataset<Row> csv = pc.getSpark().read().option("header", "true").csv("conditions.csv");

        // Get the synonyms for each code in the dataset.
        Dataset<Row> synonyms = csv.withColumn(
                "SYNONYMS",
                Terminology.designation(toSnomedCoding(csv.col("CODE")),
                        new Coding("http://snomed.info/sct",
                                "900000000000013009", null))
        );
        // Split each synonym into a separate row.
        Dataset<Row> exploded_synonyms = synonyms.selectExpr(
                "CODE", "DESCRIPTION", "explode_outer(SYNONYMS) AS SYNONYM"
        );
        exploded_synonyms.show();
    }
}

Python testing code

from pathling import PathlingContext, to_snomed_coding, Coding, designation

pc = PathlingContext.create(accept_language="devil")
csv = pc.spark.read.option("header","true").csv("conditions.csv")

# Get the synonyms for each code in the dataset.
synonyms = csv.withColumn(
    "SYNONYMS",
    designation(to_snomed_coding(csv.CODE), Coding.of_snomed("900000000000013009")),
)
# Split each synonym into a separate row.
exploded_synonyms = synonyms.selectExpr(
    "CODE", "DESCRIPTION", "explode_outer(SYNONYMS) AS SYNONYM"
)
exploded_synonyms.show()

johngrimes commented 1 year ago

Thanks @Markopolo141, I'll take a look at this tomorrow!

Markopolo141 commented 1 year ago

I accidentally moved this issue to backlog. :-(

johngrimes commented 1 year ago

"In this way, it makes sense to have accept_language only as a startup config option, not as something that can be changed dynamically."

I think this assumption is fine.

johngrimes commented 1 year ago

This work looks great so far! The configuration parameter for controlling Accept-Language is working well.

I would like to see if we can take this a bit further now. I've added some extra requirements to this issue, @Markopolo141 take a look and let me know what you think. It would be great if we could now enhance the functions themselves to allow for the language to be overridden within an invocation.