google / fhir-gateway

A generic proxy server for applying access-control policies for a FHIR-store.

Allow mutation of the RequestContent through AccessDecision. #258

Closed vbothe23 closed 3 months ago

vbothe23 commented 5 months ago

The FHIR gateway currently allows an AccessDecision to mutate only the query parameters of a request. We have a requirement to modify the request content as well, for de-identification purposes.

Specific scenario: we have various resources, such as Encounter, Observation, and Condition, that reference a patient. These resources carry the patient's name in the subject element (subject.display), along with a reference to the patient.

Consider the following Encounter resource as an example:

{
    "resourceType": "Encounter",
    "id": "4457",
    "status": "finished",
    "class": {
        "system": "http://terminology.hl7.org/CodeSystem/v3-ActCode",
        "code": "AMB",
        "display": "ambulatory"
    },
    "subject": {
        "reference": "Patient/11232",
        "display": "John"
    },
    "period": {
        "start": "2023-06-14T07:00:00+07:00",
        "end": "2023-06-14T09:00:00+07:00"
    }
}

Before uploading these resources to the server, we need to remove the patient's identifying information from the resource (i.e., Encounter.subject.display).
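To make the requested transformation concrete, here is a minimal sketch in Python of removing subject.display from a resource before upload. The function name `strip_subject_display` is hypothetical and is not part of the gateway; it only illustrates the mutation being asked for, and it assumes the resource is available as parsed JSON.

```python
import copy


def strip_subject_display(resource: dict) -> dict:
    """Return a copy of a FHIR resource with subject.display removed.

    Hypothetical helper for illustration; not part of fhir-gateway.
    """
    cleaned = copy.deepcopy(resource)  # leave the caller's resource untouched
    subject = cleaned.get("subject")
    if isinstance(subject, dict):
        # Keep the reference, drop only the human-readable patient name.
        subject.pop("display", None)
    return cleaned


encounter = {
    "resourceType": "Encounter",
    "id": "4457",
    "status": "finished",
    "subject": {"reference": "Patient/11232", "display": "John"},
}

deidentified = strip_subject_display(encounter)
```

The reference to Patient/11232 is preserved, so the Encounter can still be linked to its patient after the name is removed.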

bashir2 commented 4 months ago

Thanks @vbothe23 for filing this issue and also discussing it in the developer call. Can you please add some clarification on the design/architecture that you have, similar to what you described in the dev. call?

One concern that I have is that our thinking in designing the fhir-gateway did not include the scenario you are trying to address. IOW, the FHIR-server was always the trusted authority and we did not need to hide anything from it. That said, I have thought a little bit about your proposed change and I have not found any immediate issues yet.

Also just to make sure you have considered this, since I think you mentioned this is for an analytics use-case: In your design/architecture, is it possible to use fhir-data-pipes to read data from the FHIR server? In that case, you can do the anonymization during the pipeline run (still inside the gateway, but in the reverse direction).

bashir2 commented 4 months ago

@vbothe23 a reminder to please add two pieces of information: (1) the architecture/design you have that requires this and (2) whether fhir-data-pipes can solve your problem without this feature in the gateway.

In particular, regarding fhir-data-pipes, please note that we now have support for ViewDefinition, i.e., you can define a flat view for a FHIR resource based on FHIRPaths and then materialize that view in a PostgreSQL table. IIRC this is close to one of the analytics architectures you had. Some example views are here.

vbothe23 commented 4 months ago

Hi @bashir2, sorry for the delay in responding. I have not tested fhir-data-pipes yet, and currently we do not need FHIRPath for this task. However, I have tested the initial approach that you mentioned earlier, which involves using postProcess, and we have decided to continue with that approach.


We place the FHIR gateway between the FHIR server and the client, and process the FHIR response in the postProcess() method of AccessDecision. In this architecture, the ETL pipeline is the client that requests FHIR data from the server.
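The response-side step can be sketched as follows. This is a language-neutral illustration in Python, not the gateway's Java API: the function name `deidentify_response_body` is hypothetical, and it assumes the response body is FHIR JSON that is either a single resource or a Bundle (as returned by a search).

```python
import json


def _drop_subject_display(resource: dict) -> None:
    """Remove subject.display in place, if the resource has one."""
    subject = resource.get("subject")
    if isinstance(subject, dict):
        subject.pop("display", None)


def deidentify_response_body(body: str) -> str:
    """Strip subject.display from a FHIR JSON response body.

    Hypothetical stand-in for the logic a postProcess() implementation
    could apply to the server's response before returning it to the client.
    """
    payload = json.loads(body)
    if payload.get("resourceType") == "Bundle":
        # Search responses wrap each resource in entry[].resource.
        for entry in payload.get("entry", []):
            resource = entry.get("resource")
            if isinstance(resource, dict):
                _drop_subject_display(resource)
    else:
        _drop_subject_display(payload)
    return json.dumps(payload)


bundle = json.dumps({
    "resourceType": "Bundle",
    "type": "searchset",
    "entry": [
        {"resource": {"resourceType": "Encounter", "id": "4457",
                      "subject": {"reference": "Patient/11232", "display": "John"}}},
        {"resource": {"resourceType": "Patient", "id": "11232"}},
    ],
})

cleaned_bundle = json.loads(deidentify_response_body(bundle))
```

Because only the outgoing copy is rewritten, the data stored on the FHIR server is not changed by this step.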

This approach solves our problem without making changes to the gateway component.

fredhersch commented 3 months ago

Thanks @vbothe23 and team, just adding some thoughts here.

I do have some concerns about this in terms of it being a "core feature" of the FHIR Info Gateway. Mutating data that is "going into" a FHIR source could open up issues of unintended (or intentional) data misuse.

If you are going to add this as a feature in your AccessChecker Plugin, I think it needs to be clear where the "data source of truth" is, and it would potentially make sense to add audit tracking of the "events" that lead to data being changed.

We have a feature on the roadmap to add support for the "AuditEvents" Resource that would be a good way to surface this for an administrator, who can then review and make sure that no suspicious activity is happening.

In terms of alternatives, if the goal is to have redacted data for reporting or analytics, then the FHIR Data Pipes would be a better option - as @bashir2 has pointed out, using the ViewDefinition Resources, you could define the Views in FHIRPath and ensure that the identifiers were not in the materialized views.
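As a rough illustration of that alternative, the sketch below builds a ViewDefinition for Encounter as a Python dict and serializes it to JSON. The view name and column names are hypothetical, and whether fhir-data-pipes accepts every path used here (for example `getReferenceKey(Patient)`) is an assumption to verify against its example views. The point is only that subject.display is never selected, so the patient's name cannot reach the materialized table.

```python
import json

# Hypothetical ViewDefinition: selects the patient reference key but not
# subject.display, so the flat view contains no patient name.
encounter_view = {
    "resourceType": "ViewDefinition",
    "name": "encounter_deidentified",
    "status": "draft",
    "resource": "Encounter",
    "select": [
        {
            "column": [
                {"name": "id", "path": "getResourceKey()"},
                {"name": "status", "path": "status"},
                {"name": "patient_id", "path": "subject.getReferenceKey(Patient)"},
                {"name": "period_start", "path": "period.start"},
                {"name": "period_end", "path": "period.end"},
            ]
        }
    ],
}

# Collect every FHIRPath used by the view so it can be reviewed or checked.
column_paths = [
    column["path"]
    for select in encounter_view["select"]
    for column in select["column"]
]

view_json = json.dumps(encounter_view, indent=2)
```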

Let us know what approach you decide to take.

vbothe23 commented 3 months ago

Thanks @fredhersch for the update. We will explore fhir-data-pipes with ViewDefinition and FHIRPath. For now, we will continue with the postProcess approach in the FHIR gateway that @bashir2 suggested earlier: we will mutate the response from the FHIR server before forwarding it to the client.

This way, we are not mutating the data that goes into the FHIR server, and the FHIR server remains the data source of truth.

This approach solves our problem without making changes to the core gateway component.