opensafely-core / ehrql

ehrQL: the electronic health record query language for OpenSAFELY
https://docs.opensafely.org/ehrql/

Implement Index of Multiple Deprivation #96

Closed rebkwok closed 3 years ago

rebkwok commented 3 years ago

IMD queries in the old cohort extractor are calculated by finding each patient's addresses registered on the index_date, and then ordering those addresses by:

ORDER BY
    StartDate DESC,
    EndDate DESC,
    IIF(MSOACode = 'NPC', 1, 0),
    PatientAddress_ID

This orders by most recent start date, then latest end date, then prefers addresses which are not marked "NPC" (for "No Postcode") and finally uses the address ID as a tie-breaker.
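As a rough illustration of that ordering, here is a minimal Python sketch. It assumes the first row of the ordered result is the one selected, and it uses hypothetical in-memory rows whose keys mirror the SQL column names; none of this is taken from the cohort extractor's actual code.

```python
from datetime import date

# Hypothetical sample rows; keys mirror the SQL columns, values are made up.
ADDRESSES = [
    {"PatientAddress_ID": 1, "StartDate": date(2019, 1, 1),
     "EndDate": date(2020, 1, 1), "MSOACode": "E02000001"},
    {"PatientAddress_ID": 2, "StartDate": date(2020, 6, 1),
     "EndDate": date(9999, 12, 31), "MSOACode": "NPC"},
    {"PatientAddress_ID": 3, "StartDate": date(2020, 6, 1),
     "EndDate": date(9999, 12, 31), "MSOACode": "E02000002"},
    {"PatientAddress_ID": 4, "StartDate": date(2020, 6, 1),
     "EndDate": date(9999, 12, 31), "MSOACode": "E02000003"},
]


def sort_key(addr):
    """Key whose ascending order matches the ORDER BY clause."""
    return (
        -addr["StartDate"].toordinal(),          # StartDate DESC
        -addr["EndDate"].toordinal(),            # EndDate DESC
        1 if addr["MSOACode"] == "NPC" else 0,   # non-"NPC" addresses first
        addr["PatientAddress_ID"],               # tie-breaker, ascending
    )


def pick_address(addresses):
    """Return the first address in the ordering, or None if there are none."""
    return min(addresses, key=sort_key) if addresses else None


chosen = pick_address(ADDRESSES)
```

With this sample data, addresses 2, 3 and 4 tie on both dates, address 2 is demoted because it is marked "NPC", and the lower ID then wins between 3 and 4.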

The IMD query in the old cohort extractor could almost be implemented as a last_by method in the new version, like:

imd_value = (
    table("patient_address")
    .date_in_range(index_date)
    .last_by("date_start", "date_end", "has_postcode", "patientaddress_id")
    .get("index_of_multiple_deprivation")
)

Except that has_postcode is not a field; it is derived from the IIF expression on MSOACode.

One option is for a Row to accept an expression as a column (with the relevant DSL wrapping for defining the expression). Even so, we probably still want a custom method of some sort so that researchers don't have to type the column ordering explicitly.

rebkwok commented 3 years ago

The simplest MVP solution looks to be adding a custom has_postcode field to the patient_address table, which we can then use as a sort column.
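A minimal sketch of that idea in plain Python, not ehrQL itself. The `add_has_postcode` and `last_by` helpers, the `msoa_code` key, and the sample values are all hypothetical; `last_by` is assumed to sort ascending and take the final row.

```python
from datetime import date


def add_has_postcode(row):
    """Return a copy of an address row with a derived has_postcode flag."""
    # Inverse of IIF(MSOACode = 'NPC', 1, 0): 1 means the address has a postcode.
    return {**row, "has_postcode": 0 if row["msoa_code"] == "NPC" else 1}


def last_by(rows, *columns):
    """Sort ascending by the given columns and return the last row (or None)."""
    ordered = sorted(rows, key=lambda r: tuple(r[c] for c in columns))
    return ordered[-1] if ordered else None


# Hypothetical sample rows; the IMD values are made up for illustration.
rows = [
    {"patientaddress_id": 10, "date_start": date(2020, 6, 1),
     "date_end": date(9999, 12, 31), "msoa_code": "NPC",
     "index_of_multiple_deprivation": 0},
    {"patientaddress_id": 11, "date_start": date(2020, 6, 1),
     "date_end": date(9999, 12, 31), "msoa_code": "E02000001",
     "index_of_multiple_deprivation": 1200},
    {"patientaddress_id": 12, "date_start": date(2018, 1, 1),
     "date_end": date(2020, 5, 31), "msoa_code": "E02000002",
     "index_of_multiple_deprivation": 300},
]

chosen = last_by(
    [add_has_postcode(r) for r in rows],
    "date_start", "date_end", "has_postcode", "patientaddress_id",
)
imd_value = chosen["index_of_multiple_deprivation"]
```

One thing to watch under these assumptions: taking the last row of an ascending sort picks the highest patientaddress_id when everything else ties, whereas the old SQL picks the lowest, so the tie-breaker may need inverting if exact parity matters.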