New data: waiting list (WIP)

HelenCEBM commented 2 years ago

Background: Document

Schema: Schema

Recording and reporting guidelines: Link

Data flow/structure & complications

There are two main datasets:

"Open pathways" dataset - patients on waiting list (also includes patients who are no longer waiting e.g. procedure no longer required)
"Clockstops" dataset - completed wait on waiting list (had procedure)

At a given time point a patient may be:

not on waiting list;
on waiting list;
no longer on waiting list (had procedure);
no longer on waiting list (cancelled);
(or a combination of several statuses for different pathways/procedures).

Data issues to consider

Issue	"Open pathways" dataset	"Clockstops" dataset
Different pathways/procedures for same patient	Patients may be waiting for more than one procedure at any one time. We will need each of these to be available for analysis	Patients appear for each completed pathway (and may have same procedure more than once e.g. cataract). We will need each of these to be available for analysis
Duplicates over time	Patients appear each week they are on the waiting list. We probably generally only need the latest for each patient-pathway combination within the period of interest (however, useful to look back at how their pathway has changed)	(Patients should appear only once per completed pathway)
Change over time	Information may change while patients are on waiting list, e.g. change of planned procedure, cancellation (procedure no longer required) etc	N/A
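To make the "Duplicates over time" row concrete, here is a minimal sketch of keeping only the latest weekly record for each patient-pathway combination within a period. It is plain Python over made-up rows, not a proposal for the backend implementation; the field names `patient_id`, `pathway_id`, `week_ending` and `procedure`, and the procedure values, are placeholders rather than real schema columns.

```python
from datetime import date

# Hypothetical weekly snapshot rows; all field names and values are placeholders.
records = [
    {"patient_id": 1, "pathway_id": "A", "week_ending": date(2022, 2, 20), "procedure": "X1"},
    {"patient_id": 1, "pathway_id": "A", "week_ending": date(2022, 2, 27), "procedure": "X2"},
    {"patient_id": 1, "pathway_id": "B", "week_ending": date(2022, 2, 27), "procedure": "X3"},
    {"patient_id": 2, "pathway_id": "A", "week_ending": date(2022, 2, 13), "procedure": "X1"},
]


def latest_per_pathway(rows, period_start, period_end):
    """Keep the most recent weekly record per (patient, pathway) within the period."""
    latest = {}
    for row in rows:
        if not (period_start <= row["week_ending"] <= period_end):
            continue
        key = (row["patient_id"], row["pathway_id"])
        if key not in latest or row["week_ending"] > latest[key]["week_ending"]:
            latest[key] = row
    return latest


latest = latest_per_pathway(records, date(2022, 2, 1), date(2022, 2, 28))
```

Patient 1 keeps two rows here (one per pathway), which is the "multiple pathways per patient" case that would still need reducing to one wait per patient for cohortextractor.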

Potential research/monitoring questions:

for completed waits* each year/month, how long were patients waiting? (+ what are their features etc)
for patients on waiting list* as of this month and previous months, how many have been waiting >X months? (+ what are their features etc)
what other care/treatments do patients receive while on waiting list?

*Note: in cohortextractor we can only include one wait per patient in each period (but usually we will filter to a certain specialty, set of procedures or urgency, so duplicates should be minimised)

Data elements required

how long have they been / were they waiting (there are several possible start dates)
what procedure (OPCS-4) is planned/was done
type of pathway
- ORTT - Current RTT Non-admitted (patients whose RTT pathways ended for reasons other than admission for treatment)
- IRTT - Current RTT Admitted
- ONON - Not current RTT Non-admitted (patients whose RTT pathways ended for reasons other than admission for treatment)
- INON - Not current RTT Admitted
- Potentially we could initially limit to the subset of patients with IRTT Current RTT Admitted. It may later be useful to consider patients who've been removed from the list / are in a monitoring period etc.
- Status - provides further breakdown of pathway type, may not be required initially
specialty
source of referral (GP or not)
urgency
- 1 Routine
- 2 Urgent
- 3 Two Week Wait
procedure/diagnostic priority?
on cancer PTL
- "The 'Cancer 62 Day Patient Tracking List' (CANPTL) collection is a weekly snapshot which shows the number of patients on the cancer 62-day pathway, who are at risk of breaching the 62-day standards."

Returning options

Note: may need to minimise options due to duplicate records per patient.

date joined waiting list
date left waiting list (?latest submission week in which they were present in the data, or date at which their status changed to non-admitted/non-RTT?)
binary_flag
urgency
procedure

Filters

between/on_or_before/on_or_after -
- how will this work?
- use submission week for patients on a waitlist as of given week?
type of pathway
status
urgency
on cancer PTL

Date filters/calculations

Questions

Look up more about procedure/diagnostic priority
Do we need an intermediate table for Open Pathways? I.e. Data as of latest week patient was present, and any key fields that changed throughout the time this pathway ID has been on waiting list?
Plausibility checking (OpenPathways)
- Will need to select a subset of OpenPathways data to analyse due to its size - last X months by recording week won't show complete journeys but may be enough to indicate how often things change... By Specialty? Journeys may differ by specialty... One ethnic category??
- How many records per patient/pathway ID
- How often does Waiting_List_Type, ACTIVITY_TREATMENT_FUNCTION_CODE, PRIORITY_TYPE_CODE change within a single pathway?
- How does REFERRAL_REQUEST_RECEIVED_DATE vary with respect to REFERRAL_TO_TREATMENT_PERIOD_START_DATE (for RTT pathways) and Current_Pathway_Period_Start_Date (for non-RTT pathways)?
- Is Outpatient_Appointment_Date well populated and useful? (How do these dates compare to others?)
- Is TCI_Date (Patient To Come In date) well populated and useful?
- What does a completed REFERRAL_TO_TREATMENT_PERIOD_END_DATE correspond to? (It should be blank for open pathways.) Is it completed when a pathway ends in surgery, or only if a pathway ends for another reason?
- Do ended pathways (e.g. cancellations) disappear after ending?
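As a rough illustration of the "how often does a field change within a single pathway" check, the sketch below counts value changes between consecutive weekly records. It assumes every row has a non-null Week_Ending_Date and uses hypothetical `patient_id` / `pathway_id` keys; the sample rows are invented.

```python
from collections import defaultdict
from datetime import date


def count_changes(rows, field):
    """Count, per (patient, pathway), how often `field` differs between consecutive weeks."""
    by_key = defaultdict(list)
    for row in rows:
        by_key[(row["patient_id"], row["pathway_id"])].append(row)

    changes = {}
    for key, group in by_key.items():
        # Order each pathway's records by submission week before comparing neighbours.
        ordered = sorted(group, key=lambda r: r["Week_Ending_Date"])
        values = [r[field] for r in ordered]
        changes[key] = sum(1 for a, b in zip(values, values[1:]) if a != b)
    return changes


# Invented example rows.
rows = [
    {"patient_id": 1, "pathway_id": "A", "Week_Ending_Date": date(2022, 2, 13), "PRIORITY_TYPE_CODE": "1"},
    {"patient_id": 1, "pathway_id": "A", "Week_Ending_Date": date(2022, 2, 27), "PRIORITY_TYPE_CODE": "2"},
    {"patient_id": 1, "pathway_id": "A", "Week_Ending_Date": date(2022, 2, 20), "PRIORITY_TYPE_CODE": "1"},
    {"patient_id": 2, "pathway_id": "B", "Week_Ending_Date": date(2022, 2, 27), "PRIORITY_TYPE_CODE": "1"},
]

changes = count_changes(rows, "PRIORITY_TYPE_CODE")
```

The same function could be pointed at Waiting_List_Type or ACTIVITY_TREATMENT_FUNCTION_CODE by changing the `field` argument.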

rebkwok commented 2 years ago

@HelenCEBM

type of pathway
- ORTT - Current RTT Non-admitted (patients whose RTT pathways ended for reasons other than admission for treatment)
- IRTT - Current RTT Admitted
- ONON - Not current RTT Non-admitted (patients whose RTT pathways ended for reasons other than admission for treatment)
- INON - Not current RTT Admitted

What is the difference between ORTT and ONON (the labels are different, but the descriptions in parentheses are the same)?

HelenCEBM commented 2 years ago

What is the difference between ORTT and ONON (the labels are different, but the descriptions in parentheses are the same)?

AIUI, non-admitted means the pathway ended for some reason other than having the procedure (e.g. no longer needs the procedure), while the RTT/non-RTT distinction relates to what kind of referral it is: RTT pathways are those that meet certain requirements (e.g. referral to a consultant-led service or a triage service), while non-RTT covers all others (e.g. referral to a non-consultant-led service).
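On that reading, each of the four codes carries two independent flags: admitted vs non-admitted, and RTT vs non-RTT. A small sketch of a lookup that makes this explicit; the mapping is taken directly from the four labels in the issue description, and the function name is made up.

```python
# (admitted, current_rtt) for each documented waiting list type.
WAITING_LIST_TYPES = {
    "ORTT": (False, True),   # Current RTT, non-admitted
    "IRTT": (True, True),    # Current RTT, admitted
    "ONON": (False, False),  # Not current RTT, non-admitted
    "INON": (True, False),   # Not current RTT, admitted
}


def decode_waiting_list_type(code):
    """Return (admitted, current_rtt), or None for a code outside the four documented values."""
    return WAITING_LIST_TYPES.get(code)
```

So ORTT and ONON agree on the first flag (non-admitted) and differ only on the second.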

HelenCEBM commented 2 years ago

Referral vs pathway IDs:

Referral ID automatically created by e-referral system
Pathway ID created by the hospital receiving the referral
- pathway IDs are only unique when combined with trust ID as hospitals may by chance create similar IDs for different patients
Every record must have both IDs
We expect there's usually a 1:1 ratio but:
- some referrals may generate multiple pathways e.g. if patient is "referred on" to a different hospital.
- We think that (depending on the hospital), if an RTT pathway becomes non-RTT, it may generate a new referral ID but may retain the same pathway ID (and should retain the same REFERRAL_REQUEST_RECEIVED_DATE).
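A quick sketch of why the trust ID needs to be part of the key: it compares how many pathway IDs map to more than one patient when used bare versus when combined with the trust. The column names `trust_id`, `pathway_id` and `patient_id` and the rows are hypothetical.

```python
from collections import defaultdict

# Invented rows: two trusts happen to issue the same pathway ID to different patients.
rows = [
    {"patient_id": 1, "trust_id": "T1", "pathway_id": "0001"},
    {"patient_id": 2, "trust_id": "T2", "pathway_id": "0001"},
    {"patient_id": 3, "trust_id": "T1", "pathway_id": "0002"},
]


def pathway_key(row):
    """Composite key: a pathway ID is only meaningful alongside the issuing trust."""
    return (row["trust_id"], row["pathway_id"])


patients_by_bare_id = defaultdict(set)
patients_by_key = defaultdict(set)
for row in rows:
    patients_by_bare_id[row["pathway_id"]].add(row["patient_id"])
    patients_by_key[pathway_key(row)].add(row["patient_id"])

# IDs (or keys) that point at more than one patient.
ambiguous_bare_ids = sorted(i for i, pts in patients_by_bare_id.items() if len(pts) > 1)
ambiguous_keys = sorted(k for k, pts in patients_by_key.items() if len(pts) > 1)
```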

Available DATES (Open Pathways):

Week_Ending_Date
REFERRAL_REQUEST_RECEIVED_DATE
DECISION_TO_ADMIT_DATE - "Decision to admit date (patient added to admitted waiting list)"
REFERRAL_TO_TREATMENT_PERIOD_START_DATE - RTT Start date - this should be present once the patient has seen a consultant and been listed to have a specific procedure. It will be removed if the pathway becomes non-RTT.
Current_Pathway_Period_Start_Date - equivalent to the previous field but for non-RTT pathways
REFERRAL_TO_TREATMENT_PERIOD_END_DATE - present if RTT period has ended (e.g. procedure cancelled)

Some date filter options (Open pathways):

Currently on waiting list AS OF a particular week - bit different to how OS queries normally work?
Has been on waiting list (present in the data) during a given date period
Was REFERRED during a given date period (potentially separate out referrals from RTT pathways?)
Started an RTT pathway during a given date period (i.e. with a procedure code listed)
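Two of these options ("present in the data during a period" and "referred during a period") can be sketched as simple inclusive date-range checks on different columns. This is plain Python over invented rows to pin down the semantics, not cohort-extractor code; `patient_id` is a placeholder key.

```python
from datetime import date


def in_period(value, start, end):
    """True when a date is present and falls within [start, end] inclusive."""
    return value is not None and start <= value <= end


def present_during(rows, start, end):
    """Patients with at least one weekly record whose Week_Ending_Date is in the period."""
    return {r["patient_id"] for r in rows if in_period(r["Week_Ending_Date"], start, end)}


def referred_during(rows, start, end):
    """Patients whose REFERRAL_REQUEST_RECEIVED_DATE is in the period."""
    return {r["patient_id"] for r in rows
            if in_period(r["REFERRAL_REQUEST_RECEIVED_DATE"], start, end)}


# Invented example rows.
rows = [
    {"patient_id": 1, "Week_Ending_Date": date(2022, 3, 6),
     "REFERRAL_REQUEST_RECEIVED_DATE": date(2021, 11, 1)},
    {"patient_id": 2, "Week_Ending_Date": date(2022, 3, 6),
     "REFERRAL_REQUEST_RECEIVED_DATE": date(2022, 2, 14)},
    {"patient_id": 3, "Week_Ending_Date": None,
     "REFERRAL_REQUEST_RECEIVED_DATE": date(2022, 2, 20)},
]

on_list = present_during(rows, date(2022, 2, 1), date(2022, 3, 31))
referred = referred_during(rows, date(2022, 2, 1), date(2022, 3, 31))
```

The two filters pick out different patients for the same period, which is why the choice of date column matters.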

rebkwok commented 2 years ago

If a patient is present in the open pathways dataset, do we assume they are on a waiting list (RTT or otherwise)? The 4 statuses refer to RTT pathways ending, but I think that means they're still on a waiting list, just not an RTT one?

With the exception (maybe) of entries that have a cancelled date?

rebkwok commented 2 years ago

From the guidance, I think records with a cancellation date are still on the waiting list

4.4.2 Cancelled and rearranged appointments: A cancelled or rearranged appointment, either patient-initiated or provider-initiated will not in itself stop an RTT clock.

rebkwok commented 2 years ago

The simplest implementation for cohort-extractor would be to look at a snapshot of patients who are on a waiting list at a specific date. We would need to take a single reference date (e.g. 28 Feb 2022), and find the records with a week_ending_date for the next Sunday (unless the reference date is a Sunday itself). Assuming we can consider REFERRAL_REQUEST_RECEIVED_DATE as the first date a patient joined a waiting list, waiting list time is the difference between this date and the supplied reference date.
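For illustration, the date arithmetic described above could look like this, assuming Week_Ending_Date is always a Sunday and that REFERRAL_REQUEST_RECEIVED_DATE is the start of the wait. The function names are made up.

```python
from datetime import date, timedelta


def week_ending_sunday(reference_date):
    """Return the reference date if it is a Sunday, otherwise the following Sunday."""
    # date.weekday() numbers Monday as 0 and Sunday as 6.
    return reference_date + timedelta(days=6 - reference_date.weekday())


def days_waiting(referral_received, reference_date):
    """Waiting time in days between the referral date and the reference date."""
    return (reference_date - referral_received).days


reference = date(2022, 2, 28)  # a Monday
snapshot_week = week_ending_sunday(reference)  # date(2022, 3, 6)
```

Note that the wait is measured to the reference date itself, not to the Sunday used to pick out the snapshot.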

For patients who are on more than one waiting list (with any other matching filters applied), we'll need to select one record; we can use find_first_match_in_period to select the longest wait time (by earliest REFERRAL_REQUEST_RECEIVED_DATE).
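The "one record per patient, longest wait wins" rule amounts to taking the row with the earliest REFERRAL_REQUEST_RECEIVED_DATE for each patient. A plain-Python sketch of that selection, independent of how cohort-extractor would implement it; `patient_id` and `specialty` are placeholder fields and the rows are invented.

```python
from datetime import date


def longest_wait_per_patient(rows):
    """Pick, for each patient, the record with the earliest referral date."""
    chosen = {}
    for row in rows:
        current = chosen.get(row["patient_id"])
        if (current is None
                or row["REFERRAL_REQUEST_RECEIVED_DATE"] < current["REFERRAL_REQUEST_RECEIVED_DATE"]):
            chosen[row["patient_id"]] = row
    return chosen


# Invented snapshot rows: patient 1 is on two waiting lists.
snapshot = [
    {"patient_id": 1, "specialty": "S1", "REFERRAL_REQUEST_RECEIVED_DATE": date(2021, 12, 1)},
    {"patient_id": 1, "specialty": "S2", "REFERRAL_REQUEST_RECEIVED_DATE": date(2021, 9, 15)},
    {"patient_id": 2, "specialty": "S1", "REFERRAL_REQUEST_RECEIVED_DATE": date(2022, 1, 10)},
]

selected = longest_wait_per_patient(snapshot)
```

This assumes the referral date is populated on every row; records with a missing date would need handling before the comparison.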

Time on waiting list - this could be a filter or a return value (or both, I guess)

HelenCEBM commented 2 years ago

Some notes from the reporting guidance:

Patient pathway: A patient pathway is usually considered to be their journey from first contact with the NHS for an individual condition, through referral, diagnosis and treatment for that condition. For chronic or recurrent conditions, a patient pathway will continue beyond the point at which first definitive treatment starts, as it will include further treatment for the same condition. A person may therefore have multiple RTT periods (see Referral to treatment period) along one patient pathway.

Referral to treatment period: An RTT period is the time between a person’s referral to a consultant-led service, which initiates a clock start, and the point at which the clock stops for any of the reasons set out in the RTT national clock rules, for example the start of first definitive treatment or a decision that treatment is not appropriate.

A patient pathway identifier (PPID) should be assigned to a pathway arising from a referral for a particular condition where this is a referral within the scope of the RTT measure. At the beginning of the patient journey the first organisation receiving the referral should generate a Patient Pathway Identifier (which may be based on the Unique Booking Reference Number (UBRN)). This along with the Organisation Code of that organisation (the Organisation Code of the PPID Issuer) should be used consistently to record the unique identifier for the pathway. The clock start date should also be recorded. Where the patient’s RTT pathway or individual RTT periods within that pathway are delivered by more than one organisation, it is essential that the same PPID and Organisation Code of PPID Issuer are applied, in other words, they do not change even where the responsibility for patient care transfers to a different organisation.

note that where the initial referral was received via the NHS e-Referral Service and the UBRN is used as the basis of the PPID, then the organisation code of PPID Issuer is X09;

^ we should check whether the receiving org ID is useful as an org identifier or whether only the current org ID should be used for trust-level variation

HelenCEBM commented 2 years ago

If a patient is present in the open pathways dataset, do we assume they are on a waiting list (RTT or otherwise)? The 4 statuses refer to RTT pathways ending, but I think that means they're still on a waiting list, just not an RTT one?

With the exception (maybe) of entries that have a cancelled date?

Yes I think this is correct!

rebkwok commented 2 years ago

From Chris in this thread:

Just so you know, things like the referral identifier and pathway identifier are going to be pretty useless on their own. There are loads of NULLs and then combinations of all flavour of abbreviations of “not applicable” and “99999999”. There’s also a lot of classic excel issues - e.g. where the hospital team have clearly used excel as an interim to data upload and it’s converted long ids into XXXXE+7 type notation and so lost the identifier integrity.

Once we obfuscate these, you won't have any idea which ones are legitimate and which ones aren't.

This should mean that the simpler implementation we've discussed (looking at waiting list records at a particular snapshot date) is fine, but the more complex ideas (e.g. looking for patients who dropped off the waiting list during a period) will be difficult. We'll probably be able to identify pathways/referrals by start date and patient ID, but we won't be able to rely on referral ID to differentiate them.
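To get a sense of scale, the kinds of bad identifier Chris describes could be tallied on the raw values (i.e. before obfuscation) with something like the sketch below. The placeholder spellings and the regular expressions are guesses at what "all flavours" might include, not a list taken from the data.

```python
import re

# Guessed spellings of "not applicable"; the real data may contain others.
PLACEHOLDERS = {"", "na", "n/a", "not applicable", "none", "null"}
ALL_NINES = re.compile(r"^9+$")
# Excel-style scientific notation, e.g. 1.23457E+7.
SCI_NOTATION = re.compile(r"^\d+(\.\d+)?E\+?\d+$", re.IGNORECASE)


def classify_identifier(value):
    """Label a raw identifier as null, placeholder, excel_notation or plausible."""
    if value is None:
        return "null"
    text = value.strip()
    if text.lower() in PLACEHOLDERS or ALL_NINES.match(text):
        return "placeholder"
    if SCI_NOTATION.match(text):
        return "excel_notation"
    return "plausible"
```

A "plausible" label only means the value did not match a known bad pattern; it says nothing about whether the identifier is actually correct.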

robinyjpark commented 2 years ago

@iaindillingham – as part of the data validation pipeline, we envisioned that there would be two steps to implementing new data in OpenSAFELY.

Firstly, we would want to produce a schema and report data types and completeness. As examples, please see the ISARIC notebook and notes on the therapeutics data.

Secondly, further checks should be done to determine the meaning of each field, whether any fields contain sensitive information that should not be used, and discover any other unexpected features or limitations of the data. This can be done using the raw data plausibility checking functions that Helen wrote (documentation here, repo here, helpful Slack thread here).

iaindillingham commented 2 years ago

According to Chris in this thread, the waiting list data was added over the weekend of 23/24 July.

rebkwok commented 2 years ago

https://github.com/opensafely/data-exploration-notebooks/blob/main/waiting_lists/waiting_list_data_exploration.ipynb

Some first explorations of the 3 waiting list tables, using a modified version of Helen's notebooks.

My first concern is the missing values for Week_Ending_Date; according to the schema spreadsheet this is supposed to be "the Sunday of the week that the pathway relates to". I expected it to always be present, but there are >10 million missing values. There are also a lot of missing waiting list type values, and far more distinct waiting list types than I'd expected: the schema spreadsheet lists ORTT, IRTT, ONON and INON, but looking at the distinct values it seems these aren't constrained, and we've got values like "unkn" and "Not-" as well as nulls.
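A minimal sketch of the kind of tally behind that observation: split waiting list type values into missing, expected (the four codes in the schema spreadsheet) and unexpected, and keep counts of the unexpected ones. The sample values are only the examples mentioned above, not real frequencies.

```python
from collections import Counter

# The four codes listed in the schema spreadsheet.
EXPECTED_TYPES = {"ORTT", "IRTT", "ONON", "INON"}


def summarise_waiting_list_types(values):
    """Return (summary counts, counts of each unexpected value)."""
    summary = Counter()
    unexpected = Counter()
    for value in values:
        if value is None:
            summary["missing"] += 1
        elif value in EXPECTED_TYPES:
            summary["expected"] += 1
        else:
            summary["unexpected"] += 1
            unexpected[value] += 1
    return summary, unexpected


summary, unexpected = summarise_waiting_list_types(
    ["ORTT", None, "unkn", "Not-", "IRTT", "unkn"]
)
```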

brianmackenna commented 2 years ago

some scratch notes from meeting the WL MDS team https://docs.google.com/document/d/1Y4keZ51WDs-DE2PyLL2XOs5ju9opphBhpYr7aczFtYc/edit
