ThomUK / SPCreporter

Creates Metric Reports using Statistical Process Control in the NHS style
https://thomuk.github.io/SPCreporter/

time between calculations should include initial event (with a value of zero days since), not omit it #105

Closed francisbarton closed 8 months ago

francisbarton commented 10 months ago

Currently the process seems to omit the first event in the list, and returns a value only from event 2 onwards (days since event 1). But we would want an SPC chart to still include event 1, as it might be an important event to plot.
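To illustrate the behaviour described above, here is a minimal Python sketch of a time-between calculation. It is not SPCreporter's actual code (the package is written in R), and the function name `days_between` and the sample dates are made up for illustration. It shows why the first event produces no row: it has no predecessor to measure from.

```python
from datetime import date


def days_between(event_dates):
    """Return (event_date, days_since_previous_event) pairs.

    Hypothetical helper: the earliest event has no predecessor,
    so it contributes no row of its own.
    """
    ordered = sorted(event_dates)
    # Pair each event with the one before it; the first event is only
    # ever used as the "previous" date for the second event.
    return [(curr, (curr - prev).days) for prev, curr in zip(ordered, ordered[1:])]


# Illustrative dates only, not real data.
events = [date(2023, 1, 10), date(2023, 2, 9), date(2023, 4, 10)]
rows = days_between(events)

for event_date, gap in rows:
    print(event_date, gap)
```

Three events give two plotted points, so the event on 10 January never appears on the chart.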

francisbarton commented 10 months ago

My initial code for time-between calculations (written in DH) also required a start_dttm (start of reporting period), so that even the first event could have a "days since" value (that is, days since the start of the reporting period).
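A rough Python sketch of the start-of-period variant described above might look like the following. This is not the original DH code, which is not shown in this thread; the function name `days_since_with_start`, the parameter `period_start` (standing in for `start_dttm`), and the dates are all assumptions for illustration.

```python
from datetime import date


def days_since_with_start(event_dates, period_start):
    """Return (event_date, days_since) pairs, one per event.

    Hypothetical helper: the first event is measured from
    period_start, later events from the event before them.
    """
    ordered = sorted(event_dates)
    # The "previous" date for the first event is the start of the
    # reporting period rather than an earlier event.
    previous = [period_start] + ordered[:-1]
    return [(curr, (curr - prev).days) for prev, curr in zip(previous, ordered)]


# Illustrative dates only, not real data.
events = [date(2023, 1, 10), date(2023, 2, 9), date(2023, 4, 10)]
rows = days_since_with_start(events, period_start=date(2023, 1, 1))

for event_date, gap in rows:
    print(event_date, gap)
```

Every event now gets a point, but the first value depends entirely on where the reporting window happens to start.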

ThomUK commented 8 months ago

I'd like to do some research on this to understand what other tools do (e.g. NHSE MDC).

My starting point is that I'm not keen on the position of the first point being defined nominally by the start of the reporting window - I think there is a risk of "signal" being forced into a plot that isn't in the underlying data.

On the other hand, I can see the disadvantage of the first data point being effectively hidden, which is a bigger problem for "very rare" events than it is for "rare" events.

ThomUK commented 8 months ago

I have researched the behaviour of the NHSE MDC "T-chart" Excel tool. That tool also drops the first point, which is used only to calculate the y-axis value of the 2nd event.

I'm going to leave the behaviour as it currently is because:

  1. We then remain consistent with NHSE MDC team.
  2. Including the first event with a nominal y value (either 0 or based on the start of the reporting period) will still influence the calculation of mean and process limits. It could also flag as special-cause which would be potentially misleading.
  3. I don't believe there is harm in dropping the first point, because points at the left-hand side of the time series are less important than those at the right-hand end, which is where the signal and discussion should be happening.

Leaving this comment here for others to pick up on in future if required.
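As a small illustration of point 2 in the list above, the Python sketch below compares the mean of a set of intervals with and without a nominal first value of 0. The interval values are invented, and only the mean is shown; how the process limits shift depends on the limit calculation the chart uses, which is not reproduced here.

```python
from statistics import mean

# Illustrative intervals (days between events), not real data.
intervals = [30, 60]

# The same series with event 1 included at a nominal value of 0 days.
intervals_with_nominal_first = [0] + intervals

mean_without = mean(intervals)
mean_with = mean(intervals_with_nominal_first)

print(mean_without, mean_with)
```

With so few points, a single nominal value moves the centre line noticeably, which is the effect the comment is wary of for rare events.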