nhs-r-community / NHSRplotthedots

An SPC package to support NHSE/I 'Making Data Count' programme
https://nhs-r-community.github.io/NHSRplotthedots/

Exceptional points should be removed from calculation of both limits and mean #152

Closed ThomUK closed 1 year ago

ThomUK commented 2 years ago

Current state: Exceptional points (those placed outside the process limits) are removed from the calculation of the process limits, but they are still included in the calculation of the mean. This places the mean line off-centre when exceptional points are present.

Desired state: Exceptional points should be removed from the calculation of both the process limits and the mean. The mean remains centred between process limits.

ThomUK commented 2 years ago

@chrismainey @johnmackintosh This issue came in via the MDC team. Do you have any thoughts / guidance?

tomjemmett commented 2 years ago

From memory, the logic of screening outliers is to remove them from the moving range calculation, not to remove them as points from the process. Unless the logic is implemented incorrectly relative to the paper it is based on, I don't think we should remove these points from the calculation of the mean.

ThomUK commented 2 years ago

Some of the confusion may be my inaccurate language. I'll see if the original requestor can be invited to this conversation.

johnmackintosh commented 2 years ago

Just seeing this. If the moving range calculation is being amended (by correctly removing the 'outliers'), then everything downstream from that (including the mean) would be revised. You only do this step once, though. I will need to double check, as it's been a while, but my initial reaction is that removing these points from the MR calcs while leaving them in the mean calculation is incorrect.

Will update once I've had a chance to look into it

tomjemmett commented 2 years ago

Looking at the code from qicharts2: the file is R/helper.functions.R, the function is qic.i, line 139.

The mean used there is x$cl, created on line 142 (if you don't manually provide a cl). This value isn't updated after screening for outliers, so at least in that implementation the mean isn't recalculated.

What is recalculated is the moving range (mr) and the average moving range (amr), on lines 152 and 153.

This is exactly how we calculate it in ptd_spc_standard(): https://github.com/nhs-r-community/NHSRplotthedots/blob/60a56a632e46a8bd550e344fc5193ec7ac88f272/R/ptd_spc_standard.R#L77

johnmackintosh commented 2 years ago

@tomjemmett Thanks Tom, you're right of course: the mean does not get recalculated, nor does the moving range, but the average moving range does, and therefore both the UL & LL would also be updated.

Here's what I understood it to be (from the Health Care Data Guide):

1. Calculate moving ranges.
2. Calculate average moving range.
3. Calculate UL.
4. Screen and remove points above UL.
5. Recalculate average moving range.
6. Calculate mean (using all data points, including outliers).
7. Recalculate UL & LL using new average moving range.
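
For anyone reading along, here is a rough sketch of those steps as I read them. It is written in Python for illustration only and is not the package's R code. The function name `xmr_limits` is made up, and the constants 3.267 (upper limit for moving ranges) and 2.66 (process limit multiplier) are the usual XmR chart constants, which nobody in this thread stated explicitly, so treat them as assumptions. It also assumes that the "UL" in steps 3 and 4 is the upper limit for the moving ranges, so screening drops moving ranges rather than data points.

```python
from statistics import mean


def xmr_limits(values, screen_outliers=True):
    """Hypothetical sketch of XmR limits with one-pass moving range screening."""
    if len(values) < 2:
        raise ValueError("need at least two points to form a moving range")

    # Steps 1-2: moving ranges between consecutive points, and their average.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    amr = mean(moving_ranges)

    if screen_outliers:
        # Step 3: upper limit for the moving ranges (assumed constant 3.267).
        mr_upper_limit = 3.267 * amr
        # Steps 4-5: drop moving ranges above that limit, once only,
        # then recalculate the average moving range.
        screened = [mr for mr in moving_ranges if mr <= mr_upper_limit]
        amr = mean(screened)

    # Step 6: the mean uses all data points, including outliers.
    centre = mean(values)

    # Step 7: process limits from the (possibly screened) average moving range
    # (assumed constant 2.66).
    return {
        "mean": centre,
        "amr": amr,
        "lower": centre - 2.66 * amr,
        "upper": centre + 2.66 * amr,
    }
```

Calling `xmr_limits(values)` and `xmr_limits(values, screen_outliers=False)` on a series with one large spike should give the same mean in both cases but narrower limits in the screened case, which is the behaviour described above.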

ThomUK commented 1 year ago

Closing this as the person raising the question/concern did not join the discussion, and we are all happy with the current logic (as summarised by John's last comment).