Closed ThomUK closed 1 year ago
@chrismainey @johnmackintosh This issue came in via the MDC team. Do you have any thoughts / guidance?
From memory, the logic of screening outliers is to remove them from the moving range calculation, not to remove them as points from the process. Unless the logic is implemented incorrectly relative to the paper it is based on, I don't think we should remove these points from the calculation of the mean.
Some of the confusion may be my inaccurate language. I'll see if the original requestor can be invited to this conversation.
Just seeing this. If the moving range calculation is being amended (by correctly removing the 'outliers'), then everything downstream from that (including the mean) would be revised. You only do this step once though. Will need to double check, as it's been a while, but my initial reaction is that removing these points from the MR calcs but leaving them in the mean calculation is incorrect.
Will update once I've had a chance to look into it
looking at code from qicharts2 - file is R/helper.functions.R, function is qic.i, line 139.
the mean used there is x$cl, created on line 142 (if you don't manually provide a cl). this value isn't updated after screening for outliers, so at least in that implementation the mean isn't recalculated
what is recalculated is the moving range (mr) and the average moving range (amr) on lines 152 and 153.
this is exactly the same as how we calculate in ptd_spc_standard()
https://github.com/nhs-r-community/NHSRplotthedots/blob/60a56a632e46a8bd550e344fc5193ec7ac88f272/R/ptd_spc_standard.R#L77
@tomjemmett thanks tom - you're right of course, the mean does not get recalculated, nor does the moving range, but average moving range does, and therefore both the UL & LL would also be updated.
here's what I understood it to be (from the healthcare data guide)
1. calculate moving ranges
2. calculate average moving range
3. calculate UL
4. screen and remove points above UL
5. recalculate average moving range
6. calculate mean (using all data points, including outliers)
7. recalculate UL & LL using new average moving range
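The steps above can be sketched roughly as follows. This is a Python illustration only, not the R code from NHSRplotthedots or qicharts2. The function name `xmr_limits`, the sample data, and the use of the standard XmR constants (3.267 for the moving range limit, 2.66 for the process limits) are assumptions for the example and are not taken from this thread.

```python
from statistics import mean


def xmr_limits(values, screen_outliers=True):
    """Return (centre, lower, upper) for an XmR chart (hypothetical helper)."""
    # Moving ranges: absolute differences between consecutive points.
    mr = [abs(b - a) for a, b in zip(values, values[1:])]
    amr = mean(mr)

    if screen_outliers:
        # Upper limit for the moving ranges (assumed constant: 3.267).
        mr_ul = 3.267 * amr
        # Screen once only: drop moving ranges above the limit,
        # then recalculate the average moving range.
        amr = mean(r for r in mr if r <= mr_ul)

    # The mean uses all data points, including any outliers.
    centre = mean(values)

    # Process limits use the (possibly recalculated) average moving range
    # (assumed constant: 2.66).
    return centre, centre - 2.66 * amr, centre + 2.66 * amr


# Made-up data with one exceptional point (30).
values = [10, 12, 11, 13, 12, 30, 11, 12, 10, 13]
screened = xmr_limits(values, screen_outliers=True)
unscreened = xmr_limits(values, screen_outliers=False)
print("screened:  ", screened)
print("unscreened:", unscreened)
```

In this sketch the centre line is the same with or without screening; only the width of the limits changes, because screening affects the average moving range and nothing else.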
Closing this as the person raising the question/concern did not join the discussion, and we are all happy with the current logic (as summarised by John's last comment).
Current state: Exceptional points (those placed outside the process limits) are removed from the calculation of the process limits, but they are still included in the calculation of the mean. This places the mean line off-centre when exceptional points are present.
Desired state: Exceptional points should be removed from the calculation of both the process limits and the mean. The mean remains centred between process limits.