Api BMI response not returning any errors for invalid measurements

ryanlewis94 commented 8 months ago

This is more of an issue with the API than with the chart. All other measurements return the standard measurement response with some error fields populated with helpful messages if the user enters a measurement below or above the constants. This helps us determine whether the response should be passed into the chart or whether we should display a warning message to the user. However, when entering values above or below the constants for BMI, I have noticed that the measurement response comes back with no errors in any of the error nodes, meaning our application deems the response correct and will try to pass it into the chart component. This can be worked around with more validation on our input form to stop users entering values that are too high or low in the first place, but I was wondering whether this is as designed or a known issue.

eatyourpeas commented 8 months ago

BMI validation is tricky and I might ask @statist7 and Charlotte Wright to comment here. BMI that plots in the +5 and +6 SDS category, which should be vanishingly uncommon, is in fact now common in clinical practice, so setting sensible upper limits is possibly not as easy as it might seem. Perhaps we could add this to the list of items to discuss at the next project board.

statist7 commented 8 months ago

Agreed. But if either height or weight gets flagged as being dubious, should BMI be flagged automatically too?

raspberrycodes commented 7 months ago

Following discussion with @pacharanero on 29-Feb-2024.

Here is some extra information about what we are seeing and what we 'expect' in the BMI response based upon what we observe in the responses for the other measurement types. In the examples I will just use Height because the others appear to behave similarly.

Based on the Height request, if we input a value that is below the minimum limit (2cm in this example), then the API will respond with its 200 and the error message "The height/length you have entered is very low and likely to be an error. Are you sure you meant a height of 2.0 centimetres?" in the node child_observation_value/observation_value_error.

We also get a message on the following nodes:

plottable_data/centile_data/chronological_decimal_age_data/observation_error
plottable_data/centile_data/corrected_decimal_age_data/observation_error

When we send a BMI 'outside the range' specified in validation_constants.py (7.5 - 105), we do not get any messages similar to what you can see above on any of the '_error' nodes. I would expect to see some kind of message saying 'this BMI is outside of the range (too big/small) are you sure?' in the following nodes:

child_observation_value/observation_value_error
plottable_data/centile_data/chronological_decimal_age_data/observation_error
plottable_data/centile_data/corrected_decimal_age_data/observation_error

...so essentially the same as the Height, Weight, and OFC. :D
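
A minimal sketch of how a client might check these nodes, assuming the response has already been parsed into a Python dict that nests exactly along the paths listed above. The names `ERROR_PATHS`, `get_node` and `collect_errors`, and the trimmed sample payload, are hypothetical and only for illustration; they are not part of the API.

```python
from typing import Any, Optional

# Node paths quoted in this issue; everything else here is illustrative.
ERROR_PATHS = [
    "child_observation_value/observation_value_error",
    "plottable_data/centile_data/chronological_decimal_age_data/observation_error",
    "plottable_data/centile_data/corrected_decimal_age_data/observation_error",
]


def get_node(response: dict, path: str) -> Optional[Any]:
    """Walk a '/'-separated path through nested dicts; None if any step is missing."""
    node: Any = response
    for key in path.split("/"):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def collect_errors(response: dict) -> dict:
    """Return {path: message} for every error node that is populated."""
    errors = {}
    for path in ERROR_PATHS:
        message = get_node(response, path)
        if message:  # skips None and empty strings
            errors[path] = message
    return errors


# Hypothetical, heavily trimmed payload: one node populated, two left null.
sample_response = {
    "child_observation_value": {
        "observation_value_error": (
            "The height/length you have entered is very low and likely to be "
            "an error. Are you sure you meant a height of 2.0 centimetres?"
        ),
    },
    "plottable_data": {
        "centile_data": {
            "chronological_decimal_age_data": {"observation_error": None},
            "corrected_decimal_age_data": {"observation_error": None},
        }
    },
}

# A client could treat "no populated error nodes" as "safe to plot".
safe_to_plot = not collect_errors(sample_response)
```

With the behaviour described in this issue, an out-of-range BMI would leave all three nodes empty, so a check like this would report the value as safe to plot.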

I hope this is useful! I understand that performing validation on calculated BMI values is a lot more complicated than one would hope because the standard deviations can scale logarithmically. We are currently thinking about how we would like to validate our height, weight, and BMI fields on our front-end implementation so we can handle inputs client side without even having to send values that are out of bounds to the API.

eatyourpeas commented 5 months ago

I am so sorry to be late to reply here. The general principle for error handling has been to avoid hard stops except for the craziest values, but attach suitable warnings in the response to those values that are very improbable and allow them through.

The difficulty with BMI is that very positive SDS scores (which should be vanishingly rare) are actually in the modern day quite common. In my own weight management service for children, it is not that uncommon to meet children whose BMI is over 60 kg/m2. So setting impossible values here is not so easy, though for sure we could think of some.

My preference for all these values, rather than having hard cut-offs as we currently do (lifted largely from the Guinness Book of Records), is to set an upper or lower SDS value and set the warning there.

@statist7 what do you think about this approach? And if so, where should we set our warning threshold? Should we set different cut offs for different measurement methods, or use the same for each? Perhaps we could organise a brief project board meeting to sign off some numbers?

statist7 commented 5 months ago

It's often better to set cut-offs based on z-scores rather than measurements, as they adjust for age. But BMI is tricky, as you say, because values now can get so large.

There's another subtle point. Because BMI is so skew, the relationship between BMI and z-score at a particular age is markedly nonlinear and tends to a plateau at high BMI. So you can input, say, BMI = 1000000 and past age 8 the z-score comes back in the relatively low range 5-6. But it's higher at younger ages, so it's also strongly age-dependent.

This is a naive question, but might BMI be missing the error checks because it is not an input variable, but is calculated from height and weight?

pacharanero commented 4 months ago

Thanks for your time @raspberrycodes @ryanlewis94 on the call today with myself, @mbarton, Susan Hansford and the DHCW team.

I think we had possibly not fully understood which entity was missing validation. Much of the foregoing discussion is regarding validation limits on the BMI response, whereas I think re-reading the original issue it is more about the absence of a validation response for the customer-supplied height and weight input values.

Height and weight validation limits

I think we should just check that the input height and weight are validated in the BMI API endpoint identically to how they are validated on the height and weight API endpoints. If this is not happening and observation_value_error: is null then it's likely a simple fix to supply the validation response.

BMI validation limits

BMI validation limits are somewhat more complex and, as the preceding discussion outlines, we might have to implement z-score/SDS-based limits, which are likely to be more clinically useful. This is likely to take the form of a warning rather than an error and will say something like "The calculated BMI is >6 Standard Deviations above the mean - in some circumstances this can be a correct result, however we recommend checking the input height, weight and patient age carefully".

ACTION: @pacharanero @mbarton @eatyourpeas to review API validation responses/errors for ht/wt/BMI and will add more information to this issue when we have done so.

eatyourpeas commented 4 months ago

I am thinking about an implementation currently. @statist7 in keeping with previous conversations I am inclined not to reject any requests, however crazy, but to return advisories based on SDS. If we agree that, how would you feel about warnings to check measurements for all measurement types at +/-5 SDS?

statist7 commented 4 months ago

I agree on advisories, but don’t feel constrained to have symmetric SDS cut-offs. +5 is good, but -3 or -4 would be better than -5, which represents serious thinness.

Best wishes, Tim

eatyourpeas commented 4 months ago

Absolutely - I agree. -3 is, I think, a line on the WHO charts and is a big concern for any measurement type really, but particularly BMI/weight. I would have thought, though, that values down there are still clinically plausible, so we would not be redirecting users to check accuracy of measurement at that point. Maybe at -4 we should suggest rechecking measurements for all measurement types, and maybe at +4 for height and OFC, and +5 for weight and BMI? These would include statements along the lines of 'This value is well outside the normal range.'

statist7 commented 4 months ago

Sounds good.

I've recently been working with anthropometry data for children with rare diseases, and I included a step to flag values outside ±3 SDS using a disease-specific reference. This worked well in spotting typos, but it also picked up a few valid values. So I've increased the cut-off to ±3.5, which seems close to optimal for distinguishing between the two.

eatyourpeas commented 1 month ago

Update to this issue - discussed yesterday at project board. The general principle of not rejecting very implausible values was upheld, but there was an acceptance that we should prescribe an SDS cut off at which measurements should be considered errors and be rejected. A smaller working group (thank you @statist7 and Charlotte Wright) will work on this issue with @eatyourpeas to define what this threshold would be.

It would be implemented in the API and therefore have no implications for the charting component. I will therefore move this issue to the RCPCHGrowth-python repo.

eatyourpeas commented 1 month ago

Subsequent meeting between @statist7, Prof Charlotte Wright and @eatyourpeas

Aim was to identify two SDS thresholds:

advisory threshold - to signpost to user that measurement highly unusual but still possible
error threshold - to signpost an extreme value to be rejected

Summarised nicely by Prof Cole (@statist7) here:

Hi both,

Thanks for the fascinating meeting today.

I attach a plot of BMI-zscore versus BMI by age and sex for a series of high and low BMI values. It shows that there is a ceiling for BMI z-score at all ages – even with BMI as high as 1000 the z-score is capped – and at the most extreme age, around 11 years, the z-score is below 5 in boys and 6 in girls. This means that a high BMI z-score cannot be used as the error cut-off to detect infeasibly large BMI values, because higher z-scores are impossible at certain ages.

At the other end of the spectrum, low values of BMI that are feasible in anorexia nervosa lead to very extreme z-scores, below -10 for older teenagers. This represents the opposite pattern to that for high BMI, that z-scores are over-sensitive to low BMI. This contrast between low and high BMI arises from the skewness in the BMI distribution, whereby the centiles on the BMI chart are close together for low centiles and much further apart for high centiles.

Incidentally if you think the plot would work better with different BMI values do let me know – it’s easy to update.

My take from the meeting is that an advisory cut-off of ±4 z-scores would work well for height, weight and OFC, with an error cut-off of ±8 (or possibly a bit higher). However BMI needs handling differently, which is easy to do as it is calculated. The advisory cut-off of +4 is probably ok, with an error cut-off of +8, though we should recognise from the above that the error cut-off will be useless over age 5. For low BMI the error cut-off needs to be lower than -8 and perhaps even -10.

For BMI there clearly need to be absolute cut-offs as well as z-score cut-offs, and The Guinness Book of Records seems to be a good source for them.

Incidentally, if BMI is flagged as aberrant, can the code distinguish between the four possibilities: height faulty, weight faulty, both faulty, or neither faulty? If so it would need feeding back with the error message.

Best wishes, Tim

Just one further point – the existence of the cap for BMI z-score arises directly from the LMS formula:

$$\text{Centile}_{100\alpha} = M \, (1 + L S z_\alpha)^{1/L}$$

The bracketed term needs to be positive, otherwise it can’t be raised to a power (unless L = 1, e.g. for height). So in the limit

$$1 + L S z_\alpha = 0$$

or

$$z_\alpha = -\frac{1}{L S}$$

This defines the value of the z-score cap, which I’ve added to my previous plot as BMI = Infinity – see attached.
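
As a rough numerical illustration of this cap, the sketch below inverts the LMS formula to get a z-score from a measurement and compares it with the limit -1 / (L S). The function names and the L, M, S values are made up for the example (a negative L is assumed, to mimic a right-skewed measure such as BMI); they are not taken from any growth reference or from the rcpchgrowth code.

```python
import math


def lms_z_score(value: float, power: float, median: float, cv: float) -> float:
    """Invert the LMS formula: power is L, median is M, cv is S."""
    if power == 0:
        # Limiting case of the Box-Cox transform when L is exactly zero.
        return math.log(value / median) / cv
    return ((value / median) ** power - 1) / (power * cv)


def z_score_cap(power: float, cv: float) -> float:
    """Limit z = -1 / (L S); an upper bound when L is negative. Undefined for L == 0."""
    return -1 / (power * cv)


# Made-up L, M, S values for illustration only; NOT from any growth reference.
L_EXAMPLE, M_EXAMPLE, S_EXAMPLE = -1.6, 17.0, 0.12

cap = z_score_cap(L_EXAMPLE, S_EXAMPLE)
z_scores = []
for bmi in (30, 60, 1000, 1_000_000):
    z = lms_z_score(bmi, L_EXAMPLE, M_EXAMPLE, S_EXAMPLE)
    z_scores.append(z)
    print(f"BMI {bmi:>9}: z = {z:.4f} (cap = {cap:.4f})")
```

However large the BMI, the computed z-score approaches the cap from below and never exceeds it, which is why a high z-score threshold alone cannot catch infeasibly large values at ages where the cap is low.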

eatyourpeas commented 1 month ago

Decisions made so far, in summary, for height, weight and head circumference:

[ ] to implement error threshold ± 8
[ ] to implement advisory threshold ± 4

BMI is a special case, however. Because it is a calculated value and behaves differently at extremes, it may not be possible to use these thresholds, especially with respect to measures much in excess of ± 6, as nicely described above. It is likely we will need here a combination of SDS cut-offs for prepubertal years, and maybe values from the Guinness Book of Records or a similar standard thereafter. We have another meeting booked for later in the year to settle this final issue.
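
A possible shape for the height, weight and head circumference thresholds above is sketched here. It is not the eventual implementation: the function name, the status labels, the measurement method strings and the choice to treat the thresholds as inclusive are all assumptions, and BMI is deliberately left undecided because its thresholds have not yet been agreed.

```python
from typing import Literal

# Thresholds as summarised above for height, weight and head circumference.
ADVISORY_SDS = 4.0
ERROR_SDS = 8.0

# Assumed method names; BMI is excluded until its thresholds are agreed.
SDS_CHECKED_METHODS = {"height", "weight", "ofc"}

Status = Literal["ok", "advisory", "error", "undecided"]


def classify_sds(measurement_method: str, sds: float) -> Status:
    """Classify an SDS against the symmetric advisory and error thresholds."""
    if measurement_method not in SDS_CHECKED_METHODS:
        return "undecided"
    magnitude = abs(sds)
    if magnitude >= ERROR_SDS:  # inclusive boundary is an assumption
        return "error"
    if magnitude >= ADVISORY_SDS:
        return "advisory"
    return "ok"
```

For example, `classify_sds("weight", -4.5)` would return `"advisory"`, while any SDS passed with `"bmi"` returns `"undecided"` until the BMI rules are settled.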

A supplementary decision was made to consider another jointly-authored paper to highlight this vulnerability of SDS specifically in BMI.

Once we have final consensus on the thresholds as applied to BMI, I will implement the changes and close this issue.

statist7 commented 1 month ago

Minor tweaks and updated image to the earlier post of @eatyourpeas.

Subsequent meeting between @statist7, Prof Charlotte Wright and @eatyourpeas

Aim was to identify two SDS thresholds:

advisory threshold - to signpost to user that measurement highly unusual but still possible
error threshold - to signpost an extreme value to be rejected

Summarised nicely by Prof Cole (@statist7) here:

BMI vs BMI z-score.pdf

Hi both,

Thanks for the fascinating meeting today.

I attach a plot of BMI-zscore versus BMI by age and sex for a series of high and low BMI values. It shows that there is a ceiling for BMI z-score at all ages – even with BMI at infinity the z-score is capped – and at the most extreme age, around 11 years, the corresponding z-score is below 5 in boys and 6 in girls. This means that a high BMI z-score cannot be used as the error cut-off to detect infeasibly large BMI values, because higher z-scores are impossible at certain ages.

At the other end of the spectrum, low values of BMI that are feasible in anorexia nervosa lead to very extreme z-scores, below -10 for older teenagers. This represents the opposite pattern to that for high BMI, that z-scores are over-sensitive to low BMI. This contrast between low and high BMI arises from the skewness in the BMI distribution, whereby the centiles on the BMI chart are close together for low centiles and much further apart for high centiles.

My take from the meeting is that an advisory cut-off of ±4 z-scores would work well for height, weight and OFC, with an error cut-off of ±8 (or possibly a bit higher). However BMI needs handling differently, which is easy to do as it is calculated. The advisory cut-off of +4 is probably ok, with an error cut-off of +8, though we should recognise from the above that the error cut-off will be useless over age 5. For low BMI the error cut-off needs to be lower than -8 and perhaps even -10.

For BMI there clearly need to be absolute cut-offs as well as z-score cut-offs, and The Guinness Book of Records seems to be a good source for them.

Incidentally, if BMI is flagged as aberrant, can the code distinguish between the four possibilities: height faulty, weight faulty, both faulty, or neither faulty? If so it would need feeding back with the error message.

Best wishes, Tim

Just one further point – the existence of the cap for BMI z-score arises directly from the LMS formula:

$$\text{Centile}_{100\alpha} = M \, (1 + L S z_\alpha)^{1/L}$$

The bracketed term needs to be positive, otherwise it can’t be raised to a power (unless L = 1, e.g. for height). So in the limit

$$1 + L S z_\alpha = 0$$

or

$$z_\alpha = -\frac{1}{L S}$$

This defines the value of the z-score cap, which corresponds to BMI = Infinity in the plot.

rcpch / rcpchgrowth-python

Api BMI response not returning any errors for invalid measurements #32
