rcpch / rcpchgrowth-python

A python package to produce calculations for all growth issues
GNU Affero General Public License v3.0

Thrive lines #15

Closed eatyourpeas closed 2 years ago

eatyourpeas commented 4 years ago

Interpreting growth velocity for different parameters is difficult because it varies with age. Tim Cole has previously published the concept of 'thrive lines', which graph growth velocity centiles at different ages, and suggested they be developed into acetate overlays to sit on top of the paper charts. This could reasonably be implemented electronically.

statist7 commented 4 years ago

Thrive lines provide a way to assess centile crossing for data already plotted on a growth chart. The plastic overlay flags growth curves that are crossing centiles too rapidly, up or down.

A simpler but related approach is to express an individual's change in z-score over time as a conditional velocity centile. It formally tests whether the degree of centile crossing is of concern. This can easily be calculated by the API when there are two or more measurement occasions.

I'm not sure how useful thrive lines are for assessing individual growth curves on screen, as they are designed for paper charts where the data are already plotted.
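As a rough illustration of the conditional velocity centile described above, the sketch below assumes the conditional-gain form zv = (z2 - r * z1) / sqrt(1 - r^2), where r is the correlation between z-scores at the two ages. The function name and the example values are hypothetical; in practice r would come from a reference correlation matrix.

```python
from math import sqrt
from statistics import NormalDist


def conditional_velocity_centile(z1: float, z2: float, r: float) -> float:
    """Return the velocity centile (0-100) for a change from z1 to z2.

    r is the correlation between z-scores at the two ages (assumed known).
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    # Conditional velocity z-score: how unusual z2 is, given z1.
    zv = (z2 - r * z1) / sqrt(1.0 - r * r)
    # Convert the velocity z-score to a centile on the standard normal scale.
    return 100.0 * NormalDist().cdf(zv)


# Illustrative values only: a fall from z = 0.0 to z = -1.0 with r = 0.9.
centile = conditional_velocity_centile(z1=0.0, z2=-1.0, r=0.9)
print(f"velocity centile: {centile:.1f}")
```

A low centile here would flag downward centile crossing that is faster than expected for a child starting at z1.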

eatyourpeas commented 4 years ago

Absolutely @statist7. I had envisaged lines which could be toggled on and off, overlaid on the chart, rather like the acetates. I see the conditional velocity centile as a separate innovation which should be opened as a separate issue. I will do this.

statist7 commented 3 years ago

Is now a good time to think of revisiting thrive lines? Where does the API stand with multiple measurement occasions?

eatyourpeas commented 3 years ago

I definitely have an appetite to have another crack at it. We should probably set up another call to run through it again.

We have commented out the endpoint that accepts multiple values but we could reanimate it in a development branch and play with it.

pacharanero commented 3 years ago

I would say that the API could probably have this added soon, yes. I probably could do with a call to have them explained again please because I have not used such tools in clinical practice and want to ensure we implement them in the best way.

I would say that the clean, technically neat way to do this is NOT to add a Multiple Measurement endpoint. I've said MANY times that this will hugely complicate the commercial pricing of the API and we really must not do it.

How to do thrive lines

  1. Add thrive lines calculation to the RCPCHGrowth Python package.
  2. Create a new API endpoint at something like POST {{baseUrl}}/growth/v1/utilities/thrive-lines and integrate the Python.
  3. The end-user system would have an array of previously persisted MeasurementObjects, and it would pass an array of these MeasurementObjects BACK into the thrive-lines API endpoint.
  4. The system would receive a return ThriveLine object with plottable thrive line data.
  5. It would pass this into the React Chart Component (which would be extended to be able to understand and display the thrive lines), which would display them.

Questions

  1. Are thrive lines specific to the UK-WHO references or does the same functionality apply equally to all references? (This determines where in the API namespace it should go.)
  2. Are there yet any standards or thought about how thrive lines should look? Should this be something we ask the Project board to opine on?

eatyourpeas commented 3 years ago

Rather than returning multiple measurements, perhaps it could accept multiple measurements and use them to calculate trends / acceleration / velocity / thrive centile etc.

pacharanero commented 3 years ago

That is exactly what I'm proposing, yes.

M


statist7 commented 3 years ago

I've just submitted a book chapter which has a section on thrive lines, attached here. It doesn't have any of the algebra though. Extract on thrive lines.docx

To answer your questions @pacharanero:

We need to think what the API will return. It could return sufficient information to draw the thrive lines, or alternatively the plotting could be viewed as the client's responsibility. An alternative would be for the API to return the velocity z-scores as calculated between pairs of measurements, with the pairs either restricted or not restricted to be adjacent. For example with measurements at months 1 to 6, one could return just 1-2, 2-3, 3-4 etc, or in addition 1-3, 2-4, 3-6 etc.
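To make the two pairing options concrete, here is a minimal sketch that computes velocity z-scores either for adjacent pairs only or for every pair of occasions. It assumes the conditional-gain form zv = (z2 - r * z1) / sqrt(1 - r^2); the function names, the `correlation` callable and `toy_correlation` are hypothetical placeholders, not part of the package or API.

```python
from itertools import combinations
from math import sqrt
from typing import Callable, List, Tuple

Measurement = Tuple[float, float]  # (age, z-score)


def velocity_z_scores(
    measurements: List[Measurement],
    correlation: Callable[[float, float], float],
    adjacent_only: bool = True,
) -> List[Tuple[float, float, float]]:
    """Return (age1, age2, velocity z-score) for each pair of occasions.

    Ages are assumed to be distinct, so that r < 1 for every pair.
    """
    ordered = sorted(measurements)
    if adjacent_only:
        pairs = zip(ordered, ordered[1:])  # 1-2, 2-3, 3-4, ...
    else:
        pairs = combinations(ordered, 2)  # also 1-3, 2-4, 3-6, ...
    results = []
    for (t1, z1), (t2, z2) in pairs:
        r = correlation(t1, t2)
        zv = (z2 - r * z1) / sqrt(1.0 - r * r)
        results.append((t1, t2, zv))
    return results


def toy_correlation(t1: float, t2: float) -> float:
    """Placeholder only: correlation decays with the gap between ages."""
    return 0.95 ** abs(t2 - t1)


# Illustrative z-scores at months 1 to 6 (invented values).
data = [(1, 0.2), (2, 0.1), (3, -0.1), (4, -0.3), (5, -0.4), (6, -0.6)]
adjacent = velocity_z_scores(data, toy_correlation, adjacent_only=True)
every_pair = velocity_z_scores(data, toy_correlation, adjacent_only=False)
print(len(adjacent), len(every_pair))
```

With six occasions the adjacent option gives five pairs and the unrestricted option fifteen, which is one factor in deciding what the API should return.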

statist7 commented 3 years ago

Here are skeleton specs for two functions needed to plot thrive lines:

  1. r = getcor(t1, t2, cormat) returns the correlation between measurements at ages t1 and t2, as obtained by interpolating within correlation matrix cormat.

  2. z = thriveline(t, z1, zv, cormat) returns a vector of n z-scores at ages specified by ordered n-vector t, where z1 is the first z-score and zv is the required velocity z-score. The second and later z-scores are defined by the recurrence relation z_i = z_{i-1} * r + zv * sqrt(1 - r^2) for i = 2, ..., n, where r = getcor(t_{i-1}, t_i, cormat).

To draw the thrive lines, plot z against t for thriveline called multiple times over a range of z1 values. The values of t would typically be equally spaced, and could alternatively be defined as a start and end age and a fixed interval between ages.
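The two specs above could be sketched in Python roughly as follows. This is a minimal illustration, not the package's implementation: it assumes cormat is a pair (grid ages, square matrix), that interpolation is bilinear, and it uses a toy correlation matrix with invented values.

```python
from bisect import bisect_right
from math import sqrt
from typing import List, Sequence, Tuple

# Assumed layout: (grid ages, correlation matrix indexed by those ages).
CorMat = Tuple[Sequence[float], Sequence[Sequence[float]]]


def _bracket(ages: Sequence[float], t: float) -> Tuple[int, int, float]:
    """Return grid indices (lo, hi) around t and the fractional position w."""
    if not ages[0] <= t <= ages[-1]:
        raise ValueError(f"age {t} is outside the correlation grid")
    hi = min(max(bisect_right(ages, t), 1), len(ages) - 1)
    lo = hi - 1
    w = (t - ages[lo]) / (ages[hi] - ages[lo])
    return lo, hi, w


def getcor(t1: float, t2: float, cormat: CorMat) -> float:
    """Correlation between ages t1 and t2 by bilinear interpolation."""
    ages, matrix = cormat
    i0, i1, wi = _bracket(ages, t1)
    j0, j1, wj = _bracket(ages, t2)
    top = matrix[i0][j0] * (1 - wj) + matrix[i0][j1] * wj
    bottom = matrix[i1][j0] * (1 - wj) + matrix[i1][j1] * wj
    return top * (1 - wi) + bottom * wi


def thriveline(t: Sequence[float], z1: float, zv: float, cormat: CorMat) -> List[float]:
    """z-scores along one thrive line, using the recurrence given above."""
    z = [z1]
    for prev_age, age in zip(t, t[1:]):
        r = getcor(prev_age, age, cormat)
        z.append(z[-1] * r + zv * sqrt(1.0 - r * r))
    return z


# Toy correlation matrix on a monthly grid (invented values, not reference data).
grid = [0, 1, 2, 3]
toy_cormat = (grid, [[0.9 ** abs(i - j) for j in grid] for i in grid])

# One line per starting z-score, all at the same velocity z-score.
lines = {z1: thriveline(grid, z1, -1.645, toy_cormat) for z1 in (-2, -1, 0, 1, 2)}
```

Plotting each list in `lines` against `grid` would give a family of thrive lines on the z-score scale; converting to measurements is a separate step.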

eatyourpeas commented 3 years ago

Moving this issue into the Python package, which I think is its more natural home. After some silence I have an initial implementation which has generated this, temporarily graphed using pyplot. I am not sure it is quite what you envisaged and I may have made a mistake.

image

The code is in the thrive_lines branch in dynamic_growth.py. There are 7 functions, beginning on line 129. The main function to focus on is:

def create_thrive_line(t: list, z1: float, target_centile: float = 5.0):

where t is a list of ages in intervals > 0 and < 1, z1 is the starting SDS, and target_centile is the velocity centile requested (equivalent to zv in your example above).

The function increments through the list of ages in steps of 2, using each pair of ages to look up the correlation in the correlation matrix. Where there is no exact match, bilinear interpolation is used. This in turn is plugged into the conditional_weight_gain equation provided above (and in the paper you provided) to generate a z-score, which is in turn used to calculate a measurement using the UK-WHO LMS tables. The measurements are collected in a list and plotted against age.

The lines are somewhat uneven, but the intervals I have used are monthly, and I also used the correlation matrix indexed in months, rather than the one in weeks, largely to keep things simple in the first instance. @statist7 let me know if I have understood the process correctly or if I have somehow missed a step.

statist7 commented 3 years ago

This looks promising. Looking at the code I'm guessing you've used the default 5th velocity centile, and it appears similar to my own 5th centile thrive lines.

You are right that the curves are not smooth, which I think is inevitable, and I smooth them to improve the appearance.

A refinement would be to restrict the thrive lines to within the range of the weight centiles at each age.

A minor inconsistency, you use both target_centile and centile_target in different places.

eatyourpeas commented 3 years ago

I have refactored target_centile. For the smoothing, I see you use a spline function, which must be from one of the R libraries. There will undoubtedly be something equivalent in scipy. I have put in the centile lines and trimmed the thrive lines to be within the margins as you say, but they still don't look right:

image

EDIT: here they are smoothed:

image

statist7 commented 3 years ago

You are correct, they don't look right. Which velocity centile are they?

Have you got the same time units for the centiles and the correlations?

We probably need a session to discuss - it would need to be 4 November or later for me.

eatyourpeas commented 3 years ago

I think maybe I am somewhat closer. These are 5th velocity centiles over one month for weights in boys. They are still not as beautiful as the ones you produced in R, so we should keep that meeting on the 10th, if that is ok, to tidy them up. My methodology is tighter this time - I think I missed a step before. For a collection of lines, the steps I go through are:

  1. create a list of ages 0-12 months each a month apart (t)
  2. set up a loop that increments a starting SD (z), which is used to create a thrive line on every pass (by calling the create_thrive_line function); the starting SD value for each line increments by 0.67 SD, starting at -20, ending at +20. Pass into the create_thrive_line function the list of ages, the sex (boys here), the starting z (z1) and the target_centile (default 5th)
  3. Each line returned is smoothed using a natural spline and rendered over the centile chart (this smoothing step currently breaks it, I think because of the None values)

The steps for each thrive line are:

  1. Each thrive line starts as an empty list, initialised with the SD passed in from above which forms z1 at t1 (first value in list t).
  2. The correlation (r) between the first and second ages (t1 and t2) in list t is found using the correlation matrix
  3. z2 is then calculated using r, z1 and the velocity z requested (zv - defaults to -1.644 [5th centile] if not supplied)
  4. z2 is then added to the list at t2.
  5. The process repeats, stepping through the list of ages, using the current and the next age to calculate the next z, until values for z have been calculated for each age in the list.
  6. If any zs are outside of the upper or lower limits of the centile chart these are discarded.
  7. These zs are then converted to weights in the standard way using the LMS tables.
  8. The final list of weights and ages is returned above for smoothing and plotting

Just for my peace of mind, I would be grateful if you could check that these steps are broadly correct.

image
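Steps 6 and 7 above could look something like the sketch below. It assumes the standard LMS back-transformation, measurement = M * (1 + L * S * z)^(1/L), with M * exp(S * z) when L is 0. The helper names and the LMS values are invented for illustration and are not taken from the UK-WHO tables or from dynamic_growth.py.

```python
from math import exp
from typing import List, Sequence, Tuple


def z_to_measurement(z: float, l: float, m: float, s: float) -> float:
    """Convert a z-score to a measurement using LMS parameters."""
    if l == 0:
        return m * exp(s * z)
    base = 1.0 + l * s * z
    if base <= 0:
        raise ValueError("z-score is too extreme for these LMS values")
    return m * base ** (1.0 / l)


def discard_outside_limits(
    ages: Sequence[float], zs: Sequence[float], lower: float, upper: float
) -> List[Tuple[float, float]]:
    """Keep only (age, z) points inside the chart limits.

    Dropping points, rather than replacing them with None, avoids passing
    None values to a later smoothing step.
    """
    return [(age, z) for age, z in zip(ages, zs) if lower <= z <= upper]


# Illustrative LMS values for a single age (invented, not reference data).
L_, M_, S_ = 0.2, 9.6, 0.11

kept = discard_outside_limits([0, 1, 2], [-2.0, -2.5, -3.1], lower=-2.67, upper=2.67)
weights = [(age, z_to_measurement(z, L_, M_, S_)) for age, z in kept]
```

In a real implementation the L, M and S values would be looked up (and interpolated) for each age and sex before converting.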

statist7 commented 3 years ago

Excellent! - that looks really good.

A refinement I developed was getting the thrive lines to stop on, rather than inside, the outer centiles. This way the range of the thrive lines defines the range of normal growth between the 0.4th and 99.6th centiles.
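One way to make a line stop on the outer centile, rather than inside it, is to interpolate the age at which the line crosses the limit and end the line there. The sketch below is only a guess at the idea: it works on the z-score scale, interpolates linearly between adjacent points, and assumes the line starts within the limits. The function name and example values are hypothetical.

```python
from statistics import NormalDist
from typing import List, Sequence, Tuple

# z-scores for the 0.4th and 99.6th centiles on the standard normal scale.
LOWER = NormalDist().inv_cdf(0.004)
UPPER = NormalDist().inv_cdf(0.996)


def stop_on_limits(
    ages: Sequence[float],
    zs: Sequence[float],
    lower: float = LOWER,
    upper: float = UPPER,
) -> List[Tuple[float, float]]:
    """Truncate a thrive line at its first exit, ending exactly on the limit."""
    if not lower <= zs[0] <= upper:
        raise ValueError("the line must start within the limits")
    points = list(zip(ages, zs))
    out = [points[0]]
    for (t0, z0), (t1, z1) in zip(points, points[1:]):
        if lower <= z1 <= upper:
            out.append((t1, z1))
            continue
        # The segment leaves the band: find where it meets the limit.
        limit = lower if z1 < lower else upper
        frac = (limit - z0) / (z1 - z0)
        out.append((t0 + frac * (t1 - t0), limit))
        break
    return out


# Illustrative line that falls through the 0.4th centile between ages 1 and 2.
trimmed = stop_on_limits([0, 1, 2, 3], [-2.0, -2.4, -2.9, -3.5])
```

The final point of `trimmed` lies on the lower limit, so the plotted line would end on the 0.4th centile curve once converted to a measurement.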

eatyourpeas commented 3 years ago

A step closer. Nearly there. image

statist7 commented 3 years ago

That looks really good.

Just a couple of queries/comments. Why do the thrive lines (largely) stop at age ~0.8 rather than 1.0?

And I note that the starting z-score values need to extend downwards, to fill in the low triangular region at birth.

pacharanero commented 3 years ago

Great stuff @eatyourpeas !

eatyourpeas commented 2 years ago

You are right @statist7 that it does not quite work. I will close this now as we have implemented the feature but it is not ready for production yet and still needs work.

mbarton commented 2 months ago

Hijacking the old thread to save this forum post where a dev is asking for an API for thrive lines: https://developer.community.nhs.uk/t/apis-for-thrive-line/561