pharmaverse / admiralpeds

Admiral Package Extension for Pediatric Clinical Trials
https://pharmaverse.github.io/admiralpeds/
Apache License 2.0
13 stars 3 forks source link

closes #51 closes #74 closes #75 documentation increase unit test coverage for derive params growth age #67

Closed Minlei0201 closed 3 months ago

Minlei0201 commented 4 months ago

Thank you for your Pull Request! We have developed this task checklist from the Development Process Guide to help with the final steps of the process. Completing the below tasks helps to ensure our reviewers can maximize their time on your code as well as making sure the admiral codebase remains robust and consistent.

Please check off each taskbox as an acknowledgment that you completed the task or check off that it is not relevant to your Pull Request. This checklist is part of the Github Action workflows and the Pull Request will not be merged into the main branch until you have checked off each task.

Minlei0201 commented 4 months ago

Several notes/questions on the unit tests: 1). Test cases for patients < 2 years old not added yet, since there is no P95 and sigma in metadata (which are required to derive extended BMI SDS and percentiles). - Not needed since extended BMI SDS and percentile only applies to CDC (patients >= 2 years old) 2). Test cases for out of bound age not added yet: the expected result for such case should be missing SDS and percentiles, correct? - Yes 3). Test cases for missing anthropometric values not added yet: the expected result for such case should be missing SDS and percentiles, correct? - Yes 4). Is there any differences in handling length vs. height? If so, such test cases will be included as well. - No, we rely on users to preprocess the data if the values are in length. For unit test, we don't need test cases with length.

rossfarrugia commented 4 months ago

@Minlei0201 the R CMD CI check fail looks to be as some of the tests expected and actual don't match (see https://github.com/pharmaverse/admiralpeds/actions/runs/9840494918/job/27164859357?pr=67). Maybe its because this update needs to be merged first and then hopefully they'll match: https://github.com/pharmaverse/admiralpeds/pull/60

Minlei0201 commented 4 months ago

@Minlei0201 the R CMD CI check fail looks to be as some of the tests expected and actual don't match (see https://github.com/pharmaverse/admiralpeds/actions/runs/9840494918/job/27164859357?pr=67). Maybe its because this update needs to be merged first and then hopefully they'll match: #60

Hi @rossfarrugia , do you mean creating a PR and merge #60 to #51 ? Currently there are some conflict in the unit testing file before I can merge. @zdz2101 , maybe you can remove this file so that I can merge your branch into mine? https://github.com/pharmaverse/admiralpeds/blob/59-derive_params_growth_age-add-interpolation/tests/testthat/test-derive_params_growth_age.R

rossfarrugia commented 4 months ago

Hi @rossfarrugia , do you mean creating a PR and merge #60 to #51 ? Currently there are some conflict in the unit testing file before I can merge. @zdz2101 , maybe you can remove this file so that I can merge your branch into mine? https://github.com/pharmaverse/admiralpeds/blob/59-derive_params_growth_age-add-interpolation/tests/testthat/test-derive_params_growth_age.R

@Minlei0201 no, don't merge Zelos's branch yet as its still under review. My thoughts were let's get his reviewed/approved and then merged to main, and then you could run a git merge main in this branch. It will lead to merge conflicts for the unit test script but then you could make a commit to blend yours and Zelos's updates for the unit tests. I think some of the tests he added likely complement your checks, so i think there is value in combining unless you think all your tests already cover everything Zelos added to his branch.

Minlei0201 commented 4 months ago

Hi @rossfarrugia , do you mean creating a PR and merge #60 to #51 ? Currently there are some conflict in the unit testing file before I can merge. @zdz2101 , maybe you can remove this file so that I can merge your branch into mine? https://github.com/pharmaverse/admiralpeds/blob/59-derive_params_growth_age-add-interpolation/tests/testthat/test-derive_params_growth_age.R

@Minlei0201 no, don't merge Zelos's branch yet as its still under review. My thoughts were let's get his reviewed/approved and then merged to main, and then you could run a git merge main in this branch. It will lead to merge conflicts for the unit test script but then you could make a commit to blend yours and Zelos's updates for the unit tests. I think some of the tests he added likely complement your checks, so i think there is value in combining unless you think all your tests already cover everything Zelos added to his branch.

Hi @rossfarrugia , I see. So the plan for me is to incorporate the tests from Zelos' branch, inform @zdz2101 to incorporate the most recent unit test code from #51 (to avoid conflict in the next step), and merge main into #51 after Zelos' #60 merge into main. Does that sound reasonable?

zdz2101 commented 4 months ago

Hi @rossfarrugia , do you mean creating a PR and merge #60 to #51 ? Currently there are some conflict in the unit testing file before I can merge. @zdz2101 , maybe you can remove this file so that I can merge your branch into mine? https://github.com/pharmaverse/admiralpeds/blob/59-derive_params_growth_age-add-interpolation/tests/testthat/test-derive_params_growth_age.R

@Minlei0201 no, don't merge Zelos's branch yet as its still under review. My thoughts were let's get his reviewed/approved and then merged to main, and then you could run a git merge main in this branch. It will lead to merge conflicts for the unit test script but then you could make a commit to blend yours and Zelos's updates for the unit tests. I think some of the tests he added likely complement your checks, so i think there is value in combining unless you think all your tests already cover everything Zelos added to his branch.

Hi @rossfarrugia , I see. So the plan for me is to incorporate the tests from Zelos' branch, inform @zdz2101 to incorporate the most recent unit test code from #51 (to avoid conflict in the next step), and merge main into #51 after Zelos' #60 merge into main. Does that sound reasonable?

This would work, might even be easier to instead of targeting the main branch to merge you can target 59-derive_params_growth_age-add-interpolation , the branch containing the new function changes. We can merge yours in beforehand, then once PR #60 is ready to merge it'll also already contain your unit-tests

Minlei0201 commented 4 months ago

@Minlei0201 If you take a look at the "new" test suite in #60 you'll see I've added multiple visits per patient here and there to demonstrate clearly how the by_vars argument needs to be used, if you want to just throw in a couple more lines for a particular subject, it would closely match mine

Hi @zdz2101 , I see, makes sense. I will incorporate these tests to my testing tomorrow.

Minlei0201 commented 4 months ago

Hi @rossfarrugia , do you mean creating a PR and merge #60 to #51 ? Currently there are some conflict in the unit testing file before I can merge. @zdz2101 , maybe you can remove this file so that I can merge your branch into mine? https://github.com/pharmaverse/admiralpeds/blob/59-derive_params_growth_age-add-interpolation/tests/testthat/test-derive_params_growth_age.R

@Minlei0201 no, don't merge Zelos's branch yet as its still under review. My thoughts were let's get his reviewed/approved and then merged to main, and then you could run a git merge main in this branch. It will lead to merge conflicts for the unit test script but then you could make a commit to blend yours and Zelos's updates for the unit tests. I think some of the tests he added likely complement your checks, so i think there is value in combining unless you think all your tests already cover everything Zelos added to his branch.

Hi @rossfarrugia , I see. So the plan for me is to incorporate the tests from Zelos' branch, inform @zdz2101 to incorporate the most recent unit test code from #51 (to avoid conflict in the next step), and merge main into #51 after Zelos' #60 merge into main. Does that sound reasonable?

This would work, might even be easier to instead of targeting the main branch to merge you can target 59-derive_params_growth_age-add-interpolation , the branch containing the new function changes. We can merge yours in beforehand, then once PR #60 is ready to merge it'll also already contain your unit-tests

Hi @zdz2101 , sounds good. Could you let me know when #59 is ready to merge?

rossfarrugia commented 4 months ago

@zdz2101 @Minlei0201 i merged https://github.com/pharmaverse/admiralpeds/pull/60 as me and @Fanny-Gautier had approved it today and that should make easier with this one, as you can merge main to here and just focus on resolving the unit test program merge conflicts here in this one branch.

Minlei0201 commented 4 months ago

Hi @rossfarrugia , @zdz2101 , I've merged main into #51 , and updated unit test to include byvar test cases. Currently there are a few mismatches/questions I'd like to discuss with you:

  1. The FL code doesn't produce SDS and percentile when patient's age is in month (see example in Test 1). Per our previous discussion, if patient's age is reported in months, it will be converted and rounded to the nearest age in days by multiplying 30.4375, and then apply the WHO/interpolated CDC data, is that correct?

  2. For out of bound age and/or missing anthropometric measures, if the user requests both SDS and percentile, to my understanding, the function is supposed to return two records with missing values (i.e., one for SDS and one for percentile). However, currently the function only returns one record with missing value (see Test 6 and 7). Is that what you expect?

rossfarrugia commented 4 months ago

@zdz2101 please do add anything, but my understanding of the 2 questions is:

  1. No age conversion happens in the function. If you supply age in months in the data then you should also supply age in months in the metadata. In our template and vignette we make the metadata consistently in days via conversion and interpolation for CDC, and then we derive current age in days. Some other company could choose to only supply metadata in months and therefore derive current age in months, and the function should still work (always matching to the nearest when falling in between).
  2. I would agree to have both parameters created as missing would make sense to me. Maybe this is an update needed to the function?
Minlei0201 commented 4 months ago

@zdz2101 please do add anything, but my understanding of the 2 questions is:

  1. No age conversion happens in the function. If you supply age in months in the data then you should also supply age in months in the metadata. In our template and vignette we make the metadata consistently in days via conversion and interpolation for CDC, and then we derive current age in days. Some other company could choose to only supply metadata in months and therefore derive current age in months, and the function should still work (always matching to the nearest when falling in between).
  2. I would agree to have both parameters created as missing would make sense to me. Maybe this is an update needed to the function?

Hi @rossfarrugia , thanks for your clarification!

  1. I've changed the age unit in days for the majority of tests, and added one test case for age in months.
  2. @zdz2101 , please let me know when you update the FL codes.
  3. One additional comment, the FL code does not seem to derive the extended BMI SDS and percentile for BMI > 95%, per the formula in the guidance: https://www.cdc.gov/growthcharts/extended-bmi-data-files.htm (see Test 5). @zdz2101 , could you look into it and let me know if there is any misunderstanding from my end?
rossfarrugia commented 4 months ago

@zdz2101 please can you re-review this and if you agree with the suggested function fixes from Laura then please make a commit in this branch and we can fix and test in one place.

github-actions[bot] commented 4 months ago

Code Coverage

Package Line Rate Health
admiralpeds 99%
Summary 99% (269 / 273)
zdz2101 commented 4 months ago
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(rlang)
devtools::install_github("pharmaverse/admiralpeds", ref = "51-documentation-increase-unit-test-coverage-for-derive_params_growth_age")
#> Downloading GitHub repo pharmaverse/admiralpeds@51-documentation-increase-unit-test-coverage-for-derive_params_growth_age
#> 
#> ── R CMD build ─────────────────────────────────────────────────────────────────
#> * checking for file ‘/tmp/RtmpU311IW/remotes17d76f4de89db8/pharmaverse-admiralpeds-c8bb9a5/DESCRIPTION’ ... OK
#> * preparing ‘admiralpeds’:
#> * checking DESCRIPTION meta-information ... OK
#> * checking for LF line-endings in source and make files and shell scripts
#> * checking for empty or unneeded directories
#> * building ‘admiralpeds_0.1.0.9025.tar.gz’
#> Warning: invalid uid value replaced by that for user 'nobody'
#> Warning: invalid gid value replaced by that for user 'nobody'
#> Installing package into '/home/zelos.zhu/R/x86_64-pc-linux-gnu-library/4.3'
#> (as 'lib' is unspecified)
library(admiralpeds)

vs_data <- tibble::tribble(
  ~STUDYID, ~USUBJID, ~VISIT, ~SEX, ~AGECUR, ~AGEU, ~VSTESTCD, ~VSSTRESN,
  "Study", "1001", "Screening", "F", 196.5, "months", "BMI", 58.4,
)

meta <- cdc_bmiage %>% mutate(AGEU = "months", SEX = ifelse(SEX == 1, "M", "F"))

actual <- derive_params_growth_age(
  dataset = vs_data,
  by_vars = exprs(STUDYID, USUBJID, VISIT),
  sex = SEX,
  age = AGECUR,
  age_unit = AGEU,
  meta_criteria = meta,
  parameter = VSTESTCD == "BMI",
  bmi_cdc_correction = TRUE,
  analysis_var = VSSTRESN,
  set_values_to_sds = exprs(
    PARAMCD = "BMIASDS"
  ),
  set_values_to_pctl = exprs(
    PARAMCD = "BMIAPCTL"
  )
)
actual
#> # A tibble: 3 × 10
#>   STUDYID USUBJID VISIT     SEX   AGECUR AGEU   VSTESTCD VSSTRESN   AVAL PARAMCD
#>   <chr>   <chr>   <chr>     <chr>  <dbl> <chr>  <chr>       <dbl>  <dbl> <chr>  
#> 1 Study   1001    Screening F       196. months BMI          58.4  NA    <NA>   
#> 2 Study   1001    Screening F       196. months BMI          58.4   4.89 BMIASDS
#> 3 Study   1001    Screening F       196. months BMI          58.4 100.   BMIAPC…

Created on 2024-07-16 with reprex v2.1.0

@CDC-DNPAO The package version within this branch contains the proper fix to your extended BMI inquiry, there was indeed a mistake found that should be addressed now, if we haven't merged it into main before any other inquires you have, you can access this version of the package using

devtools::install_github("pharmaverse/admiralpeds", ref = "51-documentation-increase-unit-test-coverage-for-derive_params_growth_age")

Minlei0201 commented 4 months ago

@zdz2101 please do add anything, but my understanding of the 2 questions is:

  1. No age conversion happens in the function. If you supply age in months in the data then you should also supply age in months in the metadata. In our template and vignette we make the metadata consistently in days via conversion and interpolation for CDC, and then we derive current age in days. Some other company could choose to only supply metadata in months and therefore derive current age in months, and the function should still work (always matching to the nearest when falling in between).
  2. I would agree to have both parameters created as missing would make sense to me. Maybe this is an update needed to the function?

Hi @rossfarrugia , thanks for your clarification!

  1. I've changed the age unit in days for the majority of tests, and added one test case for age in months.
  2. @zdz2101 , please let me know when you update the FL codes.
  3. One additional comment, the FL code does not seem to derive the extended BMI SDS and percentile for BMI > 95%, per the formula in the guidance: https://www.cdc.gov/growthcharts/extended-bmi-data-files.htm (see Test 5). @zdz2101 , could you look into it and let me know if there is any misunderstanding from my end?

Hi @zdz2101 , any update on number 2 above, regarding creating separate SDS and percentile records (eg, for out of bound age, or missing anthropometric values)?

CDC-DNPAO commented 4 months ago

Minlei - I stink at leaving comments on github. Could you send me your email, and I'll make sure that I include you when I send Ross and Zelos an email.

thanks, david

On Tue, Jul 16, 2024 at 9:59 PM Minlei0201 @.***> wrote:

@.**** commented on this pull request.

In tests/testthat/test-derive_params_growth_age.R https://github.com/pharmaverse/admiralpeds/pull/67#discussion_r1680304571 :

  • set_values_to_sds = exprs(
  • PARAMCD = "WGASDS",
  • PARAM = "Weight-for-age z-score"
  • ),
  • set_values_to_pctl = exprs(
  • PARAMCD = "WGAPCTL",
  • PARAM = "Weight-for-age percentile"
  • )
  • )
  • SD2pos <- (9.0342 (1 + 20.0868*0.10885) ^ (1/0.0868))
  • SD3pos <- (9.0342 (1 + 30.0868*0.10885) ^ (1/0.0868))
  • expected_sds <- 3 + (20.4 - SD3pos)/(SD3pos - SD2pos)
  • expected_pctl <- pnorm(expected_sds)*100
  • expect_equal(

Hi @zdz2101 https://github.com/zdz2101 , can you remind me which email you were referring to? I searched my inbox, but didn't find an email from David on right skewness adjustment.

— Reply to this email directly, view it on GitHub https://github.com/pharmaverse/admiralpeds/pull/67#discussion_r1680304571, or unsubscribe https://github.com/notifications/unsubscribe-auth/AY2XGPSEKA4GZJ5MMLNOIG3ZMXFZ3AVCNFSM6AAAAABKQD7VICVHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMZDCOBRG4YTQMJZHE . You are receiving this because you were mentioned.Message ID: @.***>

Minlei0201 commented 4 months ago

Minlei - I stink at leaving comments on github. Could you send me your email, and I'll make sure that I include you when I send Ross and Zelos an email. thanks, david On Tue, Jul 16, 2024 at 9:59 PM Minlei0201 @.> wrote: @*.* commented on this pull request. ------------------------------ In tests/testthat/test-derive_params_growth_age.R <#67 (comment)> : > + set_values_to_sds = exprs( + PARAMCD = "WGASDS", + PARAM = "Weight-for-age z-score" + ), + set_values_to_pctl = exprs( + PARAMCD = "WGAPCTL", + PARAM = "Weight-for-age percentile" + ) + ) + + SD2pos <- (9.0342 (1 + 20.08680.10885) ^ (1/0.0868)) + SD3pos <- (9.0342 (1 + 30.08680.10885) ^ (1/0.0868)) + expected_sds <- 3 + (20.4 - SD3pos)/(SD3pos - SD2pos) + expected_pctl <- pnorm(expected_sds)100 + + expect_equal( Hi @zdz2101 https://github.com/zdz2101 , can you remind me which email you were referring to? I searched my inbox, but didn't find an email from David on right skewness adjustment. — Reply to this email directly, view it on GitHub <#67 (comment)>, or unsubscribe https://github.com/notifications/unsubscribe-auth/AY2XGPSEKA4GZJ5MMLNOIG3ZMXFZ3AVCNFSM6AAAAABKQD7VICVHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMZDCOBRG4YTQMJZHE . You are receiving this because you were mentioned.Message ID: *@*.***>

Hi @CDC-DNPAO , thank you! Ross has forwarded me your email - I'll review the codes based on your email later today.

rossfarrugia commented 4 months ago

@zdz2101 i made a commit to fix a number of the remaining comments - please double check.

from what i can make of it now we just have 2 things left to fix here:

CDC-DNPAO commented 4 months ago

These L and S values are not hard-coded into the program, are they? I'm hoping this is simply an example for one child. When you calculate SD2, SD2, etc., you've got to use the WHO L and S values for a child of that specific sex and age.

On Thu, Jul 18, 2024 at 4:17 AM Ross Farrugia @.***> wrote:

@.**** commented on this pull request.

In tests/testthat/test-derive_params_growth_height.R https://github.com/pharmaverse/admiralpeds/pull/67#discussion_r1682414355 :

  • PARAMCD = "WGHSDS",
  • PARAM = "Weight-for-height Z-Score"
  • ),
  • set_values_to_pctl = exprs(
  • PARAMCD = "WGHPCTL",
  • PARAM = "Weight-for-height percentile"
  • )
  • )
  • sd2pos <- (3.3278 (1 + 2 -0.3521 * 0.08890)^(1 / -0.3521))
  • sd3pos <- (3.3278 (1 + 3 -0.3521 * 0.08890)^(1 / -0.3521))
  • sd2neg <- (3.3278 (1 - 2 -0.3521 * 0.08890)^(1 / -0.3521))
  • sd3neg <- (3.3278 (1 - 3 -0.3521 * 0.08890)^(1 / -0.3521))
  • expected_sds <- c(
  • 3 + (5 - sd3pos) / (sd3pos - sd2pos),
  • -3 + (2 - sd3neg) / (sd2neg - sd3neg)

Zelos - i'll make a commit to update this and please check you agree

— Reply to this email directly, view it on GitHub https://github.com/pharmaverse/admiralpeds/pull/67#discussion_r1682414355, or unsubscribe https://github.com/notifications/unsubscribe-auth/AY2XGPUO3VPDZDET7KSM5I3ZM52Z5AVCNFSM6AAAAABKQD7VICVHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMZDCOBVGA3TAMZQGE . You are receiving this because you were mentioned.Message ID: @.***>

rossfarrugia commented 4 months ago

These L and S values are not hard-coded into the program, are they? I'm hoping this is simply an example for one child. When you calculate SD2, SD2, etc., you've got to use the WHO L and S values for a child of that specific sex and age.

@CDC-DNPAO that one's a unit test program. It's not part of the function code, it is just used to test the results are reliable, so we feed in test data values and you check that the actual outcomes match what we expect.

CDC-DNPAO commented 4 months ago

Sorry, I omitted the hyphen in my search of the WHO document.

But the sentences following your png PDF say:

'LMS method fits skewed data adequately by using a Box-Cox normal distribution, which follows the empirical data closely. The drawback, however, is that the outer tails of the distribution are highly affected by extreme data points even if only very few (e.g. less than 1%). A restricted application of the LMS method was thus used for the construction of the WHO weight-based indicators, limiting the Box-Cox normal distribution to the interval corresponding to z-scores where empirical data were available (i.e. between -3 SD and 3 SD).'

I would change your 'account for right-skewness' to 'who_adjustment weight-based calculations'. I agree that users don't need to know my version of the history of this.

david

On Fri, Jul 19, 2024 at 10:43 AM Zelos Zhu @.***> wrote:

@.**** commented on this pull request.

In R/derive_params_growth_age.R https://github.com/pharmaverse/admiralpeds/pull/67#discussion_r1684490143 :

@@ -72,6 +72,12 @@

' CDC developed extended percentiles (>95%) to monitor high BMI values,

' if set to TRUE the CDC's correction is applied.

'

+#' @param who_correction Right skew correction +#' +#' A logical scalar, e.g. TRUE/FALSE is expected. +#' WHO developed a modification to the z-score calculation to accommodate for right-skewness +#' in certain data, if set to TRUE the WHO correction is applied.

I saw the phrase right-skewness somewhere on pg 302 before the formula, image.png (view on web) https://github.com/user-attachments/assets/4948f9a3-bf0e-4a1c-8f3f-5015749f5878

I'll remove any instances of the word "right skewnesss" and simply call it the "WHO adjustment" I don't think it's necessary for users to know the history of the decision made

— Reply to this email directly, view it on GitHub https://github.com/pharmaverse/admiralpeds/pull/67#discussion_r1684490143, or unsubscribe https://github.com/notifications/unsubscribe-auth/AY2XGPUZ6UGAFVLL63WADDLZNEQZZAVCNFSM6AAAAABKQD7VICVHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMZDCOBYGQYTKNZZG4 . You are receiving this because you were mentioned.Message ID: @.***>