Closed rossfarrugia closed 3 months ago
From the above, we understood that we should create 5 metadata sets in total from the 15 datasets, as follows:
By age, containing:
a) _bmi_byage: BMI-related data (WHO_bmi_for_age_boys.rda, WHO_bmi_for_age_girls.rda, cdc_bmiage.rda) + variable SEX (converting 1 to "M" / 2 to "F" for cdc data, and adding "M" for who_boys data / "F" for who_girls data) + variable AGE (in days)
b) _height_byage: HEIGHT-related data (who_lgth_ht_for_age_boys.rda, who_lght_ht_for_age_girls.rda, cdc_htage.rda) + variable SEX (converting 1 to "M" / 2 to "F" for cdc data, and adding "M" for who_boys data / "F" for who_girls data) + variable AGE (in days)
c) _weight_byage: WEIGHT-related data (who_wt_for_age_boys.rda, who_wt_for_age_girls.rda, cdc_wtage.rda) + variable SEX (converting 1 to "M" / 2 to "F" for cdc data, and adding "M" for who_boys data / "F" for who_girls data) + variable AGE (in days)
WHO: HC and WT only:
a) _whohc: HEAD CIRCUMFERENCE-related data (who_hc_for_age_boys.rda, who_hc_for_age_girls.rda), which include the variable AGE in days + variable SEX (adding "M" for who_boys data / "F" for who_girls data)
b) _who_wt_lgt: WEIGHT/HEIGHT-related data (who_wt_for_ht_boys.rda, who_wt_for_ht_girls.rda, who_wt_for_lgth_boys.rda, who_wt_for_lgth_girls.rda) + variable MEASURE set to "HEIGHT" for wt_for_ht data and to "LENGTH" for wt_for_lgth data + variable SEX ("M" for boys data / "F" for girls data) + variables "Height" and "Length" renamed to the single variable "HEIGHT_LENGTH"
Could you please confirm our understanding?
Thank you.
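As a rough illustration of item b) under "WHO: HC and WT only", the four weight-for-height/length datasets could be stacked along these lines. This is only a sketch in base R: the toy data frames stand in for the real .rda files, their values are placeholders rather than WHO reference values, and the L/M/S column names, the helper prep_wt_for() and the output name who_wt_for_ht_lgth are assumptions made for the example.

```r
# Toy stand-ins for the four WHO datasets (placeholder values only).
# The L/M/S column names are an assumption about the source structure.
who_wt_for_ht_boys    <- data.frame(Height = c(65, 65.5), L = c(1, 1), M = c(10, 11), S = c(0.1, 0.1))
who_wt_for_ht_girls   <- data.frame(Height = c(65, 65.5), L = c(1, 1), M = c(9, 10),  S = c(0.1, 0.1))
who_wt_for_lgth_boys  <- data.frame(Length = c(45, 45.5), L = c(1, 1), M = c(2, 3),   S = c(0.1, 0.1))
who_wt_for_lgth_girls <- data.frame(Length = c(45, 45.5), L = c(1, 1), M = c(2, 3),   S = c(0.1, 0.1))

# Hypothetical helper: rename the Height/Length column to HEIGHT_LENGTH
# and add SEX and MEASURE as constant columns.
prep_wt_for <- function(data, sex, measure, size_col) {
  names(data)[names(data) == size_col] <- "HEIGHT_LENGTH"
  data.frame(SEX = sex, MEASURE = measure, data, stringsAsFactors = FALSE)
}

# Stack all four pieces into one consistently structured meta dataset.
who_wt_for_ht_lgth <- rbind(
  prep_wt_for(who_wt_for_ht_boys,    "M", "HEIGHT", "Height"),
  prep_wt_for(who_wt_for_ht_girls,   "F", "HEIGHT", "Height"),
  prep_wt_for(who_wt_for_lgth_boys,  "M", "LENGTH", "Length"),
  prep_wt_for(who_wt_for_lgth_girls, "F", "LENGTH", "Length")
)
```

The same pattern, without MEASURE and the rename, would cover the head circumference data.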
@Fanny-Gautier yes, that's right, and some additional points:
Use _for_ instead of _by_ in the dataset names, as that's how Pierre set up the WHO metafiles in our package.
This should be done in a file named inst/templates/ad_advs.R, with the earlier generic ADVS steps copied from https://github.com/pharmaverse/admiral/blob/main/inst/templates/ad_advs.R as per the outline doc on MS Teams.
Is 2 years <731 days or <=729? In the ADVS specs we talk about <731 days... 30.4375 x 24 = 730.5. Knowing that it is strictly under 2 years old, I would have kept up to and including 730 days. Please confirm.
Converting months to days gives 730.5. So for WHO data we would keep up to and including 729 days, and for CDC data it starts from 730.5 days.
Is this correct? If so, could you please clarify the ADVS specs where <731 days is mentioned? Thanks.
@Fanny-Gautier to account for leap years we usually assume a year is 365.25 days, so that means we should include up to 730 days from WHO and 731 days onwards from CDC. Sorry for the confusion above with the mention of 729.
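The arithmetic behind that cutoff can be checked with a few lines of base R. This is only an illustrative sketch; the variable names are made up for the example.

```r
# Average year length allowing for leap years, as assumed in the thread.
days_per_year  <- 365.25
days_per_month <- days_per_year / 12   # 30.4375
cutoff_days    <- 24 * days_per_month  # 730.5, i.e. 2 years

# For whole-day ages, "< 730.5" keeps up to 730 days on the WHO side,
# and ">= 730.5" starts the CDC side at 731 days.
age_days   <- c(729, 730, 731)
ref_source <- ifelse(age_days < cutoff_days, "WHO", "CDC")
print(data.frame(age_days, ref_source))
```

Because ages in days are whole numbers, comparing against 730.5 is equivalent to the "<731 days" wording in the ADVS specs.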
ADVS template: could you please guide us on merging the ADSL variables (TRTSDT), since these are not the same patients as in dm_peds? I am not sure I understand how to proceed with step 3 (computing AAGECUR) and with merging the ADSL variables from step 1b.
@Fanny-Gautier I spotted the same today (see discussion at https://github.com/pharmaverse/pharmaversesdtm/issues/88 ). For now just use dm_peds directly (and ignore the mention of ADSL). We might need to come back to this later and update the test data and run a new ADSL from it to use, as we usually try to extend the same CDISC test pilot study data as in {pharmaversesdtm}.
To compute AAGECUR, do we impute the partial birth dates? If so, which approach is more appropriate for pediatric patients? Is "birth date not before a list of minimum dates" OK? Also, for a partial analysis date, if any, do we impute to the first day of the month?
Regarding birth date imputation, I am not sure why BRTHDT is not found even though it is merged into VS as the first step:
I wouldn't impute partial birth dates for the template - let's keep this example simple. I saw the test data has one such case but think we could update this later to make all dates complete. I'd expect most trials have complete birth dates and I wouldn't want to estimate what the appropriate imputation rule might be if they didn't.
For the issue with the code, feel free to email me the full code you're running and I could try and figure out on my side. I don't see anything obvious from your screenshot.
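With complete dates, as suggested above, the current age could be derived as a plain date difference. The following is a minimal base R sketch, not the template code: it assumes AAGECUR is expressed in days, that BRTHDT and ADT are already Date variables on the same record, and it uses made-up subject data.

```r
# Made-up example records; BRTHDT has already been merged onto VS.
vs <- data.frame(
  USUBJID = c("01", "01"),
  BRTHDT  = as.Date(c("2020-01-15", "2020-01-15")),
  ADT     = as.Date(c("2020-01-15", "2021-01-15")),
  stringsAsFactors = FALSE
)

# Assumption: AAGECUR is the age in days at the analysis date.
# Subtracting two Date vectors gives a difference in days.
vs$AAGECUR <- as.numeric(vs$ADT - vs$BRTHDT)
print(vs)
```

A missing (unimputed) birth date would simply give a missing AAGECUR here, which matches the intention of keeping the example simple.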
Delete line of code mutate(AVALC = VSSTRESC) as per Nancy's comment
See the outline document in MS Teams area. Steps 5 and 6 rely on the new functions being available so placeholders will be needed for now.
The main part will be Step 4 for preparing the metadata into a consistent structure covering the below points:
PROPOSAL FOR BY AGE: In this step, for each of BMI/HEIGHT/WEIGHT, we would combine WHO <2 yrs and CDC >=2 yrs data into one single consistently structured meta dataset (with SEX as a variable - with M and F values, and AGE in days via a conversion factor). Clearly comment to make clear here our default reference source, in our case WHO <2 yrs and CDC >=2 yrs, and the fact that any user could replace this with their own chosen metadata. We can then explain this more in the vignette.
PROPOSAL FOR THE WHO-ONLY DATA: In this step, for each of HEAD CIRCUMFERENCE and WEIGHT (BY LENGTH/HEIGHT), we would combine the boys and girls data into one single consistently structured meta dataset (with SEX as a variable - with M and F values). For WEIGHT (BY LENGTH/HEIGHT), additionally combine the LENGTH and HEIGHT data into one single dataset, with an additional variable MEASURE to denote LENGTH or HEIGHT and a HEIGHT_LENGTH variable showing the length/height values.
The naming of these meta datasets should still clearly convey which parameter they each relate to, so there is no need to add PARAMCD as a variable in these datasets.
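The by-age proposal could be sketched as follows for BMI, using base R only. This is an illustration under several assumptions: the toy data frames stand in for the real WHO and CDC files with placeholder values, the WHO age column is assumed to be called Day (in days), the CDC data are assumed to carry SEX coded 1/2 and AGE in months, and the L/M/S column names and the output name bmi_for_age are chosen for the example.

```r
# Conversion factor and 2-year cutoff (365.25-day year, as discussed above).
days_per_month <- 365.25 / 12
cutoff_days    <- 24 * days_per_month  # 730.5

# Toy stand-ins for the source data (placeholder values only).
who_bmi_boys  <- data.frame(Day = c(0, 730, 731), L = c(1, 1, 1), M = c(13, 16, 16), S = c(0.1, 0.1, 0.1))
who_bmi_girls <- data.frame(Day = c(0, 730, 731), L = c(1, 1, 1), M = c(13, 16, 16), S = c(0.1, 0.1, 0.1))
cdc_bmiage    <- data.frame(SEX = c(1, 1, 2, 2), AGE = c(23.5, 24, 23.5, 24),
                            L = c(1, 1, 1, 1), M = c(16, 16, 16, 16), S = c(0.1, 0.1, 0.1, 0.1))

# WHO: add SEX, expose age in days as AGE, stack boys and girls.
who <- rbind(
  data.frame(SEX = "M", AGE = who_bmi_boys$Day,  who_bmi_boys[c("L", "M", "S")],  stringsAsFactors = FALSE),
  data.frame(SEX = "F", AGE = who_bmi_girls$Day, who_bmi_girls[c("L", "M", "S")], stringsAsFactors = FALSE)
)

# CDC: recode SEX 1/2 to "M"/"F" and convert AGE from months to days.
cdc <- data.frame(
  SEX = ifelse(cdc_bmiage$SEX == 1, "M", "F"),
  AGE = cdc_bmiage$AGE * days_per_month,
  cdc_bmiage[c("L", "M", "S")],
  stringsAsFactors = FALSE
)

# Default reference source: WHO for <2 years, CDC for >=2 years.
# Users could replace this with their own chosen metadata.
bmi_for_age <- rbind(who[who$AGE < cutoff_days, ], cdc[cdc$AGE >= cutoff_days, ])
rownames(bmi_for_age) <- NULL
```

The HEIGHT and WEIGHT meta datasets would follow the same pattern with their respective source files.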