pharmaverse / admiral

ADaM in R Asset Library
https://pharmaverse.github.io/admiral
Apache License 2.0
215 stars 60 forks source link

Reconsider presence/need of AVALC in ADaMs with only numeric parameters #2442

Closed manciniedoardo closed 1 month ago

manciniedoardo commented 2 months ago

Background Information

Originally discussed here.

According to CDISC guidance (3.3.4.2 of the ADaMIG v1.3), ADaMs with only numeric parameters should not have AVALC as a variable. Currently {admiral} does not follow this guidance in ADVS (and maybe other ADaMs too).

@nbrucken17 tagging you for awareness - thanks for flagging. @Fanny-Gautier @Lina2689 tagging you in case this is relevant for {admiralpeds}. @kaz462 tagging you as this will require an update to the {pharmaverseadam} labelling spec if any vars are removed.

Definition of Done

bms63 commented 2 months ago

Could we try and get this out for the release? Any takers @pharmaverse/admiral @pharmaverse/admiral_comm

StefanThoma commented 1 month ago

I would agree with removing AVALC from the ADVS vignette: https://pharmaverse.github.io/admiral/articles/bds_finding.html

Also, I would remove it from the template, but would leave the part commented out probably, so if there was a value where AVALC is relevant this would be easy to put back in. WDYT @manciniedoardo ?

I'd be up to do this until next week. I will of course make sure CICD passes, but not sure how you mean: "are we breaking any [...] codelists?"

bms63 commented 1 month ago

Are you able to complete this issue this week? We have a release on June 2nd.

StefanThoma commented 1 month ago

@bms63 yes I think so, depending on whether everything works out :)

StefanThoma commented 1 month ago

Notes: ADLB does not need to be adjusted, as COLOR is non numeric in AVALC ADEG does not need to be adjusted, as some values are "NORMAL" in AVALC.

However, @bms63 & @manciniedoardo do you think that AVALC should be NA where it is just a character version of AVALN? The corresponding section of ADaMIG 1.3:

Screenshot 2024-05-29 at 10 24 10

StefanThoma commented 1 month ago

I guess my question is; Should AVALC (e.g. in ADEG) look more like this:

Screenshot 2024-05-29 at 10 26 12

Or more like this: Screenshot 2024-05-29 at 10 30 23

bms63 commented 1 month ago

I guess my question is; Should AVALC (e.g. in ADEG) look more like this:

Screenshot 2024-05-29 at 10 26 12

Or more like this: Screenshot 2024-05-29 at 10 30 23

Yes this is how it should be used.

StefanThoma commented 1 month ago

I guess my question is; Should AVALC (e.g. in ADEG) look more like this:

Screenshot 2024-05-29 at 10 26 12

Or more like this: Screenshot 2024-05-29 at 10 30 23

Yes this is how it should be used.

Which one of the two? This is not clear to me given the ADaM IG.

bms63 commented 1 month ago

The second one

nbrucken17 commented 1 month ago

If you're referring to the 2 screenshots that Stefan originally included (I'm not able to see the ones in the more recent posts), neither is correct, according to the ADaMIG. The only times that both AVAL and AVALC are expected to be populated on the same record are when there is a code-decode relationship between them, or when AVAL is used to control the sort order for AVALC. I would not expect to see AVALC populated with either the formatted character representation of a numeric value or with "NA" if the analysis value is numeric- it should be left blank in those cases.

Thanks, Nancy

StefanThoma commented 1 month ago

If you're referring to the 2 screenshots that Stefan originally included (I'm not able to see the ones in the more recent posts), neither is correct, according to the ADaMIG. The only times that both AVAL and AVALC are expected to be populated on the same record are when there is a code-decode relationship between them, or when AVAL is used to control the sort order for AVALC. I would not expect to see AVALC populated with either the formatted character representation of a numeric value or with "NA" if the analysis value is numeric- it should be left blank in those cases.

Thanks, Nancy

Hi Nancy,

Thanks for checking in, I am working on this one right now. I am not sure I understand correctly: In the second screenshot I populate the AVALC variable only because there is information is EGSTRESC that is not available in EGSTRESN: i.e. when EGSTRESC is "NORMAL". All other rows are left blank, represented here as NA. What output would you expect here?

nbrucken17 commented 1 month ago

Hi Stefan,

If the "NA" are intended to represent blanks, that's fine. When EGSTRESC=NORMAL, I'd expect AVALC to be "NORMAL", and AVAL to be left missing, unless there's a need to assign a numeric code to it for sorting or additional analysis. Conversely, if EGSTRESC contains an actual numeric measurement, I'd expect AVAL to be contain the numeric result and AVALC to be left missing.

Since "NA" is often stored as an actual character value, I'm concerned that people will see this and think that they have to populate both AVAL and AVALC all the time, which is a misconception that the ADaM team has been working to dispel for many years now. However, if "NA" is R's way of representing a missing value, that's fine, but then it will probably confuse any SAS programmers looking at this. If that's the case, could it be noted somewhere?

Thanks, Nancy

StefanThoma commented 1 month ago

Hi Stefan,

If the "NA" are intended to represent blanks, that's fine. When EGSTRESC=NORMAL, I'd expect AVALC to be "NORMAL", and AVAL to be left missing, unless there's a need to assign a numeric code to it for sorting or additional analysis. Conversely, if EGSTRESC contains an actual numeric measurement, I'd expect AVAL to be contain the numeric result and AVALC to be left missing.

Since "NA" is often stored as an actual character value, I'm concerned that people will see this and think that they have to populate both AVAL and AVALC all the time, which is a misconception that the ADaM team has been working to dispel for many years now. However, if "NA" is R's way of representing a missing value, that's fine, but then it will probably confuse any SAS programmers looking at this. If that's the case, could it be noted somewhere?

Thanks, Nancy

Hi Nancy,

Thanks again for the clarification, looks like it is working as intended. In R, NA represents a missing value. Please note that this is not the same as "NA", which would be a string, but is indeed missing. I don't think that this should be noted within the scope of admiral as it relates to the way R represents missing data in general.

All the best, Stefan