pharmaverse / admiral

ADaM in R Asset Library
https://pharmaverse.github.io/admiral
Apache License 2.0
227 stars 65 forks source link

Feature Request: add keep_vars parameter in `derive_locf_records()` #1636

Closed kaz462 closed 1 year ago

kaz462 commented 1 year ago

Feature Idea

Would it be possible to specify variables that needs to be kept from "Last Observation" for new records? currently "Only variables specified in derive_locf_records(..., by_vars) will be populated in the newly created records."

Relevant Input

derive_locf_records(..., keep_vars)

Relevant Output

No response

Reproducible Example/Pseudo Code

add PARAMN, PARAM, ADT, ADY in the advs from documentation example, and populate these columns in the newly created records.

bundfussr commented 1 year ago

How would you populate variables like ADT or ADY? Would you carry forward their values from the last valid observation or should they be set to the planned date or day? What would you do if there is an observation for a certain visit but AVAL is missing? Would you keep ADT and ADY from the this observation or use ADT and ADY from the last observation with non-missing AVAL?

kaz462 commented 1 year ago

Thanks, @bundfussr for the quick response! An example from CDISC pilot data adqsadas carried forward their values (ADT/ADY) from the last valid observation, but it does not necessarily apply to other studies. What do you think of adding the option to specify variables (other than AVAL) that need to be carried over from the last valid observation (with non-missing AVAL)?

The following ones could be better examples than ADT/ADY: VISITNUM, VISIT, PARAM, PARAMN, etc

kaz462 commented 1 year ago

I created a pull request to illustrate the documentation example (I added one more column PARAMN in the example data) : if PARAMN needs to be populated, we can specify it in keep_vars

derive_locf_records_(
  data = advs,
  dataset_expected_obs = advs_expected_obsv,
  by_vars = vars(STUDYID, USUBJID, PARAMCD),
  order = vars(AVISITN, AVISIT),
  keep_vars = vars(PARAMN)
)

In the original function, PARAMN cannot be specified in by_vars, as it will be treated as a grouping variable and return unexpected results (63 records, 23 records are expected)

derive_locf_records(
  data = advs,
  dataset_expected_obs = advs_expected_obsv,
  by_vars = vars(STUDYID, USUBJID, PARAMCD, PARAMN),
  order = vars(AVISITN, AVISIT)
)
bms63 commented 1 year ago

HI @gg106046 what do you think?

gg106046 commented 1 year ago

HI @gg106046 what do you think? This makes sense. I can try adding this feature!

bundfussr commented 1 year ago

@kaz462 , yes, I think adding keep_vars makes sense.