Closed rammprasad closed 1 month ago
As per discussion with @rammprasad, I will instead implement a separate function to mark records in tibbles for filtering. The new function will be condition_by().
Shall we name the function add_cond() or add_condition()?
The example will look like below.
Example If conditions
If [AESOS.AESO] == 1 and [AESOS.AESOSP] is null then hardcode OE.OEORRES = 'Y'
AESOS is the raw dataset and AESO, AESOSP are variables in the raw dataset. OE is the target domain and OEORRES is the target variable.
hardcode_no_ct(
  raw_dat = add_cond(AESOS, AESO == 1 & is.na(AESOSP)),
  raw_var = AESO,
  tgt_var = OEORRES,
  tgt_val = "Y",
  tgt_dat = OE_INTER,
  id_vars = oak_id_vars()
)
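To make the idea of "marking records" concrete, here is a minimal base R sketch of what such a function could do. This is only an illustration under assumptions, not the planned implementation: the attribute name "cond", the class name "cond_df", and the choice to count NA as "condition not met" are all hypothetical.

```r
# Hypothetical sketch: mark the records of a data frame that meet a condition.
# The attribute name "cond" and the class "cond_df" are illustrative only.
add_cond <- function(dat, cond) {
  # Capture the unevaluated condition and evaluate it with the columns of
  # `dat` in scope; other names are looked up in the caller's environment.
  expr <- substitute(cond)
  keep <- eval(expr, envir = dat, enclos = parent.frame())
  stopifnot(is.logical(keep), length(keep) == nrow(dat))
  # Assumption: records where the condition evaluates to NA are not marked.
  keep[is.na(keep)] <- FALSE
  attr(dat, "cond") <- keep
  class(dat) <- c("cond_df", class(dat))
  dat
}

# Made-up raw data following the AESOS example
AESOS <- data.frame(
  oak_id = 1:4,
  AESO   = c(1, 1, 0, NA),
  AESOSP = c(NA, "Pain", NA, NA),
  stringsAsFactors = FALSE
)

marked <- add_cond(AESOS, AESO == 1 & is.na(AESOSP))
attr(marked, "cond")  # only the first record is marked
```

A mapping function could then read the stored logical vector and apply the mapping only to the marked records, leaving the data itself unfiltered.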
If [AESOS.AESO] == 1 and [AESOS.AESOSP] is null then hardcode OE.OETESTCD = 'IOISYMPO'
hardcode_ct(
  raw_dat = add_cond(AESOS, AESO == 1 & is.na(AESOSP)),
  raw_var = AETERM,
  tgt_var = OETESTCD,
  tgt_val = 'IOISYMPO',
  ct_spec = study_ct,
  ct_clst = "C123456",
  id_vars = oak_id_vars()
)
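Since these conditions are meant to be evaluated per record, it may help to recall how the base R operators behave on whole columns. A small check with made-up values (the vectors below are illustrative, not study data):

```r
AESO   <- c(1, 1, 0)
AESOSP <- c(NA, "Pain", NA)

# is.null() tests the object as a whole, so a column always gives one FALSE
whole_object <- is.null(AESOSP)

# is.na() tests each element, which is what a per-record "is null" check needs
per_record <- is.na(AESOSP)

# `&` combines conditions element-wise and returns one value per record
both <- AESO == 1 & is.na(AESOSP)
```

The scalar operators `&&` and `||` expect length-one operands (and raise an error for longer vectors from R 4.3.0 onwards), so `&` and `|` are the safer choice inside a per-record condition.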
If VS.VSTESTCD = 'TEMP', assign the value collected in VTLS1.TEMPLOC to VS.VSLOC.
VTLS1 is the raw dataset name and TEMPLOC is a variable in the raw dataset. VS is the target domain and VSLOC is derived.
# when using in-pipe
|>
  assign_ct(
    raw_dat = VTLS1,
    raw_var = "TEMPLOC",
    tgt_var = "VSLOC",
    ct_spec = study_ct,
    ct_clst = "C12123431",
    tgt_dat = add_cond(.data, VSTESTCD == "TEMP"),
    raw_filter = NULL,
    id_vars = oak_id_vars()
  )
If [AECOV19.SPECTYP] is not null, and FA.FATESTCD = 'STATUS' and FA.FAOBJ = 'Severe Acute Resp Syndrome Coronavirus 2', then assign the value collected in SPCNM to FA.FASPEC.
In this example, AECOV19 is the raw dataset name, and SPECTYP is a variable in the raw dataset. The condition also involves the target domain FA. FAOBJ and FATESTCD are previously derived SDTM variables, and FASPEC is the SDTM variable that is currently derived.
# when using in-pipe
|>
  assign_ct(
    raw_dat = add_cond(AECOV19, !is.na(SPECTYP)),
    raw_var = "SPCNM",
    tgt_var = "FASPEC",
    ct_spec = study_ct,
    ct_clst = "C1212121",
    tgt_dat = add_cond(.data, FATESTCD == "STATUS" & FAOBJ == "Severe Acute Resp Syndrome Coronavirus 2"),
    id_vars = oak_id_vars()
  )
We may not be able to support this. Take a look and let me know @ramiromagno
MH.MHLOC when MH.MHTERM = [GCAHX.NCITERM] or [GCAHX.NCITERMO]
Map the value collected in the raw variable locat of the raw dataset GCAHX to MH.MHLOC when this condition is met: MH.MHTERM = [GCAHX.NCITERM] or [GCAHX.NCITERMO]
# when using in-pipe
|>
  assign_ct(
    raw_dat = GCAHX,
    raw_var = "SPCNM",
    tgt_var = "FASPEC",
    ct_spec = study_ct,
    ct_clst = "C1212121",
    tgt_dat = add_cond(.data, MHTERM %in% GCAHX$NCITERM | MHTERM %in% GCAHX$NCITERMO),
    id_vars = oak_id_vars()
  )
To help understand that use case involving variables of raw_dat and tgt_dat in the same condition, could you share how you currently do it with roak's if_then_else() interface?
What should happen if the condition results in NA?
To help understand that use case involving variables of raw_dat and tgt_dat in the same condition, could you share how you currently do it with roak's if_then_else() interface?
{roak} processes it very differently, and it is driven by metadata. The main branch has an example. Please refer to the example mapping CMMODIFY: the annotation text "If different to CM.CMTRT, then CM.CMMODIFY" means the mapping will happen if the value in the collected column CMMODIFY is different from CMTRT. It is carried out using the spec parameters condition_left, condition_right, and condition_operator. {roak} reads these and processes the logical condition. It is a bit confusing because, at the moment, the variable is named CMMODIFY both in the raw dataset and in the target domain CM. I will change it in the raw dataset.
A mock of automation of this in {roak} will look like
# Derive qualifier CMMODIFY. Annotation text = If different to CM.CMTRT then CM.CMMODIFY
if_then_else(
  raw_dat = cm_raw,
  raw_var = CMMODIFY,
  condition_left_raw_dataset = cm_raw,
  condition_left_raw_variable = CMMODIFY,
  condition_operator = "different_to",
  condition_right_sdtm_variable_domain = CM,
  condition_right_sdtm_variable = CMTRT,
  sub_algorithm = assign_no_ct,
  tgt_var = CMDOSETXT,
  id_vars = oak_id_vars()
) |>
Can we do something like this in {sdtm.oak}, where filtering needs to happen based on a condition involving both raw_dat and tgt_dat?
# Derive qualifier CMMODIFY. Annotation text: if the collected value in CMMODIFY in cm_raw is different to CM.CMTRT, then
# assign the collected value to CMMODIFY in the CM domain (CM.CMMODIFY)
assign_no_ct(
  raw_dat = cm_raw,
  raw_var = "CMMODIFY",
  add_cond = (cm_raw$CMMODIFY != .data$CMTRT),
  tgt_var = "CMMODIFY",
  id_vars = oak_id_vars()
)
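For discussion, one possible way to evaluate a condition that spans both datasets is to first line up the raw and target records by their id variables and then compare the columns row by row. The sketch below uses base R only. The single oak_id key, the renamed column CMMODIFY_RAW, and the data values are all assumptions made for illustration, and this is not how {roak} or {sdtm.oak} does it.

```r
# Illustrative data; `oak_id` stands in for the id_vars that link the records.
cm_raw <- data.frame(
  oak_id   = 1:3,
  CMMODIFY = c("ASPIRIN", "PARACETAMOL", NA),
  stringsAsFactors = FALSE
)
CM <- data.frame(
  oak_id = 1:3,
  CMTRT  = c("ASPIRIN", "TYLENOL", "IBUPROFEN"),
  stringsAsFactors = FALSE
)

# Rename the raw column so it cannot be confused with the target CM.CMMODIFY
names(cm_raw)[names(cm_raw) == "CMMODIFY"] <- "CMMODIFY_RAW"

# Align raw and target records on the id variable
joined <- merge(CM, cm_raw, by = "oak_id", all.x = TRUE)

# Condition across both datasets: collected value present and different to CMTRT
differs <- !is.na(joined$CMMODIFY_RAW) & joined$CMMODIFY_RAW != joined$CMTRT

# Assign the collected value only where the condition is met
joined$CMMODIFY <- ifelse(differs, joined$CMMODIFY_RAW, NA_character_)
```

With these made-up values, only the second record receives a CMMODIFY value.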
What should happen if the condition results in NA?
If no records match the criteria, we create the tgt_var as an empty column.
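A small base R sketch of that behaviour, for illustration only. Counting an NA condition as "not met" is an assumption here, since the reply covers only the case where no records match. The data values are made up.

```r
# Illustrative data following the VS example in this thread
VS <- data.frame(
  oak_id   = 1:3,
  VSTESTCD = c("PULSE", NA, "SYSBP"),
  stringsAsFactors = FALSE
)
TEMPLOC <- c("ORAL", "AXILLA", "ORAL")  # collected values, one per record

cond <- VS$VSTESTCD == "TEMP"  # evaluates to FALSE, NA, FALSE
met  <- !is.na(cond) & cond    # assumption: NA counts as "not met"

# Start from an empty (all-NA) target column and fill only the matching records
VS$VSLOC <- NA_character_
VS$VSLOC[met] <- TEMPLOC[met]
```

Because no record meets the condition, VSLOC is still created, but every value in it is NA.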
Preferred option to handle a complex if condition:
|>
  assign_ct(
    raw_dat = GCAHX,
    raw_var = "SPCNM",
    tgt_var = "FASPEC",
    ct_spec = study_ct,
    ct_clst = "C1212121",
    tgt_dat = add_cond(.data$MHTERM %in% GCAHX$NCITERM | .data$MHTERM %in% GCAHX$NCITERMO),
    id_vars = oak_id_vars()
  )
Feature Idea
Introduce functionality to filter the raw and target datasets while performing a mapping.
Example If conditions
If [AESOS.AESO] == 1 and [AESOS.AESOSP] is null then hardcode OE.OEORRES = 'Y'
AESOS is the raw dataset and AESO, AESOSP are variables in the raw dataset. OE is the target domain and OEORRES is the target variable.
If [AESOS.AESO] == 1 and [AESOS.AESOSP] is null then hardcode OE.OETESTCD = 'IOISYMPO'
If VS.VSTESTCD = 'TEMP', assign the value collected in VTLS1.TEMPLOC to VS.VSLOC.
VTLS1 is the raw dataset name and TEMPLOC is a variable in the raw dataset. VS is the target domain and VSLOC is derived.
Involving raw_dat and tgt_dat but separate conditions
If [AECOV19.SPECTYP] is not null, and FA.FATESTCD = 'STATUS' and FA.FAOBJ = 'Severe Acute Resp Syndrome Coronavirus 2', then assign the value collected in SPCNM to FA.FASPEC.
In this example, AECOV19 is the raw dataset name, and SPECTYP is a variable in the raw dataset. The condition also involves the target domain FA. FAOBJ and FATESTCD are previously derived SDTM variables, and FASPEC is the SDTM variable that is currently derived.
Involving raw_dat and tgt_dat in the same condition
We may not be able to support this.
MH.MHLOC when MH.MHTERM = [GCAHX.NCITERM] or [GCAHX.NCITERMO]