An EDC and Data Standard agnostic SDTM data transformation engine that automates the transformation of raw clinical data in ODM format to SDTM based on standard mapping algorithms
Relevant Input
sdtm spec
| Field | Row 1 | Row 2 | Row 3 |
| --- | --- | --- | --- |
| study_number | lp_study | lp_study | lp_study |
| raw_source_model | e-CRF | e-CRF | e-CRF |
| raw_dataset | MD1 | MD1 | MD1 |
| raw_dataset_ordinal | 27 | 27 | 27 |
| raw_dataset_label | Concomitant Medications | Concomitant Medications | Concomitant Medications |
| raw_variable | MDBDR | MDEDR | MDETM |
| raw_variable_label | Start date | End date | End time |
| raw_variable_ordinal | 5 | 10 | 11 |
| raw_variable_type | DateTime | DateTime | DateTime |
| raw_data_format | dd- MMM- yyyy | dd- MMM- yyyy | HH nn |
| raw_codelist | NA | NA | NA |
| study_specific | FALSE | FALSE | FALSE |
| annotation_ordinal | 1 | 1 | 1 |
| mapping_is_dataset | FALSE | FALSE | FALSE |
| annotation_text | CM.CMSTDTC | CM.CMENDTC | CM.CMENDTC |
| target_sdtm_domain | CM | CM | CM |
| target_sdtm_variable | CMSTDTC | CMENDTC | CMENDTC |
| target_sdtm_variable_role | Timing Variable | Timing Variable | Timing Variable |
| target_sdtm_variable_codelist_code | NA | NA | NA |
| target_sdtm_variable_controlled_terms_or_format | ISO 8601 | ISO 8601 | ISO 8601 |
| target_sdtm_variable_ordinal | 39 | 40 | 40 |
| origin | CRF | CRF | CRF |
| mapping_algorithm | ASSIGN_NO_CT | ASSIGN_NO_CT | ASSIGN_NO_CT |
| entity_sub_algorithm | NA | NA | NA |
| target_hardcoded_value | NA | NA | NA |
| target_term_value | NA | NA | NA |
| target_term_code | NA | NA | NA |
| condition_ordinal | NA | NA | NA |
| condition_group_ordinal | NA | NA | NA |
| condition_left_raw_dataset | NA | NA | NA |
| condition_left_raw_variable | NA | NA | NA |
| condition_left_sdtm_domain | NA | NA | NA |
| condition_left_sdtm_variable | NA | NA | NA |
| condition_operator | NA | NA | NA |
| condition_right_text_value | NA | NA | NA |
| condition_right_sdtm_domain | NA | NA | NA |
| condition_right_sdtm_variable | NA | NA | NA |
| condition_right_raw_dataset | NA | NA | NA |
| condition_right_raw_variable | NA | NA | NA |
| condition_next_logical_operator | NA | NA | NA |
| merge_type | NA | NA | NA |
| merge_left | NA | NA | NA |
| merge_right | NA | NA | NA |
| merge_condition | NA | NA | NA |
| unduplicate_keys | NA | NA | NA |
| groupby_keys | NA | NA | NA |
| target_resource_raw_dataset | NA | NA | NA |
| target_resource_raw_variable | NA | NA | NA |
raw_dataset = MD1

| oak_id | raw_source | patient_number | MDBDR | MDEDR | MDETM |
| --- | --- | --- | --- | --- | --- |
| 1 | MD1 | 375 | NA | NA | NA |
| 2 | MD1 | 375 | 15-Sep-20 | NA | NA |
| 3 | MD1 | 376 | 17-Feb-21 | 17-Feb-21 | NA |
| 4 | MD1 | 377 | 4-Oct-20 | NA | NA |
| 5 | MD1 | 377 | 20-Jan-20 | 20-Jan-20 | 10:00:00 |
| 6 | MD1 | 377 | UN-UNK-2019 | UN-UNK-2019 | NA |
| 7 | MD1 | 377 | 20-UNK-2019 | 20-UNK-2019 | NA |
| 8 | MD1 | 378 | UN-UNK-2020 | UN-UNK-2020 | NA |
| 9 | MD1 | 378 | 26-Jan-20 | 26-Jan-20 | 07:00:00 |
| 10 | MD1 | 378 | 28-Jan-20 | 1-Feb-20 | NA |
| 11 | MD1 | 378 | 12-Feb-20 | 18-Feb-20 | NA |
| 12 | MD1 | 379 | 10-UNK-2020 | 20-UNK-2020 | NA |
| 13 | MD1 | 379 | NA | NA | NA |
| 14 | MD1 | 379 | NA | 17-Feb-20 | NA |
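In the spec, both MDEDR (End date) and MDETM (End time) target CM.CMENDTC, so the date and the time have to end up in a single ISO 8601 value. The following is a minimal base R sketch of that combination, not the package implementation. The helper name `combine_date_time` is hypothetical, two-digit years are assumed to fall in the 2000s, the raw time is kept as collected, and dates with unknown components are simply returned as NA here.

```r
# Hypothetical helper (not part of any package): combine a raw "d-Mon-y"
# date and a raw "HH:MM:SS" time into one ISO 8601 string.
combine_date_time <- function(date_raw, time_raw) {
  parts <- strsplit(as.character(date_raw), "-", fixed = TRUE)
  iso_date <- vapply(parts, function(p) {
    if (length(p) != 3 || anyNA(p)) return(NA_character_)
    day <- suppressWarnings(as.integer(p[1]))
    month <- match(toupper(p[2]), toupper(month.abb))
    year <- suppressWarnings(as.integer(p[3]))
    # Unknown components (for example "UNK") are out of scope for this sketch.
    if (anyNA(c(day, month, year))) return(NA_character_)
    # Assumption: two-digit years belong to the 2000s.
    if (year < 100) year <- 2000L + year
    sprintf("%04d-%02d-%02d", year, month, day)
  }, character(1))

  # Append the time only where both the date and the time are present.
  has_time <- !is.na(iso_date) & !is.na(time_raw)
  iso_date[has_time] <- paste0(iso_date[has_time], "T", time_raw[has_time])
  iso_date
}

# Values taken from rows 5, 10 and 13 of the MD1 table.
cmendtc <- combine_date_time(
  c("20-Jan-20", "1-Feb-20", NA),
  c("10:00:00", NA, NA)
)
print(cmendtc)
# "2020-01-20T10:00:00" "2020-02-01" NA
```

A date without a time stays a plain date, which keeps the result no more precise than what was collected.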
raw_var = "MDBDR"
raw_fmt = "d-m-y"
raw_unk = c("UN", "UNK")
tgt_var = "CMSTDTC"
tgt_dat = cm_inter - Let's assume the CMTRT and CMINDC variables are already derived and CMSTDTC is the third variable being processed.

| oak_id | raw_source | patient_number | CMTRT | CMINDC |
| --- | --- | --- | --- | --- |
| 1 | MD1 | 375 | BABY ASPIRIN | NA |
| 2 | MD1 | 375 | CORTISPORIN | NAUSEA |
| 3 | MD1 | 376 | ASPIRIN | ANEMIA |
| 4 | MD1 | 377 | DIPHENHYDRAMINE HCL | NAUSEA |
| 5 | MD1 | 377 | PARCETEMOL | PYREXIA |
| 6 | MD1 | 377 | VOMIKIND | VOMITINGS |
| 7 | MD1 | 377 | ZENFLOX OZ | DIARHHEA |
| 8 | MD1 | 378 | AMITRYPTYLINE | COLD |
| 9 | MD1 | 378 | BENADRYL | FEVER |
| 10 | MD1 | 378 | DIPHENHYDRAMINE HYDROCHLORIDE | LEG PAIN |
| 11 | MD1 | 378 | TETRACYCLINE | FEVER |
| 12 | MD1 | 379 | BENADRYL | COLD |
| 13 | MD1 | 379 | SOMINEX | COLD |
| 14 | MD1 | 379 | ZQUILL | PAIN |
id_vars = oak_id_vars()
Feature Idea
The assign_datetime algorithm will be implemented as a function. As described in the documentation, it will be used to assign a value.
Algorithm Description - One-to-one mapping of the raw date and/or time source to a target SDTM variable in ISO 8601 format.
Example mappings - CM.CMSTDTC, AE.AEENDTC
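To make the mapping concrete, here is a minimal base R sketch of converting raw "d-m-y" values, including unknown components, to ISO 8601. It is an illustration under stated assumptions, not the proposed implementation: the helper name `to_iso8601_date` is hypothetical, two-digit years are assumed to fall in the 2000s, and an unknown component that sits between known ones is written as a single hyphen.

```r
# Hypothetical helper (not part of any package): convert raw "d-Mon-y"
# strings, possibly with unknown components, to ISO 8601.
to_iso8601_date <- function(x, unk = c("UN", "UNK")) {
  vapply(x, function(value) {
    if (is.na(value)) return(NA_character_)
    parts <- strsplit(value, "-", fixed = TRUE)[[1]]
    if (length(parts) != 3) return(NA_character_)
    parts[toupper(parts) %in% unk] <- NA  # mask unknown components

    day <- suppressWarnings(as.integer(parts[1]))
    month <- match(toupper(parts[2]), toupper(month.abb))
    year <- suppressWarnings(as.integer(parts[3]))
    # Assumption: two-digit years belong to the 2000s.
    if (!is.na(year) && year < 100) year <- 2000L + year

    iso <- c(
      if (is.na(year)) "-" else sprintf("%04d", year),
      if (is.na(month)) "-" else sprintf("%02d", month),
      if (is.na(day)) "-" else sprintf("%02d", day)
    )
    # Drop trailing unknown components, so a year-only date becomes "2019".
    while (length(iso) > 0 && iso[length(iso)] == "-") {
      iso <- iso[-length(iso)]
    }
    if (length(iso) == 0) NA_character_ else paste(iso, collapse = "-")
  }, character(1), USE.NAMES = FALSE)
}

print(to_iso8601_date(c("15-Sep-20", "UN-UNK-2019", "20-UNK-2019")))
# "2020-09-15" "2019" "2019---20"
```

Truncating from the right keeps the result only as precise as the collected value, while a missing month with a known day keeps its position through the placeholder hyphen.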
function call
Input:
raw_dat - The raw dataset.
raw_var - The raw variable.
raw_fmt - Format of the raw variable.
raw_unk - Format of the unknown date and time components.
tgt_var - The target SDTM variable.
tgt_dat - Optional parameter. This is the target dataset that was created in the previous step.
id_vars - Optional parameter. A vector of the variable names that will be used to merge into the target dataset.
Output:
A dataframe with oak_id_vars and target_sdtm_variable if tgt_dat & id_vars are not provided.
Otherwise, tgt_dat with one additional variable, target_sdtm_variable.
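The two output shapes can be sketched in base R as follows. This is a hedged illustration only: `assign_datetime_sketch` is a hypothetical name, the default `id_vars` is assumed from the identifier columns shown in the example datasets, and the date conversion is replaced by a stand-in `convert` argument (identity by default) so that the sketch shows only the merge behaviour.

```r
# Hypothetical sketch of the two output modes; not the package function.
assign_datetime_sketch <- function(raw_dat, raw_var, tgt_var,
                                   tgt_dat = NULL,
                                   id_vars = c("oak_id", "raw_source",
                                               "patient_number"),
                                   convert = identity) {
  # Keep the identifier columns and add the derived target variable.
  derived <- raw_dat[, id_vars, drop = FALSE]
  derived[[tgt_var]] <- convert(raw_dat[[raw_var]])

  # Mode 1: no target dataset supplied, return ids plus the new variable.
  if (is.null(tgt_dat)) return(derived)

  # Mode 2: left join onto tgt_dat so it gains exactly one new column.
  merged <- merge(tgt_dat, derived, by = id_vars, all.x = TRUE, sort = FALSE)
  ord <- do.call(order, unname(as.list(merged[id_vars])))
  merged <- merged[ord, , drop = FALSE]
  rownames(merged) <- NULL
  merged
}

# First three rows of the MD1 and cm_inter examples.
raw <- data.frame(
  oak_id = 1:3, raw_source = "MD1", patient_number = c(375, 375, 376),
  MDBDR = c(NA, "15-Sep-20", "17-Feb-21")
)
cm_inter <- data.frame(
  oak_id = 1:3, raw_source = "MD1", patient_number = c(375, 375, 376),
  CMTRT = c("BABY ASPIRIN", "CORTISPORIN", "ASPIRIN")
)

opt1 <- assign_datetime_sketch(raw, "MDBDR", "CMSTDTC")
opt2 <- assign_datetime_sketch(raw, "MDBDR", "CMSTDTC", tgt_dat = cm_inter)
print(names(opt1))  # id columns, then CMSTDTC
print(names(opt2))  # id columns, CMTRT, then CMSTDTC
```

A left join is used so that every record already in the target dataset is preserved even when the raw value is missing.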
Relevant Output
Option 1 - When used without merging and id_vars. The function call is
output dataset from the function
Option 2 - When used with merging and id_vars
Output dataset
Reproducible Example/Pseudo Code