gtxpipe is intended to support analyses that use data from multiple clinical studies. The desired behaviour is to generalize options(gtxpipe.clinical) so that, instead of pointing to a single directory containing clinical data, it accepts a vector or list of directories, each containing data from one clinical study. Assuming the user-supplied derivations can be applied within each clinical study dataset, and that genotyping data for all patients from all studies were provided as a single batch, the pipeline should handle everything else automatically, including:
Applying derivations within each clinical study dataset, then merging the results (taking care with factor levels)
Generating a source table summarizing numbers of analysed subjects by study
Open question: should the user have to explicitly specify STUDYID as a covariate in this setting?
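
The derive-then-merge step and the per-study summary table described above could look roughly like the following base R sketch. It is not gtxpipe code: the toy data frames, the USUBJID and SEX columns, and the derive() and merge_studies() helpers are all hypothetical, and it assumes every study dataset has the same columns after derivation.

```r
# Hypothetical toy data standing in for the datasets that would be read
# from each directory listed in options(gtxpipe.clinical).
study_data <- list(
  STUDY1 = data.frame(USUBJID = c("A1", "A2", "A3"),
                      SEX = factor(c("M", "F", "F")),
                      stringsAsFactors = FALSE),
  STUDY2 = data.frame(USUBJID = c("B1", "B2"),
                      SEX = factor(c("F", "U")),
                      stringsAsFactors = FALSE)
)

# Hypothetical user-supplied derivation, applied within one study at a time.
derive <- function(d) {
  d$is_female <- d$SEX == "F"
  d
}

# Apply the derivation per study, harmonize factor levels, then stack.
# Assumes all datasets share the same columns after derivation.
merge_studies <- function(datasets, derive) {
  derived <- lapply(names(datasets), function(id) {
    d <- derive(datasets[[id]])
    d$STUDYID <- id
    d
  })
  # Columns that are a factor in at least one study
  factor_cols <- unique(unlist(lapply(derived, function(d) {
    names(d)[vapply(d, is.factor, logical(1))]
  })))
  for (col in factor_cols) {
    # Union of levels across studies, so no value becomes NA on merging
    all_levels <- unique(unlist(lapply(derived, function(d) levels(d[[col]]))))
    derived <- lapply(derived, function(d) {
      d[[col]] <- factor(as.character(d[[col]]), levels = all_levels)
      d
    })
  }
  merged <- do.call(rbind, derived)
  merged$STUDYID <- factor(merged$STUDYID, levels = names(datasets))
  rownames(merged) <- NULL
  merged
}

merged <- merge_studies(study_data, derive)

# Source table: number of subjects by study
subjects_by_study <- as.data.frame(table(STUDYID = merged$STUDYID),
                                   responseName = "N")
print(subjects_by_study)
```

Keeping STUDYID as a factor in the merged data means it is available as a covariate whether or not the user is ultimately required to request it explicitly.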