Open vedhav opened 10 months ago
[Question]: Does the default_cdisc_join_keys contain exhaustive list of CDISC datasets?
No, it doesn't, and I don't think we should maintain an exhaustive list.
Analysis datasets are named using the ADXXXX convention, where the XXXX portion is sponsor-defined and created depending on the product. As CDISC continues to evolve, it would be too laborious to keep up with every new convention.
At the very least, we should cover the common ones, and at a quick glance I think we already do: https://github.com/insightsengineering/teal.data/blob/main/inst/cdisc_datasets/cdisc_datasets.yaml
@lcd2yyz Can I get your opinion on this?
@donyunardi Great explanation! I can confirm it's correct.
I actually feel we should maybe remove some from the list, because they are sponsor-defined dataset names, as opposed to common dataset names outlined in CDISC standards or ADaM implementation guides. For example: `ADAETTE`, `ADQLQC`, `ADCSSRS`, `ADEQ5D5L`.
@khatril @shajoezhu @crazycatandy @telepath37 Can I get your opinion on the suggestion to drop these sponsor-defined datasets?
I agree - we should just keep the very common ADaM datasets in our list ("defaults") and allow users to define keys on their ADXXXX datasets if they want.
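A minimal sketch of what that could look like on the user side, assuming a `teal.data` version in which `default_cdisc_join_keys` is a `join_keys` object that supports `[` subsetting and `c()`. `ADXYZ` and its key columns are hypothetical, chosen only for illustration:

```r
library(teal.data)

# Take the shipped defaults only for the common datasets in use.
jk_defaults <- default_cdisc_join_keys[c("ADSL", "ADAE")]

# Define keys for a sponsor-defined dataset ourselves.
# "ADXYZ" and its key columns are hypothetical, not part of any default list.
jk_custom <- join_keys(
  # primary key of the sponsor-defined dataset
  join_key("ADXYZ", "ADXYZ", keys = c("STUDYID", "USUBJID", "PARAMCD")),
  # relationship to the subject-level parent dataset
  join_key("ADSL", "ADXYZ", keys = c("STUDYID", "USUBJID"))
)

# Combine the defaults with the user-defined keys.
jk_all <- c(jk_defaults, jk_custom)
names(jk_all)
```

With this approach the default list can stay small, and nothing in the package needs to change when a sponsor introduces a new ADXXXX name.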
Thanks @lcd2yyz and @donyunardi
I also agree that we should trim this list and keep it minimal. The standards and implementations change all the time; if the list imposes too many restriction checks, it becomes less user-friendly.
Thanks for the discussion and for putting it on our radar. I'm also in agreement to trim these back to the common datasets only and allow users the flexibility.
What is your question?
When testing `default_cdisc_join_keys` along with the `scda` datasets, I was unable to find the join keys for `c("ADAB", "ADPC", "ADPP", "ADTR")` in `default_cdisc_join_keys`, although these datasets are present in `scda`. There were also additional join keys in `default_cdisc_join_keys`, `c("ADSAFTTE", "ADCSSRS", "ADEQ5D5L")`, which were missing from the `scda` datasets.
I thought that the CDISC dataset formats contained an exhaustive list of datasets (at least for a given version of the SDTM). My question is: do we need to extend `default_cdisc_join_keys` to include the datasets from `scda` that are missing? Perhaps we should also add all the datasets available in `default_cdisc_join_keys` to `scda`.