Lots of checks have been specified for things like:
If there are multiple codes for xxx, they must be unique.
The same code must not be used in xxx and yyy
The xxx attribute must be coded according to the yyy [DDF|SDTM] codelist
In order to convert these check specifications into executable rules, we need to define and agree the following criteria:
Which attributes of the Code class should be used to determine uniqueness? Should this just be codeSystem and code (if two instances of the Code class have the same values for both codeSystem and code they are considered to be the "same code")?
What determines correct vs incorrect coding according to a particular codelist?
Are there specific values that must be used in the codeSystem attribute? If not, there probably should be (e.g., an extensible codelist of known code systems), so that everyone uses the same value to identify specific code systems.
Should we differentiate between DDF and SDTM codelists somehow (e.g., in the codeSystem)? The current example data seems to use just "http://www.cdisc.org" for all CDISC terminology. Some codelists appear in both SDTM and DDF terminology. As DDF terminology has not existed for as long as SDTM terminology, a given SDTM codelist may have a different set of available versions for DDF vs SDTM terminology. If we don't differentiate between DDF and SDTM terminology when representing instances of the Code class, then this distinction should not be included in check specifications.
For CDISC terminology, should codeSystemVersion always be:
An ISO 8601 date value?
A date that matches an available release date for the set of terminology (i.e., SDTM or DDF) being used?
For other code systems (e.g., SNOMED, MedDRA, ISO 3166, etc.) should the version always be provided in a specific format (depending on the code system)?
Correct vs incorrect coding according to a codelist depends on whether the codelist is extensible or not.
Are any/all of the CDISC DDF codelists extensible? The extensible flag is not populated in the DDF terminology spreadsheet.
If a codelist is not extensible, should we simply report an error if the code is not in the codelist?
Regardless of codelist extensibility, should the following be applied:
If the code is in the codelist, report an error if the decode does not match the decode from the codelist?
Is it always the "CDISC Submission Value" that is used for decode checking for CDISC codelists?
Should the same text case (upper/lower/mixed) be considered when checking for matching decode values?
If the decode is in the codelist, report an error if the code does not match the code from the codelist? (Same questions about decode matching).
code, codeSystem and codeSystemVersion define uniqueness (e.g. the code can stay the same while the decode is tweaked across versions, so I think the version is needed)
I don't think we need to differentiate between DDF and SDTM; they should not differ for a given code list. The CT does define where a code list comes from, so we can hold this within CORE for a given class attribute
For CDISC, codeSystemVersion should be in the set of release dates and only that set (the 3 / 6 month releases and their dates)
Don't check other code systems for the moment, since we don't know the set of CT someone might use. The exception is ISO 3166, see below
For extensible code lists, the value will either be a CDISC code from the specified code list or come from some other code system (probably the sponsor's), which we cannot check
For CDISC, we should check the decode against the version specified; for other systems we don't check anything
For the CDISC codeSystem we should use 'http://www.cdisc.org'. For other code systems, see the HL7 Code Systems list for suggested values; we won't check them, but we can suggest this set in the IG. Note the ISO 3166 alpha-2 and alpha-3 URLs
While I suggest we only check CDISC, we could also check ISO 3166 values. This page is useful: ISO 3166
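To make the proposal above concrete, here is a minimal Python sketch of how the uniqueness and CDISC checks might look. It is not CORE's implementation. The `Code` dataclass, the function names, and the `RELEASES` lookup are all hypothetical, and the release dates, codes and decodes in `RELEASES` are placeholders rather than real CDISC terminology. The exact, case-sensitive decode comparison is also an assumption, since the text-case question is still open.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date

CDISC_CODE_SYSTEM = "http://www.cdisc.org"

# Placeholder data (assumption): release date -> {code: decode}.
# These dates, codes and decodes are invented for illustration only.
RELEASES = {
    "2023-09-29": {"C0001": "EXAMPLE DECODE A", "C0002": "EXAMPLE DECODE B"},
    "2023-12-15": {"C0001": "EXAMPLE DECODE A (REVISED)", "C0002": "EXAMPLE DECODE B"},
}


@dataclass(frozen=True)
class Code:
    """Hypothetical stand-in for an instance of the Code class."""
    code: str
    decode: str
    code_system: str
    code_system_version: str


def identity(c):
    """Uniqueness key as proposed: codeSystem, codeSystemVersion and code."""
    return (c.code_system, c.code_system_version, c.code)


def duplicate_codes(codes):
    """Return the identity keys that occur more than once."""
    counts = Counter(identity(c) for c in codes)
    return [key for key, n in counts.items() if n > 1]


def check_cdisc_code(c, releases=RELEASES):
    """Return a list of error messages; an empty list means nothing to report."""
    if c.code_system != CDISC_CODE_SYSTEM:
        # Other code systems (e.g. sponsor-defined) are not checked.
        return []
    try:
        date.fromisoformat(c.code_system_version)
    except ValueError:
        return [f"codeSystemVersion {c.code_system_version!r} is not an ISO 8601 date"]
    codelist = releases.get(c.code_system_version)
    if codelist is None:
        return [f"codeSystemVersion {c.code_system_version!r} is not a known release date"]
    expected = codelist.get(c.code)
    if expected is None:
        return [f"code {c.code!r} is not in the codelist for {c.code_system_version}"]
    if c.decode != expected:  # assumption: exact, case-sensitive match
        return [f"decode {c.decode!r} does not match {expected!r}"]
    return []
```

Because the version is part of the key, the same code taken from two different releases is not reported as a duplicate, and the decode is always compared against the release named in codeSystemVersion.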