cdisc-org / ddf-core-poc

This repository will contain the results from the Proof of Concept project.
MIT License

Create rule DDF00040: Each StudyElement must be linked to at least one StudyCell #147

Closed: BSnoeijerCD closed this issue 5 months ago

BSnoeijerCD commented 7 months ago

Create rule defined for #144

BSnoeijerCD commented 7 months ago

The elementId attribute has not been added to the StudyCell sheet of the test data Excel template, although it is in the model. The rule involves cross-class checking (the elements are nested in studyDesign and only cross-referenced from studyCell), so it cannot be created with the current capabilities.

ASL-rmarshall commented 7 months ago

An elementId column is not included in the StudyCell sheet in the test data template because there can be more than one elementId referenced from a StudyCell. According to the current design of the test data template, the elementId values would be listed in the Multiple_Values sheet with parent_entity = "StudyCell" and parent_rel = "elementId". However, this will not be the same as the converted USDM data, where the elementId values will be listed in the string dataset (with parent_entity and parent_rel populated as above), and there will also be a record in the StudyElement table (with rel_type = "reference" and with parent_entity and parent_rel populated as above) for each referenced elementId.
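As a rough illustration of the converted layout described above, the following Python sketch flattens a small, hypothetical USDM fragment into StudyElement records: one record per definition and one per reference from a StudyCell. The fragment, the ids, the nested attribute names (`elements`, `studyCells`, `elementIds`), the `"definition"` value of `rel_type`, and the `parent_entity`/`parent_rel` values on the definition records are assumptions for illustration. Only `rel_type = "reference"`, `parent_entity = "StudyCell"`, and `parent_rel = "elementId"` come from the description above. This is not the engine's actual conversion code.

```python
# Hypothetical, simplified USDM fragment; ids and attribute names are illustrative.
study_design = {
    "id": "StudyDesign_1",
    "elements": [
        {"id": "StudyElement_1", "name": "Screening"},
        {"id": "StudyElement_2", "name": "High - Start"},
    ],
    "studyCells": [
        {"id": "StudyCell_1", "elementIds": ["StudyElement_1"]},
    ],
}


def flatten_study_elements(design):
    """Return StudyElement records: one per definition, one per reference."""
    names = {element["id"]: element["name"] for element in design["elements"]}
    records = []
    # Definition records: the element as nested in the StudyDesign.
    for element in design["elements"]:
        records.append({
            "id": element["id"],
            "name": element["name"],
            "parent_entity": "StudyDesign",  # assumed value
            "parent_id": design["id"],
            "parent_rel": "elements",        # assumed value
            "rel_type": "definition",        # assumed value
        })
    # Reference records: one for each elementId listed in a StudyCell.
    for cell in design["studyCells"]:
        for element_id in cell["elementIds"]:
            records.append({
                "id": element_id,
                "name": names.get(element_id),
                "parent_entity": "StudyCell",
                "parent_id": cell["id"],
                "parent_rel": "elementId",
                "rel_type": "reference",
            })
    return records


for record in flatten_study_elements(study_design):
    print(record["id"], record["parent_entity"], record["rel_type"])
```

In this sketch, StudyElement_1 ends up with both a definition record and a reference record, while StudyElement_2 has only a definition record.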

This rule may not be possible even with the new cross-class joining because it might also need new "precondition" engine functionality - either for Match Datasets or for the distinct operation.

ASL-rmarshall commented 7 months ago

Given that, according to the model, the only way a StudyElement can be referenced is from a StudyCell, I think this check could be implemented by getting the distinct rel_type values for each id in StudyElement and then reporting each id whose list does not contain "reference" (i.e., a StudyElement is defined but not referenced).

This only checks that each StudyElement is referenced. There should also be a check to make sure that a StudyCell only references StudyElements from the same StudyDesign.
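The proposed logic can be sketched in plain Python as follows. This is a minimal illustration, not the engine's implementation: the records, the ids, and the `"definition"` value of `rel_type` are assumptions, and the function name `unreferenced_element_ids` is hypothetical.

```python
from collections import defaultdict


def unreferenced_element_ids(records):
    """Return ids whose distinct rel_type values do not include "reference"."""
    rel_types_by_id = defaultdict(set)
    for record in records:
        # Equivalent of a "distinct rel_type grouped by id" operation.
        rel_types_by_id[record["id"]].add(record["rel_type"])
    return sorted(
        element_id
        for element_id, rel_types in rel_types_by_id.items()
        if "reference" not in rel_types
    )


# Hypothetical StudyElement records; "definition" is an assumed rel_type value.
records = [
    {"id": "StudyElement_1", "rel_type": "definition"},
    {"id": "StudyElement_1", "rel_type": "reference"},
    {"id": "StudyElement_2", "rel_type": "definition"},
]

print(unreferenced_element_ids(records))  # ['StudyElement_2']
```

Grouping by id before testing for "reference" matters because a referenced element has several records, and only the group as a whole shows whether at least one reference exists.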

ASL-rmarshall commented 7 months ago

This is ready for review. The negative and positive test data are the same, except that the negative test data includes a definition of a "High - Start" element that is not referenced. Only one negative result is expected.

@BSnoeijerCD @DianeWold Should there be an additional check/rule to ensure that each StudyCell must only reference StudyElements from the same StudyDesign (and similar checks for referenced StudyEpochs and StudyArms)?

DianeWold commented 6 months ago

Checks that a StudyCell only references StudyElements, StudyEpochs, and StudyArms from the same StudyDesign would make sense. I can define a rule for this; it could be split up if necessary.
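One way such a check could work is sketched below in Python, operating on nested study designs rather than on the engine's datasets. The attribute names (`arms`, `epochs`, `elements`, `studyCells`, `armId`, `epochId`, `elementIds`), the ids, and the function name `cross_design_references` are assumptions for illustration, not a confirmed rule definition.

```python
def cross_design_references(study_designs):
    """Return (cell id, kind, referenced id) for references outside the cell's own design."""
    findings = []
    for design in study_designs:
        # Ids defined within this StudyDesign, by kind.
        local_ids = {
            "arm": {arm["id"] for arm in design.get("arms", [])},
            "epoch": {epoch["id"] for epoch in design.get("epochs", [])},
            "element": {element["id"] for element in design.get("elements", [])},
        }
        for cell in design.get("studyCells", []):
            references = [("arm", cell.get("armId")), ("epoch", cell.get("epochId"))]
            references += [("element", eid) for eid in cell.get("elementIds", [])]
            for kind, referenced_id in references:
                if referenced_id is not None and referenced_id not in local_ids[kind]:
                    findings.append((cell["id"], kind, referenced_id))
    return findings


# Hypothetical data: StudyCell_1 references an element defined in another design.
study_designs = [
    {
        "id": "StudyDesign_1",
        "arms": [{"id": "StudyArm_1"}],
        "epochs": [{"id": "StudyEpoch_1"}],
        "elements": [{"id": "StudyElement_1"}],
        "studyCells": [
            {
                "id": "StudyCell_1",
                "armId": "StudyArm_1",
                "epochId": "StudyEpoch_1",
                "elementIds": ["StudyElement_1", "StudyElement_9"],
            }
        ],
    },
    {
        "id": "StudyDesign_2",
        "elements": [{"id": "StudyElement_9"}],
        "studyCells": [],
    },
]

print(cross_design_references(study_designs))
```

Because the three kinds are checked independently, the same logic could be split into separate rules for StudyElements, StudyEpochs, and StudyArms if that is preferred.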

DianeWold commented 6 months ago

Negative test data included StudyElement_4, which was not in any StudyCell, resulting in one negative result. Positive test data included five StudyElements, each of which was included in at least one StudyCell, resulting in one positive result.

BSnoeijerCD commented 6 months ago

> Checks that a StudyCell only references StudyElements, StudyEpochs, and StudyArms from the same StudyDesign would make sense. I can define a rule for this; it could be split up if necessary.

I agree. Based on this logic, it makes sense to split it up.

ASL-rmarshall commented 3 months ago

Saving POC definition before overwriting for DDF4:

Authorities:
  - Organization: 'CDISC'
    Standards:
      - Name: USDM
        References:
          - Citations:
              - Cited Guidance: 'USDM'
                Document: 'USDM v2.6'
              - Cited Guidance: 'SDTMIG'
                Document: 'SDTMIG v3.4'
                Section: '7.2'
            Origin: USDM Conformance Rules
            Release Notes: ''
            Rule Identifier:
              Id: 'DDF00040'
              Version: '1'
            Validator Rule Message: ''
            Version: '1.0'
        Version: '3.0'
Check:
  all:
    - name: $element_rel_types
      operator: does_not_contain
      value: "reference"
Core:
  Id: "CORE-000449"
  Status: Draft
  Version: '1'
Description: 'Each StudyElement must be linked to at least one StudyCell'
Executability: Fully Executable
Operations:
  - group:
      - id
    id: $element_rel_types
    name: rel_type
    operator: distinct
Outcome:
  Message: 'A StudyElement is not related to any StudyCell'
  Output Variables:
    - parent_entity
    - parent_id
    - id
    - name
    - label
    - description
Rule Type: Record Data
Scope:
  Entities:
    Include:
      - StudyElement
Sensitivity: Record