Closed ASL-rmarshall closed 3 months ago
Hi @BSnoeijerCD - I've asked for your review for this pull request. I'm not expecting you to look at the code changes (unless you want to), but please could you download and review the USDM_2.11_Test_Data_Template.xlsx file. Please then start your review of this change request and either approve it (if everything looks OK) or post comment(s) with your findings. Thanks!
@ASL-rmarshall : some comments
and questions:
I will check the content of sheets as well today.
@BSnoeijerCD : some replies
@ASL-rmarshall : some comments
- sheets added for boolean, float, null, string
Yes, this is deliberate. It is to mimic the behaviour of the USDM data service's conversion of JSON files, which creates separate datasets each showing the values for a primitive data type (boolean, float, string) or attributes with null
as a value. See the Excel files of the USDM 2.11 example files that I posted in SharePoint.
- no label for BiomedicalConceptSurrogate class
This is because there is no "CT Item Preferred Name" specified for the BiomedicalConceptSurrogate
entity in the "DDF Entities&Attributes" sheet of the USDM 2.11 USDM_CT.xlsx file (cell G194).
- study class will not have a parent. So corresponding items in test data sheet are not applicable.
For API data, the Study
class has the (API-only) Wrapper
class as its parent. Again, see the Excel representations of the USDM 2.11 example files that I posted in SharePoint.
and questions:
- How should the boolean be filled in the example datasets? Y/N, True/False, 0/1 ?
Either as TRUE/FALSE (the Excel boolean values) or true/false as text values: Note that, by default, the first test data row in every sheet (i.e., the first row under the yellow header rows) is formatted to accept text values in every cell (the cells are formatted as "Text" instead of "General"). If you have multiple rows of test data to enter, you can use the Format Painter functionality to copy this text formatting onto additional rows. This will also prevent ISO8601 date values being converted automatically to Excel date values.
- Full name of class StudyProtocolDocumentVersion too long. Missing the last n. Should we refer in Core to StudyProtocolDocumentVersio (without the n)?
This is a known issue. Our eventual intention is to be able to remove the ".xpt" suffix from the sheet names (see cdisc-org/cdisc-rules-engine#684), at which point there will be enough space for the full class name. Until then, if we need to reference the class name as a domain
within the check specification - e.g., in a distinct
operation - we would have to specify it without the "n" (and in upper case). It currently doesn't matter what you put in Scope
because it's ignored anyway, so I would recommend using the full class name in Scope.
I will check the content of sheets as well today.
@ASL-rmarshall based on the content, I have the following comments:
@BSnoeijerCD Thanks. I have made some updates. Please download and check the updated Excel file.
@ASL-rmarshall based on the content, I have the following comments:
- BiomedicalConceptProperty, ScheduledDecisionInstance and AgentAdministration: something is going wrong in column J-M. Headers missing / content split.
I'm not sure what this means - is it just that there appears to be a gap between the header rows for some of the columns when the sheet contains another column with a description (or other characteristic) that wraps onto multiple lines within its cell?
- Condition: Context and AppliesTo multiplied for each potential target (2x Context, 5x AppliesTo) but not distinguishable for Core I think. Also naming should be appliesToIds and contextIds
Corrected - multiple occurrences of the same relationship name are now collapsed into a single column with the cardinality (row 4) shown a " / " delimited list of all the references. Column names corrected to be appliesToIds
and contextIds
.
Column names corrected to be <singular noun>Ids
for all the following:
- StudyCell.elements should be studyCell.elementIds (not sure if I missed others that cross-reference and are plural)
- StudyArm.populations should be studyArm.populationIds
- studyElement.studyInterventions should be studyElement.studyInterventionIds
- activity.biomedicalConcepts should be activity.biomedicalConceptIds
- activity.bcCategories should be activity.bcCategoryIds
- activity.bcSurrogates should be activity.bcSurrogateIds
- BiomedicalConceptCategory.children and BiomedicalConceptCategory.members should be childIds and MemberIds
- ScheduledActivityInstance.activities should be activityIds
These two have also been corrected to show previousId
and nextId
to match the API spec, but according to the UML, they should be previousIds
and nextIds
- I've sent you an email about the inconsistency.
- NarrativeContent.previous should be previousId
- NarrativeContent.next should be nextId
@ASL-rmarshall : checked. All ok now.
Changes applied to
create_template.py
:string
boolean
float
null
parent_entity
,parent_id
,parent_rel
for the Study sheet.rel_type
for all sheetsinstanceType
Id
/Ids
attributes in the API spec to handlecontext
andappliesTo
in USDM v2.11.Removed generically named template file and replaced with version-specific versions for USDM v2.6 (previous design) and v2.11 (including all updates listed above)..