todo Multiple verification

resolved Everyone likes it.

I think one of the big hassles with NIEM is having to create two types for each code set. This is something that always has to be explained when we're training new people. We could simplify the syntax, but we'd also lose some benefits. If those benefits aren't leveraged much, it might be worth simplifying.

We'd lose the benefits of code extensibility and aggregation (both via unions) and attribute reuse.

The convenience might really be worth the trade-off, and it would make things easier for NIEM JSON too.

Current

<xs:simpleType name="MonthCodeSimpleType">
  <xs:restriction base="xs:token">
    <xs:enumeration value="JAN"/>
    <xs:enumeration value="FEB"/>
  </xs:restriction>
</xs:simpleType>

<xs:complexType name="MonthCodeType">
  <xs:simpleContent>
    <xs:extension base="MonthCodeSimpleType">
      <xs:attributeGroup ref="structures:SimpleObjectAttributeGroup"/>
    </xs:extension>
  </xs:simpleContent>
</xs:complexType>

<!-- XML instance could look something like the below -->
<MonthCode structures:metadata="cui-1">JAN</MonthCode>

Add codes directly to CSC types

Instead of having a pair of types, the simple type to define the codes and the complex type with simple content (CSC) to combine the code set with the structures attributes, do both these things in a single CSC type.

You could restrict a built-in schema type directly and add in the attribute group from structures manually, but it seems easier to just restrict one of the proxy types from the niem-xs namespace, which already does this for you.

<xs:complexType name="MonthCodeType">
  <xs:simpleContent>
    <xs:restriction base="niem-xs:token">
      <xs:enumeration value="JAN"/>
      <xs:enumeration value="FEB"/>
    </xs:restriction>
  </xs:simpleContent>
</xs:complexType>

<!-- XML instance -->
<MonthCode structures:metadata="cui-1">JAN</MonthCode>

Cons

No more code set "extension"

You can extend one of these types to add additional attributes, but you can't "extend" it to add additional codes. You can union simple types together, create a new CSC type and element, and make that element substitutable if possible into the same substitution group or an augmentation point. It's still some work, but copy-pasting all of the original codes into a new custom code set isn't any easier.
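
For reference, here is a rough sketch of how a union-based "extension" works with the paired types today. The `ext` namespace, the `UNK` code, and the `MonthExtraCodeSimpleType` and `MonthExtendedCode*` names are made up for illustration, and the structures namespace URI assumes NIEM 5.0. The union only works because the simple types exist; a code set defined only as a CSC type can't be a union member.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:structures="http://release.niem.gov/niem/structures/5.0/"
           xmlns:ext="http://example.com/ext/"
           targetNamespace="http://example.com/ext/">

  <!-- Assumed NIEM 5.0 structures namespace; adjust for your release -->
  <xs:import namespace="http://release.niem.gov/niem/structures/5.0/"/>

  <!-- Stands in for an existing, reused code set -->
  <xs:simpleType name="MonthCodeSimpleType">
    <xs:restriction base="xs:token">
      <xs:enumeration value="JAN"/>
      <xs:enumeration value="FEB"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Hypothetical local additions to the code set -->
  <xs:simpleType name="MonthExtraCodeSimpleType">
    <xs:restriction base="xs:token">
      <xs:enumeration value="UNK"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- The union accepts any code from either member simple type -->
  <xs:simpleType name="MonthExtendedCodeSimpleType">
    <xs:union memberTypes="ext:MonthCodeSimpleType ext:MonthExtraCodeSimpleType"/>
  </xs:simpleType>

  <!-- New CSC type that adds the structures attributes to the union -->
  <xs:complexType name="MonthExtendedCodeType">
    <xs:simpleContent>
      <xs:extension base="ext:MonthExtendedCodeSimpleType">
        <xs:attributeGroup ref="structures:SimpleObjectAttributeGroup"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>

</xs:schema>
```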

There are 9 types in Biometrics and 1 type in MilOps that use unions currently. MilOps adds a few codes to an existing code set; Biometrics composes new code types out of multiple reused code sets.

Maybe unions happen a lot more in IEPDs? Maybe we support the idea of conceptually extending or creating unions of code sets in the metamodel and then let the format translators take care of the messy details. For example, the XML translator could just copy all of the codes over into the new type but maybe add appinfo to capture the sources.

No more attribute reuse of existing code sets

There are two attributes in NIEM currently with a simple code set data type (CBRN and Screening). These attributes could either be refactored, or simple types could be defined in special cases like this. IEPDs would be left to manage on their own if they wanted to create attributes out of existing NIEM code sets.
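
To illustrate why attribute reuse depends on the simple type: XML Schema only allows attributes to have simple types, so a code set that exists only as a CSC type can't be referenced from an attribute declaration. A minimal sketch, where the `ext` namespace and the `monthCode` attribute name are hypothetical:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:ext="http://example.com/ext/"
           targetNamespace="http://example.com/ext/">

  <xs:simpleType name="MonthCodeSimpleType">
    <xs:restriction base="xs:token">
      <xs:enumeration value="JAN"/>
      <xs:enumeration value="FEB"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Valid: the attribute reuses the simple code set -->
  <xs:attribute name="monthCode" type="ext:MonthCodeSimpleType"/>

  <!-- Not valid if the codes live only in a complex type such as
       ext:MonthCodeType, because an attribute's type must be simple -->

</xs:schema>
```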

Alternatives

The alternative simplification would be to allow elements to have simple data types. Especially regarding metadata, it seems like losing the attributes might cause bigger problems. Having a few people duplicate code sets might be worth the trade-off of simplifying the syntax for almost every other schema developer. I'm not sure about the impact of simple element types, though. Is it a deal breaker if you can't portion mark at this granular a level, or is that done at the object level only? What about IC-ISM? What about other use cases? I think we'll need a lot of feedback on this to come up with the best option.
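
As a sketch of that alternative (not something NIEM allows today), an element would be declared directly against the simple type. The `ext` namespace is hypothetical:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:ext="http://example.com/ext/"
           targetNamespace="http://example.com/ext/">

  <xs:simpleType name="MonthCodeSimpleType">
    <xs:restriction base="xs:token">
      <xs:enumeration value="JAN"/>
      <xs:enumeration value="FEB"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Element with a simple data type: no CSC wrapper needed -->
  <xs:element name="MonthCode" type="ext:MonthCodeSimpleType"/>

  <!-- A valid instance carries the code only, for example
       <ext:MonthCode>JAN</ext:MonthCode>
       Adding structures:metadata (or any other attribute) would fail
       validation, since a simple-typed element declares no attributes. -->

</xs:schema>
```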

cdmgtri commented 1 year ago

Proposed text addition to the NDR:

11.1.3.1. Code complex types

[Definition: code complex type]

A code complex type is a complex type definition schema component for which each value carried by the type corresponds to an entry in a list of distinct conceptual entities.

These types represent lists of values, each of which has a known meaning beyond the text representation. These values may be meaningful text or may be a string of alphanumeric identifiers that represent abbreviations for literals.

Many code complex types are composed of xs:enumeration values. Code complex types may also be constructed using the NIEM Code Lists Specification, which supports code lists defined using a variety of methods, including CSV spreadsheets.

cdmgtri commented 1 year ago

Proposed rule addition to the NDR:

  <sch:pattern id="rule_11-11-2">
    <sch:title>Name of a code type ends in "CodeType"</sch:title>
    <sch:rule
      context="xs:complexType[exists(@name) and (xs:simpleContent/xs:restriction/xs:enumeration)]">
      <sch:report test="not(ends-with(@name, 'CodeType'))" role="warning">Rule 11-11-2: A complex type
        definition schema component that has an enumeration facet SHOULD have a name that ends in "CodeType".</sch:report>
    </sch:rule>
  </sch:pattern>
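
A quick sketch of what the proposed rule would and wouldn't flag. The `ext` namespace and the `MonthCategoryType` name are made up for illustration, and the niem-xs namespace URI assumes NIEM 5.0. Note that the rule's context only matches enumerations placed directly in a complex type's `xs:simpleContent/xs:restriction`, which is the merged style proposed here.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:niem-xs="http://release.niem.gov/niem/proxy/niem-xs/5.0/"
           xmlns:ext="http://example.com/ext/"
           targetNamespace="http://example.com/ext/">

  <!-- Assumed NIEM 5.0 proxy namespace; adjust for your release -->
  <xs:import namespace="http://release.niem.gov/niem/proxy/niem-xs/5.0/"/>

  <!-- Matches the rule context and the name ends in "CodeType": no warning -->
  <xs:complexType name="MonthCodeType">
    <xs:simpleContent>
      <xs:restriction base="niem-xs:token">
        <xs:enumeration value="JAN"/>
        <xs:enumeration value="FEB"/>
      </xs:restriction>
    </xs:simpleContent>
  </xs:complexType>

  <!-- Matches the rule context but the (hypothetical) name does not end
       in "CodeType": the rule would report a warning -->
  <xs:complexType name="MonthCategoryType">
    <xs:simpleContent>
      <xs:restriction base="niem-xs:token">
        <xs:enumeration value="JAN"/>
        <xs:enumeration value="FEB"/>
      </xs:restriction>
    </xs:simpleContent>
  </xs:complexType>

</xs:schema>
```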

niemopen / niem-naming-design-rules

Merge Code CSC and Simple Types? #9
