usnistgov / OSCAL

Open Security Controls Assessment Language (OSCAL)
https://pages.nist.gov/OSCAL/

Add type attribute to Property #884

Open degenaro opened 3 years ago

degenaro commented 3 years ago

User Story:

As an OSCAL tool developer, I would like Property to have an optional type attribute in order to describe the associated value attribute.

Goals:

The value attribute's type cannot always be easily inferred. It may be an int, array, string, IP-address, etc.
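A hypothetical sketch of what the requested attribute could look like on a prop (the type names here are illustrative only and are not defined anywhere in OSCAL):

```xml
<!-- Hypothetical: props carrying an optional "type" attribute that
     describes the format of the "value" attribute -->
<prop name="retry-count" value="3" type="integer"/>
<prop name="mgmt-address" value="10.0.0.1" type="ip-v4-address"/>
<prop name="aliases" value="web-01,web-02" type="string-array"/>
```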

Dependencies:

N/A

Acceptance Criteria

bradh commented 3 years ago

Why does the type matter? Why not always treat it as a string?

Same question in another way: what is a consumer meant to do with the type information?

GaryGapinski commented 3 years ago

A type attribute seems the same as imposing constraints on attribute tuples of a <prop>, particularly that of value. It would be informative, but making it normative would be quite a stretch.

Strong typing (i.e., requiring "correct" combinations of <prop> attributes) is not possible in XML Schema. It might be possible in Schematron, but only if inferred from the name, ns, and class attribute combinations (which seems a tad dicey given the arbitrary nature of ns and class).

ghost commented 3 years ago

This seems like a good call. In some places I'm using a prop to store an array, and I simply join all values with a delimiter. If other tools had a way of knowing that from the type attribute, it would be easier for custom props to interoperate across OSCAL-fluent applications. A suggested delimiter for arrays might be good to keep things consistent.
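The delimiter workaround described above might look like this (the ns URI, prop name, and type name are all made up for illustration):

```xml
<!-- Illustrative only: an array packed into a single prop value using a
     comma delimiter; a "type" attribute would tell a consuming tool
     that the value should be split back into individual items -->
<prop ns="https://example.com/ns/oscal" name="allowed-ports"
      value="22,443,8443" type="integer-array"/>
```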

david-waltermire commented 2 years ago

I believe there are two ways to handle this in OSCAL:

  1. Declarative typing at the data layer: This is the approach you are suggesting. It would use the type to determine the format of the value.
  2. Declarative typing at the data validation layer: This involves binding the validation logic for the value to the ns and name of the property.

There are tradeoffs between these two approaches.

data layer

Handling typing at the data layer requires that we explicitly enumerate a set of types for property values. Every variation in possible types will require a new data type.

To illustrate this, we can consider representing version information as a property value. For version information we may have values that represent different version schemes.

For example:

  semver: MAJOR.MINOR.PATCH
  MS Windows: MAJOR.MINOR.PATCH.BUILD
  Cisco Versions: MAJOR.MINOR(THROTTLE)TRAIN{REBUILD}

To constrain each of these version types we will need to identify unique types for each.
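Under the data layer approach, each scheme would need its own enumerated type, roughly like this (the type names and version values are illustrative, not part of OSCAL):

```xml
<!-- Sketch of data-layer typing: every version scheme requires its own
     named type that tools must recognize and enforce -->
<prop name="version" value="2.14.3" type="semver"/>
<prop name="version" value="10.0.19041.1348" type="ms-windows-version"/>
<prop name="version" value="15.1(2)SY11" type="cisco-ios-version"/>
```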

data validation layer

Handling typing at the data validation layer can make use of the Metaschema constraint system that is already in place within the OSCAL modeling framework.

The Metaschema constraint system is defined within the OSCAL models in Metaschema; implementers use this mechanism to define specific validation rules that apply to a model entity based on values in its content model. This approach requires applications to read the OSCAL model definitions in Metaschema to understand the different model entities and their constraints, covering the OSCAL data type, cardinality, cross-references, required values for specific entities, et cetera.

For example:

In catalog/metadata/location there are constraints defined that govern the allowed names, values, and classes of props.

The following is an excerpt of these constraints:

<constraint>
  <allowed-values target="prop/@name" allow-other="yes">
    <enum value="type">Characterizes the kind of location.</enum>
  </allowed-values>
  <allowed-values target="prop[@name='type']/@value" allow-other="yes">
    <enum value="data-center">A location that contains computing assets. A <code>class</code> can be used to indicate the sub-type of data-center as <em>primary</em> or <em>alternate</em>.</enum>
  </allowed-values>
  <allowed-values target="prop[@name='type' and @value='data-center']/@class" allow-other="yes">
    <enum value="primary">The location is a data-center used for normal operations.</enum>
    <enum value="alternate">The location is a data-center used for fail-over or backup operations.</enum>
  </allowed-values>
</constraint>

These constraint rules declare that:

  1. Properties at this location (prop/@name) may use type as a name.
  2. Properties with name="type" (prop[@name='type']) allow the value "data-center".
  3. Properties with name="type" and value="data-center" (prop[@name='type' and @value='data-center']) allow the class to be "primary" or "alternate".

This allows cascading value constraints on an entity to be declared and validated.
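For instance, these props would satisfy all three cascading allowed-values constraints in the excerpt above (instance fragments only; surrounding required location content is elided):

```xml
<!-- Both props pass validation: name "type" is allowed, value
     "data-center" is allowed for that name, and each class value is
     allowed for that name/value pair -->
<prop name="type" value="data-center" class="primary"/>
<prop name="type" value="data-center" class="alternate"/>
```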

Another example exists within a component in a component definition.

<matches target="prop[@name='release-date']/@value" datatype="date"/>

This metaschema datatype-based constraint requires that the prop with the name "release-date" must have a value that conforms to the built-in data type date.

The matches construct also allows a regular expression pattern to be specified allowing for arbitrary value constraints. This allows complex logic to be applied without having to enumerate specific data types. The version example above could be implemented using a pattern.
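A sketch of how the semver case above could be handled with a pattern-based constraint (the regex is simplified and this constraint is not part of the official OSCAL models):

```xml
<!-- Hypothetical regex-based constraint requiring a semver-shaped value
     on props named "version" -->
<matches target="prop[@name='version']/@value"
         regex="\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?"/>
```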

analysis

I am concerned that data layer typing will not scale well, since we would need to enumerate all of the different type possibilities and tools would need to enforce them. OSCAL will always lag behind the need to define specific types, and tools will lag further behind OSCAL in enforcing them. This will lead to poor interoperability when using a data layer type system, which is a poor outcome IMHO.

The value of data validation layer constraints is that they scale well across different parties: constraints can be defined in core OSCAL and by third parties, and they are applied at validation time.

The data validation layer approach will require tooling to support validation. Support for this already exists in liboscal-java and metaschema-java, which are used by the OSCAL library. This is a start, but more tooling is needed, and we are working on creating it.

For these reasons, I believe we should use data validation layer constraints as the more scalable solution and should not support data layer constraints.

david-waltermire commented 2 years ago

For now I am going to move this to the backlog for future consideration once we have had time to better explore data validation layer constraints.

aj-stein-nist commented 1 year ago

Given the questions around core requirements for this issue and existing comments and labels, I will align the status with "DEFINE Research Needed."