Open JustKuzya opened 7 months ago
Wendel, @wendellpiez would you be able to take a look at the issue with me, or the guidance in the description is enough?
This issue has been reported to the Metaschema XSLT processor with bug https://github.com/usnistgov/metaschema-xslt/issues/105. That will need to be fixed there to resolve this issue in Metaschema.
@JustKuzya, this looks good so far. The next step would be to flesh out the example in the form of unit tests to replicate the current and correct behaviors.
Next step in prep would be to contrive a test Metaschema containing (isolating) a choice group. Maybe AJ's example in https://github.com/usnistgov/metaschema-xslt/issues/105 could be worked up.
From there we can step forward to writing XSpec tests. Once we know we can validate a change to be "the correction" (while not breaking other things etc.), the actual fix will likely be easy.
Meanwhile, in addition to a logical mapping into JSON syntax such as you provide here, small examples that are "currently-wrong" are also needed (in the sense that the validator is misreporting them). These could be with the example (toy), with an OSCAL example, or both.
The long term: this work dovetails with the schema validation-validation work I am doing in the oscal-xslt repository. Up for discussion is how OSCAL can support external tools developers with robust and solid examples for unit testing and regression testing. (I have ideas.)
Subject to discussion: who is interested in learning to do this, or at least in seeing how it is done.
Here is a starting point, an XSpec written by @aj-stein-nist last year to validate an adjustment in how fields are defined, that have values but not flags (i.e. an edge case), in the JSON Schema: https://github.com/usnistgov/metaschema-xslt/blob/develop/src/testing/issue_235_regression.xspec
Note that targets are expressed in XML according to this rendering: https://www.w3.org/TR/xpath-functions-31/#func-json-to-xml (scroll down to see examples)
This example shows mappings at the level of structures internal to documents, but the tests can be made to work from the top level as well, i.e. examining an entire mapping not just a particular bit of it.
For field testing the JSON Schema, we need examples of the following, in JSON:
And we need a validation runtime that uses the JSON Schema to validate and examines its results to confirm they are correct to this lineup.
My understanding is the current JSON Schema incorrectly reports 'group with both' as valid.
For field testing the JSON Schema, we need examples of the following, in JSON:
- catalog with controls - valid
- catalog with groups - valid
- catalog with both - valid
- catalog with neither - valid
- group with controls - valid
- group with groups - valid
- group with both - invalid
- group with neither - valid
And we need a validation runtime that uses the JSON Schema to validate and examines its results to confirm they are correct to this lineup.
My understanding is the current JSON Schema incorrectly reports 'group with both' as valid.
Indeed the cases listed above are accurate per current definitions and XML Schema which properly represents choice
. As we move forward after we fix the pipeline, is going to be driven by OSCAL JSON users:
1) If no JSON users generated OSCAL content that will become invalid if we break the JSON schema backwards compatibility, then WE MIGHT decide to fix it in the JSON schema, OR
2) we remove choice from XML schema because this will not cause a backward compatibility issue.
This is not a light decision to make and the community input is important. I am preparing the necessary information for the community (coming soon). So, TODAY, per current schema, all cases above are valid per JSON Schema. and are as listed per XML Schema. They need to be sinked but how, it will be based on the community's feedback.
@JustKuzya aggregated all cases where choice
is used:
# | XSD | xs:choice occurences |
---|---|---|
1 | oscal_assessment-plan_schema.xsd | 18 |
2 | oscal_assessment-results_schema.xsd | 18 |
3 | oscal_catalog_schema.xsd | 15 |
4 | oscal_complete_schema.xsd | 24 |
5 | oscal_component_schema.xsd | 14 |
6 | oscal_poam_schema.xsd | 18 |
7 | oscal_profile_schema.xsd | 19 |
8 | oscal_ssp_schema.xsd | 14 |
The choice is used in multiple places besides group/control some are clear (no attributes, which makes them min-max = 1-1), some are less clear in what the intent was, especially when there is a choice
defined but it has only one element:
address/location-uuid,
parameter-Field/parameter-Assembly,
include-all/select-by-id(-Assembly) profile
flat/as-is/custom.
group/insert-controls
value[oscal-control-common-parameter-value-FIELD]/[]select[oscal-control-common-parameter-selection-ASSEMBLY]
include-all/include-controls
on-date/with-date-range/at-frequency
include-all/include-control
include-all/include-objective
include-all/include-subject
?? choice with only one element ref->blockElementGroup (only one)
headingzblockElementGroup -> h1, h2,…h6
blockElementGroup -> heading/list/block/p/table/img
blockTextGroup -> pre/hr/blockquotetype
listGroup -> ul/ol
[0-Infinity] inlineMarkup/list/blockText/headingBlockElement(-Group)/p
[1-Infinity] tablerowType -> td/th
?? blockQuoteType -> [0-Infinity] blockElementgroup (only one)
?? inlineMarkupType -> [0-Infinity] inlineMarkupGroup (only one)
inlineMarkupGroup -> a/insert/br/phraseMarkupGroup
code/em/i/b/strong/sub/sup/q/img
anchortype -> [0-Infinity] phraseMarkupGroup
choice
s that have a single element.choice
constraints.Cool - except xsd:choice
in the XSD is not the same as choice
in the Metaschema sources. It is not as bad as this. (Many xsd:choice
are there for other reasons. Including those that have just a single element, as this is machine-generated code.)
A couple of things to add:
choice
in JSON is called for - I will do this if no one does it firstFinally: additionally to the two suggestions made so far, there is a third 'in-between' technical solution, namely reflecting the rule (group
XOR control
inside group
) not as a modeling restriction (with choice
) but with a constraint, which reports their appearing together, maybe as a WARNING.
(Indeed this is what we might have had in mind when we pondered this problem in forgotten-times.)
Thank you @wendellpiez .
Many
xsd:choice
are there for other reasons. Including those that have just a single element, as this is machine-generated code.
Where do we have those reasons documented so all current and future members of the team, and our collaborators have a clear understanding?
A couple of things to add:
- Some due diligence on
oscal-cli
support for choice in JSON is called for - I will do this if no one does it first- I will also continue prototyping a solution that correctly 'blows out' the JSON Schema (it doesn't hurt to look)
Both are needed, and I greatly appreciate your initiative, but the expertise and understanding of the choice
usage and interpretation need more transparency per A. concern above. Since we need to present the problem to our community, and get their perspective in terms of the direction to go, I would like to work with you on documenting the types of choice
use cases and where are they used (expending on what @JustKuzya started above.
Finally: additionally to the two suggestions made so far, there is a third 'in-between' technical solution, namely reflecting the rule (group XOR control inside group) not as a modeling restriction (with choice) but with a constraint, which reports their appearing together, maybe as a WARNING.
I like it, and we need to present it next to the others more 'extreme'. An alignment is needed. Thank you again.
A. requires a canonical specification of a Metaschema -> XSD mapping or at least requirements for that. The Metaschema developers have discussed this but there are also good reasons against it.
What is actually required (and does not require explaining both XSD and Metaschema to all OSCAL users) would be a set of field tests that exhaustively demonstrate expected behaviors by validators, plus ideally a harness for using it.
Indeed this is essential if OSCAL plans to own its own validations, so it can both adjudicate among OSCAL implementations (which ones are 'correct') and some day migrate to tomorrow's schema technology.
I have built one of these for the XSD and am proceeding to extend it to include JSON Schema, albeit not yet automate it - but see https://github.com/usnistgov/metaschema-xslt/issues/110
Accordingly, I think we need three things at present (in addition to work on https://github.com/usnistgov/metaschema-xslt/issues/105):
choice
in OSCAL metaschema sources - not xsd:choice
in XSD - to help determine impacts of various optionsI'm proceeding with these, albeit not in this order - feel free to pitch in or ask how to do so.
This bug is being fixed upstream (see https://github.com/usnistgov/metaschema-xslt/issues/105 and https://github.com/usnistgov/metaschema-xslt/pull/108 there a WIP PR exists towards supporting choice). Once metaschema-xslt/#105 is fixed and the JSON catalog schema gets updated, the existing OSCAL JSON catalogs become invalid IF the authores used groups
and controls
as opposed to groups
or controls
. Based on all conversations had, the suggested way forward until OSCAL 2.0 is released where
Describe the bug
Metachema definition
Metaschema defines the top-level catalog/group (recursing by refs) as follows (note; it is a
choice
without attributes):I assume that the defaults in Metaschema match XSD ones and fall back to min/max-Occurs defaulting to 1
XSD schema definition
Ends up as follows:
which considering mixOccurs and maxOccurs defaulting to 1 => we have effectively an Exclusive OR
JSON Schema on the other hand
Has a plain no-choice-restricted definition (excerpt is larger than the two above to illustrate the context):
while the XSD equivalent, probably should be:
Who is the bug affecting
People who try to validate "inventive" and "non-classical" JSON catalogs
When groups and controls mixed together in "twigs and leaves in same bag"-style, which is allowed by OSCAL JSON Schema, but isn't allowed by Metaschema, XSD, and OSCAL-CLI tool, the catalog-authors get confused.
What is affected by this bug
Modeling
How do we replicate this issue
Expected behavior (i.e. solution)
Difference should be not be noticeable
Other comments
No response
Revisions
No response