Externalize constraints from released models

aj-stein-gsa commented 3 days ago

User Story

As a developer of Metaschema-based OSCAL tooling, in order to more effectively manage custom constraints and NIST-maintained constraints in an easy-to-combine way, I would like the constraints for Catalog, Profile, SSP, Component Definitions, AP, AR, and POAM models defined and maintained separately from the model file, in metaschema-meta-constraints, not directly within their models.

Goals

[ ] Improve maintenance of the constraints and underlying models without changing current schema outputs or degrading documentation; and
[ ] Simplify constraint management separately from changes within models that require model updates, whether those changes are non-breaking or breaking; and
[ ] Allow developers to combine constraints for the respective models without core NIST constraints and custom constraints contradicting each other, because the former cannot be ignored or suppressed without custom processing rules in multiple tools on a case-by-case basis

Dependencies

N/A

Acceptance Criteria

[ ] All OSCAL website and readme documentation affected by the changes in this issue have been updated. Changes to the OSCAL website can be made in the docs/content directory of your branch.
[ ] A Pull Request (PR) is submitted that fully addresses the goals of this User Story. This issue is referenced in the PR.
[ ] The CI-CD build process runs without any reported errors on the PR. This can be confirmed by reviewing that all checks have passed in the PR.

(For reviewers: The wiki has guidance on code review and overall issue review for completeness.)

Revisions

No response

iMichaela commented 3 days ago

@aj-stein-gsa - Thank you for opening this issue. I am concerned that GRC tools or other implementors that developed tools based on the existing metaschema definitions of the models will be drastically affected, and this proposal MUST be first presented and discussed with the community members. I personally like the idea of having all constraints separate from the schemas, but those changes are major, and the impact on NIST's existing pipeline, documentation, and schema generation and the impact on OSCAL implementers need to be assessed. (cc @wendellpiez)

aj-stein-gsa commented 2 days ago

@aj-stein-gsa - Thank you for opening this issue. I am concerned that GRC tools or other implementors that developed tools based on the existing metaschema definitions of the models will be drastically affected.

It should not affect the models at all, only where the data is located: in multiple files rather than one as before.

and this proposal MUST be first presented and discussed with the community members.

Let us know how that needs to be presented and discussed. I will try to see how I and the FR Automation Team can accommodate.

iMichaela commented 2 days ago

@aj-stein-gsa - Thank you for opening this issue. I am concerned that GRC tools or other implementers that developed tools based on the existing metaschema definitions of the models will be drastically affected.

It should not affect the models at all, only where the data is located: in multiple files rather than one as before.

and this proposal MUST be first presented and discussed with the community members.

Let us know how that needs to be presented and discussed. I will try to see how I and the FR Automation Team can accommodate.

I already sent a note in Gitter to all members, and I will try to send an email soon. The implementers using the metaschema definition files most likely will have to update their parsing of the OSCAL definition files. Informing them and getting their perspective is important. I'll be traveling in early October, but when I return, we can also have a meeting with the community members to explain the proposed changes and to collect feedback. Personally, I think this change will help move all FedRAMP constraints that currently exist in the core OSCAL to FedRAMP, RMF constraints to an RMF constraint file, and core OSCAL constraints (if any are left) to a separate file.

RS-Credentive commented 2 days ago

I agree that there could be a significant impact for externalizing existing constraints. AJ is right that it "shouldn't" matter, but it depends on how the tools process the underlying metaschema.

It would also help to understand whether we are proposing that there should be an optional way to express external constraints or whether all constraints must always be externalized. Is there any reason to externalize existing constraints if they are currently working fine?

I think it's reasonable to include a set of "base oscal" constraints that are mandatory, and there's no reason not to include them in the models themselves, and allow other entities to define additional constraints under custom namespaces. If there is interest in relaxing some of the base oscal constraints that motivates this request, that might be a relatively easy (non-breaking) ask.

aj-stein-gsa commented 2 days ago

I agree that there could be a significant impact for externalizing existing constraints. AJ is right that it "shouldn't" matter, but it depends on how the tools process the underlying metaschema.

By and large, I have found few developers implementing software that processes Metaschema at all; the majority admittedly hand-code data or use code generation from XML Schema or JSON Schema. So when I say shouldn't, I just want to be clear I am addressing the majority and gravity of toolmakers. Publicly or privately, no one has demonstrated otherwise.

It would also help to understand whether we are proposing that there should be an optional way to express external constraints or whether all constraints must always be externalized. Is there any reason to externalize existing constraints if they are currently working fine?

If NIST wants to provide data, tools, and docs in parallel where any two constraints may conflict at the same time, it makes building infrastructure to support them incredibly difficult. (I will not say impossible, but it is non-trivial, and I had not proposed it sooner because there were other logistical hurdles.)

I think it's reasonable to include a set of "base oscal" constraints that are mandatory, and there's no reason not to include them in the models themselves, and allow other entities to define additional constraints under custom namespaces. If there is interest in relaxing some of the base oscal constraints that motivates this request, that might be a relatively easy (non-breaking) ask.

tl;dr: So I agree with you, and a key theme of this request is: "I don't know what the base ones are, so how do we organize non-base ones and get to bare essentials?"

Much more detail below on that topic:

You actually inspired my awareness of this theme last, in a 2023 workshop I believe you raised your hand mid-lesson and asked why certain parts of a control in a catalog or profile must be there, maybe the SSP, and I admitted that it is a NIST RMF requirement for SP 800-37 and SP 800-53 "bleeding into" core model requirements. You were not working on a catalog/SSP for RMF, so that is a concern.

Adding much to the core in the same file, even if multiple are combined, essentially bottlenecks NIST staff to curate it all. Michaela and others can comment if that is something they want, but that is a resource-intensive commitment.

I do not think there is a solid agreement on what is mandatory "for general OSCAL use" versus public NIST guidance on "use for generalized NIST guidance on the application of NIST SP 800-37 Risk Management Framework and SP 800-53/53A controls and assessment objectives" versus "Agency X's application of RMF that subsets NIST" versus "use of CSF 2.0" or "use of CSA CCM 4.x controls" and a host of others. If you keep them in the same file, you have to edit the core model to suppress them and essentially make slightly divergent copies of models that are more difficult to maintain in the long run. Ask me how I know!

wendellpiez commented 2 days ago

👍 I am strongly in favor of both layering the constraint sets to make them more usable and useful, and of reducing the number of constraints in core modules (which can amount to blocking conditions), i.e. considered applicable to all OSCAL including simple (unqualified) OSCAL data.

wendellpiez commented 2 days ago

@aj-stein-gsa also makes a good point about the maintenance model for external rules sets.

Layered constraints in a different domain:

At https://jats.nlm.nih.gov/files.html you can download schemas for NISO JATS (XML)
https://www.ncbi.nlm.nih.gov/pmc/pub/validation/ describes core validation support in addition to other tools: previewers and a 'Style Checker' tool for NISO JATS data submitted to Pubmed Central (PMC)
https://jats4r-validator.niso.org/ tool performing JATS validation plus adding JATS4R recommendations
https://jats.taylorandfrancis.com/jats-guide/tools/schematron/ - one publisher's rule set, over and above core JATS

Note that rules coming from at least four constituencies (and maintained by four different groups) are represented:

All JATS users
Submitters to Pubmed Central (JATS + PMC rules)
Organizations conformant to JATS4R guidelines enabling interchange - JATS4R is maintained outside JATS
Users of JATS at particular publishers / within particular JATS systems

I.e. the different constituencies reflect different stakeholders in the data with different systems, requirements and levels of investment.

My feeling is that this is healthy and OSCAL would do well to try and support a similar 'distributed governance' model enabling users and organizations to mix and match rule sets to requirements.
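
To make the "mix and match" idea concrete, here is a minimal sketch in Python of rule sets that are maintained by different groups and selected per validation run. It is not how any Metaschema or OSCAL tool works; the `RuleSet` class, the `validate` function, the layer names, and the two example rules are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Callable

# A rule takes a document (a plain dict in this sketch) and returns findings.
Rule = Callable[[dict], list[str]]


@dataclass(frozen=True)
class RuleSet:
    name: str               # hypothetical layer label, e.g. "core"
    maintainer: str         # the group that governs this layer
    rules: tuple[Rule, ...]


def validate(doc: dict, layers: list[RuleSet]) -> dict[str, list[str]]:
    """Run only the selected layers and report findings per layer."""
    return {
        layer.name: [msg for rule in layer.rules for msg in rule(doc)]
        for layer in layers
    }


# Two hypothetical rules, one per layer.
def has_title(doc: dict) -> list[str]:
    return [] if doc.get("title") else ["missing title"]


def has_baseline(doc: dict) -> list[str]:
    return [] if doc.get("baseline") else ["missing baseline"]


core = RuleSet("core", "model maintainers", (has_title,))
agency = RuleSet("agency-x", "Agency X", (has_baseline,))

doc = {"title": "Example catalog"}
core_only = validate(doc, [core])          # caller opts into the core layer only
layered = validate(doc, [core, agency])    # caller adds the agency layer on top
```

The point of the sketch is that each layer has its own maintainer and the caller chooses which layers apply, so no one has to edit the core layer to add or drop an organization-specific rule.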

RS-Credentive commented 2 days ago

👍 I am strongly in favor of both layering the constraint sets to make them more usable and useful, and of reducing the number of constraints in core modules (which can amount to blocking conditions), i.e. considered applicable to all OSCAL including simple (unqualified) OSCAL data.

Is the suggestion effectively that the core OSCAL schemas will essentially become structural (e.g. these are the basic set of flags, fields and assemblies and their relationships), and that constraints will always be expressed as external "on top of" elements? Essentially, "no more inline constraints"?

If constraints are externalized, how can a specification mandate the use of a subset of constraints, or is the suggestion that there shouldn't be mandatory constraints in OSCAL at all?

RS-Credentive commented 2 days ago

It's also important to identify how we can use namespaces effectively for management of constraints on elements that don't include a namespace flag. Constraints on Props are easy because they have a namespace. What about elements w/o a namespace? This may only be a theoretical problem - I admit I can't think of a specific area where there would definitely be an issue.
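
As a rough illustration of why props are the easy case, the Python sketch below scopes an allowed-values check by the prop's `ns` value, so a custom rule only fires for props in its own namespace. The namespace URI, the `impact-level` prop name, its allowed values, and the `check_props` function are all assumptions made up for this example, not part of OSCAL or any tool.

```python
# Hypothetical namespace URI and allowed values, for illustration only.
AGENCY_NS = "https://example.org/ns/agency-x"
ALLOWED = {AGENCY_NS: {"impact-level": {"low", "moderate", "high"}}}


def check_props(props: list[dict]) -> list[str]:
    """Apply allowed-values rules only to props whose ns has rules defined."""
    findings = []
    for prop in props:
        # Props without an ns, or with an unknown ns, match no custom rules.
        rules = ALLOWED.get(prop.get("ns"), {})
        allowed = rules.get(prop["name"])
        if allowed is not None and prop["value"] not in allowed:
            findings.append(
                f"{prop['name']}={prop['value']} not allowed in {prop['ns']}"
            )
    return findings


props = [
    {"name": "impact-level", "ns": AGENCY_NS, "value": "extreme"},
    {"name": "impact-level", "value": "extreme"},  # no ns: rule does not apply
]
findings = check_props(props)
```

An element with no namespace flag offers no such key to select on, which is the open question raised above.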

wendellpiez commented 2 days ago

I don't see any problem with inline constraints as such, as long as their scope is intended to be across any and all OSCAL conformant to the model in question.

It is merely the need to align the layers with both governance and scope of application that I am trying to point out - something clearly not lost on all-yall. :-)

wendellpiez commented 2 days ago

FWIW, a detail.

Currently a uniqueness constraint such as the uniqueness of an ID value or role/@id (across a value subset, data instance or a data set) is underspecified in OSCAL, mainly because the scope of uniqueness is conceived across a set of documents not just a single document. An implementation must decide how to scope that test for distinctiveness.

Despite that undefined aspect of the spec, this is probably an example of a constraint that needs to be applied across any OSCAL in order to ensure data integrity and fitness. AFAIK it is only actually implemented in oscal-cli (and untested prototype InspectorXSLT) via an is-unique constraint.

(Puzzlingly, current docs say that is-unique applies only to assemblies - whereas flags and even field values might need to be controlled this way. In OSCAL metaschemas, is-unique targets flags.)
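
The scoping decision described above can be sketched in a few lines of Python. This is not the oscal-cli or InspectorXSLT implementation; the `duplicate_ids` function, its `scope` parameter, and the sample IDs are hypothetical, and each document is reduced to a flat list of ID values.

```python
from collections import Counter


def duplicate_ids(documents: list[list[str]], scope: str) -> set[str]:
    """Return the IDs that repeat within the chosen scope.

    scope="document": each document is checked on its own.
    scope="set": IDs must be distinct across all documents together.
    """
    if scope == "document":
        groups = documents
    elif scope == "set":
        groups = [[value for doc in documents for value in doc]]
    else:
        raise ValueError(f"unknown scope: {scope}")

    dupes: set[str] = set()
    for group in groups:
        counts = Counter(group)
        dupes.update(value for value, n in counts.items() if n > 1)
    return dupes


# Two documents that are each internally unique but share one ID.
docs = [["role-a", "role-b"], ["role-b", "role-c"]]
per_document = duplicate_ids(docs, "document")
across_set = duplicate_ids(docs, "set")
```

With the same data, the per-document check reports nothing while the set-wide check flags `role-b`, which is why two implementations can disagree until the scope is specified.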

usnistgov / OSCAL