usnistgov / OSCAL

Open Security Controls Assessment Language (OSCAL)
https://pages.nist.gov/OSCAL/

Define common objects/models in a single place #731

Closed drsm79 closed 3 years ago

drsm79 commented 4 years ago

User Story:

As an OSCAL tool developer, I would like common objects/models to be defined in one schema file.

Goals:

Dependencies:

None that I'm aware of, related to #307

Acceptance Criteria

{The items above are general acceptance criteria for all User Stories. Please describe anything else that must be completed for this issue to be considered resolved.}

butler54 commented 4 years ago

To clarify @drsm79's issue:

wendellpiez commented 4 years ago

This is a big topic and since there is a new Metaschema infrastructure being rolled out, there will be changes to both XML and JSON schemas going forward, so it is also an open-ended one.

In general I would also suggest looking at the metaschemas not just the schemas, since that is our point of leverage for these issues. Going forward there should be opportunities to consider this as a feature request for the Metaschema tech. However there might also be low-hanging fruit there. (ORM straight from a metaschema, why not?)

butler54 commented 4 years ago

I think ORM from the metaschema is a good idea. The question here is probably also how to enable OSCAL to scale quickly across multiple programming languages / paradigms.

The approach initially used was to leverage the openapi-generator to provide the model generation capability off of a collapsed JSON (or YAML) schema. Quality varied based on the generators. However, perhaps the higher-level requirement is metaschema to OpenAPI?

Irrespective, I think the biggest issue in the short to medium term is conflicting definitions across schemas within the family. Resolving these would (IMO) increase flexibility for generation, irrespective of the origin source.

david-waltermire commented 4 years ago

I wrote some code to do metaschema-based code generation in Java. It generates Java code that can read and write JSON, YAML, and XML that corresponds to a given metaschema.

We have plans to write similar code generators for other programming languages (e.g., JavaScript/TypeScript/Node.js, Python, etc.).

@butler54 Where are you seeing "conflicting definitions across schemas within the family"? We would like to fix these. Can you list where you are seeing these conflicts?

wendellpiez commented 4 years ago

Metaschema to openAPI spec, yeah ... at any rate, seeing the delta there (and what enhancements might be called for to produce something really nice) could be interesting and maybe even useful.

butler54 commented 4 years ago

@david-waltermire-nist - I'll get you a list (and my dumb script) shortly. I'll look into the metaschema idea - it would be great if we could go straight from metaschema => python (which is the current focus).

butler54 commented 4 years ago

Please excuse the delay. The results below show where inconsistencies were identified across the various schemas. The methodology: 1) comments were excluded; 2) the comparison was sequential, i.e. the 'reference' definition of an object was taken from its first appearance in the following ordered list:

['oscal_catalog_schema.json', 'oscal_profile_schema.json', 'oscal_ssp_schema.json',
 'oscal_component_schema.json', 'oscal_assessment-plan_schema.json',
 'oscal_assessment-results_schema.json', 'oscal_poam_schema.json']
Inconsistent definitions between sar/local-definitions and poam/local-definitions: 
Inconsistent definitions between component/component and poam/component: 
Inconsistent definitions between ssp/component and component/component: 
Inconsistent definitions between ssp/control-implementation and component/control-implementation: 
Inconsistent definitions between ssp/implemented-requirement and component/implemented-requirement: 
Inconsistent definitions between ssp/statement and component/statement: 
Inconsistent definitions between profile/value and component/value: 
Inconsistent definitions between profile/part and sar/part: 
Inconsistent definitions between profile/all and sar/all: 
Inconsistent definitions between profile/set-parameter and component/set-parameter: 
Inconsistent definitions between catalog/group and profile/group: 
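
The script that produced this list is not shown in the thread. As a rough illustration of the methodology described above (ignore comments, take the first appearance of a name as the reference, flag later definitions that differ), a minimal Python sketch might look like the following. The `definitions` key, the choice of `$comment` as the only "comment" key, and the toy `catalog` and `profile` schemas are all assumptions made for the example, not taken from the actual OSCAL schemas.

```python
import json
from typing import Any

# Keys treated as commentary and ignored when comparing (an assumption).
IGNORED_KEYS = {"$comment"}


def strip_comments(node: Any) -> Any:
    """Return a copy of a schema fragment without commentary keys."""
    if isinstance(node, dict):
        return {k: strip_comments(v) for k, v in node.items() if k not in IGNORED_KEYS}
    if isinstance(node, list):
        return [strip_comments(v) for v in node]
    return node


def find_inconsistencies(schemas: list[tuple[str, dict]]) -> list[tuple[str, str, str]]:
    """Compare same-named definitions across schemas, in the order given.

    The first schema that defines a name supplies the reference definition.
    Returns (reference_model, other_model, definition_name) triples.
    """
    reference: dict[str, tuple[str, str]] = {}
    conflicts = []
    for model, schema in schemas:
        for name, definition in schema.get("definitions", {}).items():
            # Serialise with sorted keys so key order does not count as a difference.
            canonical = json.dumps(strip_comments(definition), sort_keys=True)
            if name not in reference:
                reference[name] = (model, canonical)
            elif reference[name][1] != canonical:
                conflicts.append((reference[name][0], model, name))
    return conflicts


# Toy schemas (hypothetical content) that disagree on what a 'part' requires.
catalog = {"definitions": {"part": {"type": "object", "required": ["name"]}}}
profile = {"definitions": {"part": {"type": "object", "required": ["name", "id"]}}}

for ref_model, model, name in find_inconsistencies([("catalog", catalog), ("profile", profile)]):
    print(f"Inconsistent definitions between {ref_model}/{name} and {model}/{name}")
```

Because the comparison is sequential, each conflict is reported against whichever schema defined the name first, so the result depends on the order of the input list.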

Happy to discuss further.

bradh commented 4 years ago

Also see #307 and #444.

wendellpiez commented 4 years ago

So the real question here is whether this is a feature or a bug. The Metaschema tech is designed to support this. While the models can acquire structures from each other or a shared pool, they can also replace them with other structures with the same name (homonyms).

Another way of asking the question is, why is OSCAL defined in multiple models across its different functional layers, and not just one big model that defines all of them? Lots of reasons, mostly having to do with flexibility, adaptability and agility (I just deleted a bunch of tl;dr). I can't prove a counter-factual, but my guess is that this flexibility and agility have been essential to getting us this far -- and it will remain important for adopters as well.

At the same time I don't think it's an open-and-shut case. One might apply an alpha-conversion to the schemas and data ...

ssp/implemented-requirement -> ssp/ssp-implemented-requirement
ssp//component/implemented-requirement -> ssp//ssp-component/ssp-component-implemented-requirement
poam/component -> poam/poam-component
component/component -> component-component/component-component

(This prepends names with names of parents and ancestors at the root thereby hopefully disambiguating everything.)

... and this would be interesting, potentially giving more insight into the design tradeoffs.
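
To make the schema side of such an experiment concrete, here is a minimal Python sketch of a simplified variant that prefixes each definition only with the model's root name rather than with every ancestor. It assumes a JSON Schema whose shared types live under a top-level `definitions` key and are referenced as `#/definitions/<name>`; the function name `alpha_rename_schema` and the toy `poam` schema are hypothetical.

```python
from typing import Any

REF_PREFIX = "#/definitions/"  # assumed form of local references


def alpha_rename_schema(schema: dict, root: str) -> dict:
    """Return a copy of `schema` with every definition renamed to '<root>-<name>'.

    Local "$ref" strings are rewritten so they still point at the renamed
    definitions. The input schema is not modified.
    """
    mapping = {name: f"{root}-{name}" for name in schema.get("definitions", {})}

    def rewrite(node: Any) -> Any:
        if isinstance(node, dict):
            out = {}
            for key, value in node.items():
                if key == "$ref" and isinstance(value, str) and value.startswith(REF_PREFIX):
                    target = value[len(REF_PREFIX):]
                    out[key] = REF_PREFIX + mapping.get(target, target)
                else:
                    out[key] = rewrite(value)
            return out
        if isinstance(node, list):
            return [rewrite(item) for item in node]
        return node

    result = rewrite(schema)
    if "definitions" in result:
        result["definitions"] = {mapping[n]: d for n, d in result["definitions"].items()}
    return result


# Toy schema (hypothetical content) with one internal reference.
poam = {
    "definitions": {
        "component": {"type": "object"},
        "local-definitions": {
            "type": "object",
            "properties": {"component": {"$ref": "#/definitions/component"}},
        },
    }
}

renamed = alpha_rename_schema(poam, "poam")
print(sorted(renamed["definitions"]))
```

After such a rename, definitions from different models can be merged into a single pool without homonyms colliding, which is one way to measure how much real duplication remains.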

So perhaps some experiments are in order. I'd also be curious as to how the perceptions of this issue break across users of the XML vs JSON models and data sets.

Additionally, it is quite possible that some of these cases might be consolidated, even without introducing a rule prohibiting homonymy across models. Maybe we need an Issue to examine the models with this in mind.

wendellpiez commented 4 years ago

Now thinking for the renaming, we wouldn't actually need to identify the parent, only the (nominal) root of the model ... still.

butler54 commented 4 years ago

@wendellpiez - happy to participate in some experiments on this issue. On the Python front, we've started working on OSCAL object models and workflows here: https://github.com/IBM/compliance-trestle - If we can define an experiment, I can see what an implementation would look like.

wendellpiez commented 4 years ago

@butler54 awesome, glad to hear this. @david-waltermire-nist will also be in the loop since he has more experience with schema-object mappings than I do.

What would a filter look like in Python that could label or annotate any node in the OSCAL with its nominal root and/or schema ('flavor')? In XSLT (where I live) this would be a near-identity transformation. You would probably want it not just to add a property to nodes, but to rename them, for purposes of disambiguation as the data is marshalled into object space. (So a catalog_part would be something different from a profile_part.) Kind of a poor man's namespace. An analogous alpha-rename could be performed on the metaschema for naming the corresponding types.

The metaschemas all have a mandatory short-name field intended to offer a string useful for such purposes. I guess it would be a bug to try and consolidate two metaschemas with the same short-name. But it would help make the code a little less cryptic than an arbitrary and opaque prefix designed not to clash.
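
As one possible answer to the question of what such a filter could look like in Python, here is a minimal sketch that works on JSON data. It assumes the document has exactly one top-level key (the nominal root, standing in for the short-name) and treats any key whose value is an object or a non-empty array of objects as a node to rename. That heuristic, the function names, and the sample data are all assumptions for illustration.

```python
from typing import Any


def _is_structured(value: Any) -> bool:
    """True for objects and for non-empty arrays made up only of objects."""
    if isinstance(value, dict):
        return True
    return isinstance(value, list) and bool(value) and all(isinstance(v, dict) for v in value)


def _rename(node: Any, prefix: str) -> Any:
    """Recursively copy `node`, prefixing the keys that hold structured values."""
    if isinstance(node, dict):
        return {
            (f"{prefix}_{key}" if _is_structured(value) else key): _rename(value, prefix)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [_rename(item, prefix) for item in node]
    return node


def label_with_root(document: dict) -> dict:
    """Prefix structured keys below the single root key with the root's name."""
    if len(document) != 1:
        raise ValueError("expected exactly one root key, e.g. 'catalog' or 'profile'")
    root, body = next(iter(document.items()))
    return {root: _rename(body, root)}


# Toy data (hypothetical values), shaped loosely like a catalog.
catalog = {
    "catalog": {
        "uuid": "1234",
        "groups": [{"id": "ac", "parts": [{"name": "statement"}]}],
    }
}

print(label_with_root(catalog))
```

Scalar values keep their original keys, so only the nodes that would map to classes in object space pick up the prefix. Note that JSON grouping keys are plural, so this produces names like `catalog_parts` rather than `catalog_part`.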

But here I am really just thinking aloud. I can imagine others imagining entirely different approaches. It's reassuring to look at IBM/compliance-trestle#46 (after writing this) and see you are thinking similarly.

david-waltermire commented 3 years ago

This was addressed by PR #948, which added support for a combined XML and JSON schema, along with model-specific schema views.