butler54 commented 3 years ago

User Story:

As an OSCAL developer, I want a consistent method to expand parameter substitutions in various pieces of prose. When the NIST 800-53 catalog was at 1.0.0-rc2, we would have prose similar to this: "prose": "Access authorizations (i.e., privileges) and {{ ac-2_prm_2 }} for each account;" In that environment, a Mustache library (https://mustache.github.io/) can be used for parameter substitutions relatively easily. This has been replaced with:

"prose": "Require {{ insert: param, ac-2_prm_1 }} for group and role membership;" This overloads a relatively well-established concept and requires custom code from any user parsing OSCAL documents. It also contradicts the documentation here: https://pages.nist.gov/OSCAL/reference/datatypes/

My proposal would be the following:

1) Explicitly limit (within documentation) the object types, key types, and corresponding values that are supported for substitution.
2) Only support textual fields for substitution (e.g. remarks, prose, etc.).
3) Assume that the keys to be passed are globally unique within the set of OSCAL documents once resolved (e.g. at the output of a profile resolution process, if applicable). E.g. if other types are supported (such as props), no scoping is required for substitutions.
4) Ideally, allow substitution via parameter objects only.

Goals:

Ensure that parameter substitution, which is currently used within the NIST catalog, is as simple as possible for end developers, who can then take advantage of established libraries.

Dependencies:

OSCAL 1.0.0 release.

Acceptance Criteria

[ ] All OSCAL website and readme documentation affected by the changes in this issue have been updated. Changes to the OSCAL website can be made in the docs/content directory of your branch.
[ ] A Pull Request (PR) is submitted that fully addresses the goals of this User Story. This issue is referenced in the PR.
[ ] The CI-CD build process runs without any reported errors on the PR. This can be confirmed by reviewing that all checks have passed in the PR.

david-waltermire commented 3 years ago

We changed this syntax prior to the 1.0.0-rc2 release based on usnistgov/metaschema#130.

The documentation in https://pages.nist.gov/OSCAL/reference/datatypes/ does not correctly reflect the current syntax, which needs to be fixed.

We made this change to allow for scoping the type of information to insert, which can be something other than a parameter. We want to keep this capability. We also want to preserve backwards compatibility in future OSCAL releases. Making the change you are suggesting would break backwards compatibility.

Before considering changing the syntax, we should first consider what implementation options there are against the current approach. Are there alternatives to moustache that would support something closer to what we have now?
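One implementation option against the current approach, offered only as a rough sketch: pre-process the prose so that param inserts are rewritten into the older implicit form, then hand the result to an existing Mustache-style library. The helper name `toMustacheStyle` and the character class used for parameter ids are assumptions for illustration, not part of OSCAL.

```javascript
// Hypothetical helper: rewrite OSCAL param insertion points into plain
// Mustache-style tags so an off-the-shelf library could render them.
// The id character class below is an assumption, not taken from the OSCAL spec.
const PARAM_INSERT_TAG = /\{\{\s*insert:\s*param,\s*([0-9A-Za-z_.-]+)\s*\}\}/g;

function toMustacheStyle(prose) {
  // Only "param" inserts are rewritten; any other insert type is left untouched.
  return prose.replace(PARAM_INSERT_TAG, (_match, paramId) => `{{ ${paramId} }}`);
}

const sample = "Require {{ insert: param, ac-2_prm_1 }} for group and role membership;";
console.log(toMustacheStyle(sample));
// Require {{ ac-2_prm_1 }} for group and role membership;
```

One caveat with this route: Mustache treats dots in a tag name as nested lookups, so parameter ids containing a "." would need extra handling before rendering.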

rgauss commented 3 years ago

Not sure if it's useful, but here's a snippet of how we're currently handling this in JavaScript:

replacedProse = props.prose.replace(
  /\{\{ insert: param, ([0-9a-zA-Z_.-]*) \}\}/g,
  getParameterLabel
);

which currently only supports param inserts, but could easily be expanded for other types.
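As a sketch of how that expansion to other types might look (not the actual library code), the insert type can be captured alongside the id and dispatched to a resolver per type. The names `expandInserts`, `exampleResolvers`, and the label data are hypothetical, and the character classes in the pattern are assumptions.

```javascript
// Matches {{ insert: <type>, <id> }}. The character classes are assumptions.
const INSERT_TAG = /\{\{ insert: ([a-z-]+), ([0-9A-Za-z_.-]+) \}\}/g;

// resolvers: Map of insert type -> function(id) returning text or undefined.
function expandInserts(prose, resolvers) {
  return prose.replace(INSERT_TAG, (match, type, id) => {
    const resolve = resolvers.get(type);
    // Unknown insert types are left in place rather than dropped.
    if (typeof resolve !== "function") return match;
    const text = resolve(id);
    // Unresolved ids are also left in place.
    return text === undefined ? match : text;
  });
}

// Hypothetical data: one parameter label, made up for this example.
const exampleLabels = new Map([["ac-2_prm_1", "example label"]]);
const exampleResolvers = new Map([
  ["param", (id) => (exampleLabels.has(id) ? `[${exampleLabels.get(id)}]` : undefined)],
]);

console.log(
  expandInserts(
    "Require {{ insert: param, ac-2_prm_1 }} for group and role membership;",
    exampleResolvers
  )
);
// Require [example label] for group and role membership;
```

Supporting another insert type would then mean adding one more entry to the resolver map rather than another regular expression.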

david-waltermire commented 3 years ago

I also implemented this in FlexMark using a custom set of node parsers.

gregelin commented 3 years ago

During the group meeting, I asked whether the change was conflating data parameter (value) management and templating solutions. Let me restate, hopefully better.

Put simply, extending what can be done with organizational parameters expressed as inline human readable syntax is building upon a weak foundation. Better to revert back to the previous simplified {{ ac-2_prm_2 }} (if matching legacy syntax is that important) and find a different format for expressing organizational (or other) defined values that is more friendly to machine-readability.

While OSCAL probably needs to support backward compatibility with NIST 800-53 text, OSCAL does not need to try to build upon and extend a syntactical aspect of the NIST 800-53 document that many find to be a problem in the first place. (I say "probably" because, frankly, I think the point of revisions is to correct problems. And 800-53's use of inline expression of parameters for human readability has proven itself more problem than benefit. 800-53 isn't legislation, and it's certainly not the US Constitution.)

My context for this is as follows:

Data Model. We are concerned with specifying a field's type and additionally enumerating the valid values for that field/attribute.

Code (Application Logic). We are concerned with a variable's type and additionally with enumerating its valid values. Part of "valid" may also include formatting. We are concerned because reuse of the value in later code may depend on the variable being a certain type (or being valid in some other way).

Templating (Presentation). We are concerned primarily with a parameter's format / presentation (and valid value checking should be done prior to template processing). We are concerned primarily with presentation because reuse of the parameter in later parts of the template depends on a predictable presentation.

Our concerns are similar yet also different at each of the three layers of Data, Application Logic, and Presentation.

The original NIST 800-53 control statements could have been written differently. Instead of having inline substitutions to represent semantic meaning, the 800-53 controls could have been written to describe the controls and then append a specific list of value parameters at the end of the control statement.

Instead of AC-2: "e. Requires approvals by [Assignment: organization-defined personnel or roles] for requests to create information system accounts;"

We could have had instead AC-2: "e. Requires approvals by personnel authorized to create information system accounts;"... "Details: Personnel authorized to create information system accounts: [organization defined]"

The latter format is a more machine-friendly and extensible approach than the former. The latter format immediately implies a name for the parameter, which is probably more concise and reusable than the former's generic "organization-defined personnel or roles". The latter is more extensible than the former because what is valid grammar in the latter is more flexible than what would be valid grammar in the former.

FWIW, the double curly-brace {{ param }} syntax in templates is usually a shortcut for the template language's formal format {% echo param %} (or {% print param %}). The templating language supports a range of directives, such as {% if a = b %}, {% for i in range %}, {% set x = y %} and so on. The shortcut of {{ param }} exists to reduce keystrokes that don't add value. Similarly, a templating language would probably make the format {% insert ac-2_prm_1 %}, which would collapse into the original {{ param }} anyway.

Also, the format of {{ action: selector param }} has a number of potential problems. The action and selector become confused with each other as they get extended to larger vocabularies. As selectors become more sophisticated, there is a desire for multiple params. One is very quickly managing issues related to writing code instead of doing templating.

Summarizing:

* I support the reversion. (Indeed, I think VERY few have implemented the new format. It would be great to just change it now if possible.)
* The idea of having a more expressive way of including "parameters" for machine management makes sense, and should be done in an approach that is not built upon, and is unencumbered by, the historical formatting of text.

david-waltermire commented 3 years ago

@gregelin I appreciate your perspectives on this. I am attempting to clarify a few things below.

> During the group meeting, I asked whether the change was conflating data parameter (value) management and templating solutions. Let me restate, hopefully better.

What we have by way of the {{ insert: param, ac-1_prm_1 }} is a templating solution. It is not about parameter value management at all. Parameter value management is handled by the param and set-param syntax in OSCAL.

> Put simply, extending what can be done with organizational parameters expressed as inline human readable syntax is building upon a weak foundation. Better to revert back to the previous simplified {{ ac-2_prm_2 }} (if matching legacy syntax is that important) and find a different format for expressing organizational (or other) defined values that is more friendly to machine-readability.

The current syntax has been in place since OSCAL 1.0.0 RC2 (see change log). Both this and the prior syntax provide a means of providing a parameter reference within control markup text. This is about capturing the semantics of what is described in the control narrative. From this perspective, the current syntax {{ insert: param, ac-1_prm_1 }} and the prior syntax {{ ac-1_prm_1 }} are functionally equivalent. The first is explicit in defining that this is an insertion point for a parameter. The second implicitly means the same. By your argument, which I don't agree with, both represent a weak foundation.

> While OSCAL probably needs to support backward compatibility with NIST 800-53 text, OSCAL does not need to try to build upon and extend a syntactical aspect of the NIST 800-53 document that many find to be a problem in the first place. (I say "probably" because, frankly, I think the point of revisions is to correct problems. And 800-53's use of inline expression of parameters for human readability has proven itself more problem than benefit. 800-53 isn't legislation, and it's certainly not the US Constitution.)

This discussion is not about supporting backwards compatibility with NIST 800-53 text. It's about supporting backwards compatibility with an OSCAL feature included in OSCAL 1.0.0.

I'd rather not rely on generalizations like "many". What we have here is a classic difference of opinion between multiple OSCAL stakeholder groups. Some like the old, implicit, pre-OSCAL 1.0.0 RC2 syntax, while others prefer the more explicit version.

If functionally all things are the same, this debate is easy to resolve by keeping the current syntax. This has the advantage of preserving backwards compatibility with content based on OSCAL 1.0.0. There are a few implementations that already deal with this (e.g., @rgauss, @david-waltermire-nist).

> My context for this is as follows:
>
> Data Model. We are concerned with specifying a field's type and additionally enumerating the valid values for that field/attribute.

The data model for a parameter is orthogonal to this discussion. Types, valid values, etc. are handled by the actual parameter definition, not these insertion points. However, this is not germane to this debate since this applies to both syntax variants.

> Code (Application Logic). We are concerned with a variable's type and additionally with enumerating its valid values. Part of "valid" may also include formatting. We are concerned because reuse of the value in later code may depend on the variable being a certain type (or being valid in some other way).

Again, the actual parameter definition constrains what a parameter's value is, not these insertion points. However, this is not germane to this debate since this applies to both syntax variants.

> Templating (Presentation). We are concerned primarily with a parameter's format / presentation (and valid value checking should be done prior to template processing).

Agreed.

> We are concerned primarily with presentation because reuse of the parameter in later parts of the template depends on a predictable presentation.

Parameter presentation is also contextual. If you are rendering a control catalog PDF, then you need to insert the parameter label. If you are rendering an assessment report, then you might need to insert the parameter value(s). However, this is not germane to this debate since this applies to both syntax variants.
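To make the contextual point concrete, here is a minimal sketch in which the same insertion point renders as a label in one context and as a value in another. The function `renderProse`, the `mode` argument, and the simplified parameter shape (`label`, `values`) are assumptions for illustration, and the sample data is made up.

```javascript
// The id character class is an assumption, not taken from the OSCAL spec.
const PARAM_POINT = /\{\{ insert: param, ([0-9A-Za-z_.-]+) \}\}/g;

// params: Map of id -> { label, values } (a simplified, hypothetical shape).
// mode: "label" for catalog-style output, "value" for report-style output.
function renderProse(prose, params, mode) {
  return prose.replace(PARAM_POINT, (match, id) => {
    const param = params.get(id);
    if (!param) return match; // unknown parameter: leave the insertion point as-is
    if (mode === "value" && Array.isArray(param.values) && param.values.length > 0) {
      return param.values.join(", ");
    }
    // Fall back to the bracketed label when no value is set or labels are requested.
    return `[${param.label}]`;
  });
}

// Made-up sample data.
const sampleParams = new Map([
  ["ac-2_prm_1", { label: "example label", values: ["example value"] }],
]);
const sampleProse = "Require {{ insert: param, ac-2_prm_1 }} for group and role membership;";

console.log(renderProse(sampleProse, sampleParams, "label"));
// Require [example label] for group and role membership;
console.log(renderProse(sampleProse, sampleParams, "value"));
// Require example value for group and role membership;
```

The same function body would work for the implicit {{ ac-1_prm_1 }} form with a different pattern, which is consistent with the point that this concern applies to both syntax variants.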

> Our concerns are similar yet also different at each of the three layers of Data, Application Logic, and Presentation.
>
> The original NIST 800-53 control statements could have been written differently. Instead of having inline substitutions to represent semantic meaning, the 800-53 controls could have been written to describe the controls and then append a specific list of value parameters at the end of the control statement.

It is not OSCAL's role to dictate to catalog creators how to write their catalogs. OSCAL's role is to provide them a means to represent the semantics of how the controls are written in a more machine-friendly form. To "append a specific list of value parameters at the end of the control statement" would change the narrative form of the control, which then means OSCAL cannot be used to render a PDF (or other human-readable form) that resembles the original prose. This has always been a core use case for the OSCAL catalog model.

> Instead of AC-2: "e. Requires approvals by [Assignment: organization-defined personnel or roles] for requests to create information system accounts;"
>
> We could have had instead AC-2: "e. Requires approvals by personnel authorized to create information system accounts;"... "Details: Personnel authorized to create information system accounts: [organization defined]"
>
> The latter format is a more machine-friendly and extensible approach than the former. The latter format immediately implies a name for the parameter, which is probably more concise and reusable than the former's generic "organization-defined personnel or roles". The latter is more extensible than the former because what is valid grammar in the latter is more flexible than what would be valid grammar in the former.

See above. This is a decision to be made by the catalog author, not OSCAL. In either case the "[organization defined]" is a reference to a parameter, for which either syntax works.

> FWIW, the double curly-brace {{ param }} syntax in templates is usually a shortcut for the template language's formal format {% echo param %} (or {% print param %}). The templating language supports a range of directives, such as {% if a = b %}, {% for i in range %}, {% set x = y %} and so on. The shortcut of {{ param }} exists to reduce keystrokes that don't add value. Similarly, a templating language would probably make the format {% insert ac-2_prm_1 %}, which would collapse into the original {{ param }} anyway.

One big problem here is we don't have a full templating language for use in OSCAL markup. FWIW, I don't think it is in OSCAL's best interest to invent one (or adopt one) at this time. Instead, we should keep it simple for now.

> Also, the format of {{ action: selector param }} has a number of potential problems. The action and selector become confused with each other as they get extended to larger vocabularies. As selectors become more sophisticated, there is a desire for multiple params.

I do agree here. I don't see a change like adding new actions happening in OSCAL 1.x. We can push this to OSCAL 2.x, for which we can have a deeper conversation about templating and have some history along with a rich set of requirements behind us to make different choices.

> One is very quickly managing issues related to writing code instead of doing templating.

Yes. I'd like to avoid this too. That is why a selector is limited to a single context, like "param" or "control" for now. This gives us some room to grow, while keeping complexity at a low level.

> Summarizing:
>
> * I support the reversion. (Indeed, I think VERY few have implemented the new format. It would be great to just change it now if possible.)

I know of two or three implementations that are based on the OSCAL 1.0.0 approach. FWIW, it's early and I don't think either way has been implemented much. I'd rather keep with our commitment to preserve backwards compatibility from OSCAL 1.0.0 on through all minor releases. With all other things being the same, this is the dividing line for me.

> * The idea of having a more expressive way of including "parameters" for machine management makes sense, and should be done in an approach that is not built upon, and is unencumbered by, the historical formatting of text.

The overarching philosophy of OSCAL is to meet adopters where they are as much as possible. This means we do need to support "historical formatting of text". As I mentioned before, OSCAL can't change the text in control catalogs. Only catalog authors can do this.

butler54 commented 3 years ago

@david-waltermire-nist - understand where you are with this and we have a solution which is now acceptable for us in trestle.

Looking at your comments above, two observations:

1) There are use cases where it is beneficial to have a small scope of templating in OSCAL. Having thought this through a few times, I think we would want to limit this to where there are explicit needs (such as the items we have today).
2) When considering building workflows that are anchored in OSCAL, I can see that template engines (in general) may be used in conjunction with those documents. Given this, and reversing my opinion, it might make sense to differentiate the templating mechanisms used within OSCAL from other template engines' defaults (such as having {% oscal %} tags or similar). In doing this, I'm not suggesting an exhaustive search; rather, a rough top 10 could be used to guide where collisions may occur.
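To illustrate where such collisions could arise, the following sketch scans prose for tags that use the common {{ ... }} and {% ... %} delimiters (the defaults in engines such as Mustache and Jinja2) but are not OSCAL insertion points. The function name `findForeignTags` and both patterns are assumptions made for this example.

```javascript
// Any tag using the common double-brace or brace-percent delimiters.
const ANY_TEMPLATE_TAG = /\{\{.*?\}\}|\{%.*?%\}/g;
// The OSCAL insert form; the character classes are assumptions.
const OSCAL_INSERT_FORM = /^\{\{ insert: [a-z-]+, [0-9A-Za-z_.-]+ \}\}$/;

// Returns the tags in the prose that another template engine might claim.
function findForeignTags(prose) {
  const tags = prose.match(ANY_TEMPLATE_TAG) || [];
  return tags.filter((tag) => !OSCAL_INSERT_FORM.test(tag));
}

console.log(
  findForeignTags("Require {{ insert: param, ac-2_prm_1 }} and {{ user.name }} {% if x %}")
);
// [ '{{ user.name }}', '{% if x %}' ]
```

A check like this only reports the overlap; an outer engine that shares the double-brace delimiter would still try to interpret the OSCAL inserts themselves, which is the collision being described.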

aj-stein-nist commented 1 year ago

There was meaningful discussion on this approximately two years ago, but not much follow-up. Because this model element is specific but systemically used, the risk/reward of this backwards-incompatible change does not appear to be warranted in the 1.x family of releases. As no one has advocated for this since, and implementations have likely stabilized, I will close this as not planned for now. It can be reopened or revisited in a new issue when future major versions of OSCAL (2.x, 3.x, 4.x) are considered, where backwards compatibility is not required, if stakeholders feel the community needs to revisit this decision.

usnistgov / OSCAL

Roll back parameter substitution style in NIST 800-53 catalog to be similar to standard 'Moustache' approach #971
