Spiral: Change Request creation for a provenance assembly in mapping.

Compton-US commented 1 year ago

Problem Statement

Belongs to https://github.com/usnistgov/OSCAL-DEFINE/issues/18

Prepare a feature request to add a required provenance assembly to document contextual information and responsibility for the mapping.

aj-stein-nist commented 1 year ago

@Compton-NIST do you need someone to work on this spiral? Does this spiral need to be done directly with or alongside #27? I am ready to volunteer.

Compton-US commented 1 year ago

@aj-stein-nist I don't think they need to be worked together for a successful outcome. Based on today's conversation, though, I think it may all end up in one change request for OSCAL rather than multiple. That keeps it simpler for us once we go to development.

aj-stein-nist commented 1 year ago

Would you like me to pick this up? When and how?

Compton-US commented 1 year ago

> Would you like me to pick this up? When and how?

Let's talk Monday. I want to make a pass through the facilitator guide and update first.

Compton-US commented 1 year ago

[Image: Screenshot 2023-08-04 at 3 42 09 PM]

Compton-US commented 1 year ago

[Image: Screenshot 2023-08-04 at 3 43 46 PM]

Compton-US commented 1 year ago

Very, very rough draft. What is needed at the moment is input on allowed values and terms used. Also input on cardinality.

Compton-US commented 1 year ago

Next week I will work on some examples using this assembly.

Compton-US commented 1 year ago

https://github.com/usnistgov/OSCAL/tree/compton-working-provenance

iMichaela commented 1 year ago

IMPORTANT: We cannot use matching in this context, since it conflicts with Profile > import > include-controls > matching.

NISTIR 8278 r1 calls this the relationship rationale, where rationale = a set of reasons or a logical basis for a course of action or a particular belief. We labeled the information type matching, alluding to the matching type or approach (logical, lexical, semantical, etc.). If source-control < equivalent-to > target-control holds under a functional matching approach rather than a semantic matching approach, are we correct in considering the functional or semantic approach the rationale for the equivalent-to relationship? If the answer is "yes", then maybe we should align this work with NIST's previous work and call the property rationale. If the answer is "no", we could call it approach or matching-type.

Should we enforce, through the matching-type property at the document/metadata level, that all relationships within one document are of the same type, or do we allow the property to be overridden at the relationship level?
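To make the two options concrete, here is a minimal Python sketch of how a tool might resolve the matching type for one relationship. Everything in it is hypothetical: the class names, the `matching_type` fields, the `allow_override` flag, and the control identifiers are illustration only and are not part of the draft model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Relationship:
    source: str
    target: str
    kind: str  # e.g. "equivalent-to"
    # Hypothetical per-relationship override; None means "inherit".
    matching_type: Optional[str] = None


@dataclass
class MappingDocument:
    # Hypothetical document/metadata-level default.
    matching_type: str
    relationships: List[Relationship] = field(default_factory=list)


def effective_matching_type(doc: MappingDocument,
                            rel: Relationship,
                            allow_override: bool = True) -> str:
    """Return the matching type that applies to one relationship.

    allow_override=True models "the relationship may override the default".
    allow_override=False models "one type is enforced for the whole document".
    """
    if rel.matching_type is None or rel.matching_type == doc.matching_type:
        return doc.matching_type
    if not allow_override:
        raise ValueError(
            f"{rel.source} -> {rel.target}: matching type "
            f"{rel.matching_type!r} differs from document-level "
            f"{doc.matching_type!r}"
        )
    return rel.matching_type


doc = MappingDocument(matching_type="semantic")
inherited = Relationship("src-1", "tgt-1", "equivalent-to")
overridden = Relationship("src-2", "tgt-2", "equivalent-to",
                          matching_type="functional")
print(effective_matching_type(doc, inherited))
print(effective_matching_type(doc, overridden))
```

With enforcement (`allow_override=False`), the second relationship would be rejected instead of resolved, which is the trade-off the question above is asking about.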

iMichaela commented 1 year ago

Current set theory relationships defined in the experimental model are not suitable for all matching approaches listed: lexical, logical, semantical, syntactical, functional.

The NISTIR 8278 r1 supports set theory based relationships only for syntactic, semantic, and functional approaches. The document reads:

The basic reason why a Reference Document Element and a Focal Document Element are related is attributed to one of three rationales:

Syntactic – Compares the linguistic meaning of the two elements. For example, the following statements have the same syntax:

printf (“bar”); [... C programming language] 
printf (“bar”); [... C programming language]

Semantic – Compares the contextual meaning of the two elements. For example, the following statements convey the same semantic meaning:

“The organization employs a firewall at the network perimeter.”
“The enterprise uses a device that has a network protection application installed to safeguard the network from intentional or unintentional intrusion.”

Functional – Compares the functions of the two elements. For example, the following statements have the same functional result:

printf (“foo\n”); [... C programming language] 
print “foo” [... BASIC programming language]

Lexical analysis is the process of breaking down a large text into smaller parts, such as words, phrases or symbols, while syntax analysis is the process of understanding how these parts fit together to form meaningful sentences. Lexical decomposition into fine-grain parts of the mapped controls and analysis of the correlation of the source control parts to the target control parts appears to be the most suitable for automation. Set theory can be applied to those fine-grain parts of the source control and the target control in determining the relationship.
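As a rough illustration of applying set theory to lexically decomposed controls, the Python sketch below tokenizes two statements and compares the resulting sets. The tokenizer is deliberately naive (lowercased alphanumeric words, no stop-word removal or stemming), and the relationship labels are assumptions chosen for the example rather than the draft model's definitive value list.

```python
import re


def lexical_parts(text: str) -> set:
    """Naive lexical decomposition: lowercase alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def set_relationship(source: str, target: str) -> str:
    """Classify source vs. target by comparing their token sets.

    The returned labels are illustrative assumptions.
    """
    s, t = lexical_parts(source), lexical_parts(target)
    if s == t:
        return "equal-to"
    if s < t:  # proper subset
        return "subset-of"
    if s > t:  # proper superset
        return "superset-of"
    if s & t:  # some shared parts, neither contains the other
        return "intersects-with"
    return "no-relationship"


print(set_relationship("Employ a firewall",
                       "Employ a firewall at the network perimeter"))
```

A real implementation would need a far better decomposition than word sets, but the set comparison step itself would stay this simple.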

Lexical analysis is used in natural language processing (NLP) to break down natural language text into individual words and phrases that can be more easily processed by NLP algorithms.

Syntax analysis is used in natural language processing to analyze and understand the structure of sentences in a language. It helps identify the parts of speech (noun, verb, pronoun, etc.), determines the relationships between the words, and constructs a parse tree that represents the hierarchical structure of the sentence.

Lexical analysis is the first step in natural language processing. It is the process of breaking down a large text into smaller parts, such as words, phrases, or symbols, and assigning them meaning. The next step is syntax analysis, which is the process of understanding how words fit together to form meaningful sentences. This is done by using grammar rules, which define the structure of a sentence. For example, in English, grammar rules would determine whether a sentence should have a subject, verb, and object, or whether it should be in the active or passive voice. A nice summary is available here. More comprehensive material on logic analysis is here. The lecture presents Logic Theory (Aristotelian, Hegel's dialectic, mathematical, logical connectives) and demonstrates that in NLP, logical analysis is essentially a semantic analysis, and that syntactical analysis is necessary for semantic analysis.

Logical relationships in a language are briefly described here.

To sum up:

lexical analysis breaks down a large text into smaller parts, such as words, phrases, or symbols;
syntax analysis is the process of understanding the grammar (what the words are: nouns, verbs, etc.; the voice used: passive, active, imperative, etc.), and it supports the semantic analysis;
semantic analysis helps to determine the meaning of a sentence or phrase;
functional analysis identifies the outcome, that is, the result of the statement;
logical analysis is used as a tool for reasoning, combining ideas, and deducing new rules.

So, in the NISTIR 8278 r1, the functional example also shows two statements that have the same syntax, namely function(argument):

printf (“foo\n”)
print “foo”

Functionally they produce the same outcome: foo gets printed.
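The idea of "same functional result" can be sketched in Python by capturing what two differently written statements print and comparing the output. The two functions below are stand-ins for the C and BASIC statements (they are not translations produced by any tool), and the helper names are made up for this example.

```python
import io
import sys
from contextlib import redirect_stdout


def captured_output(fn) -> str:
    """Run fn and return everything it wrote to standard output."""
    buf = io.StringIO()
    with redirect_stdout(buf):
        fn()
    return buf.getvalue()


def c_style():
    # Stand-in for: printf("foo\n");
    sys.stdout.write("foo\n")


def basic_style():
    # Stand-in for: print "foo"
    print("foo")


def functionally_equivalent(f, g) -> bool:
    """Two statements match functionally if their observable output is equal."""
    return captured_output(f) == captured_output(g)


print(functionally_equivalent(c_style, basic_style))
```

The comparison looks only at the outcome, so it says nothing about whether the two statements share syntax or wording.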

TO CONCLUDE THE RESEARCH: 1) If we keep only the set theory based relationships currently defined in the draft OSCAL Mapping model, then the logical and syntactical matching approaches should be eliminated from the list. We can allow them to be defined outside core OSCAL in association with proper relationships. 2) If we keep the logical and syntactical matching approaches, we need to define proper logical and syntactical relationships and provide clear examples highlighting the differences between them. I personally do not see any strong reason for supporting logical and syntactical (grammar analysis) matching approaches.

Compton-US commented 1 year ago

@iMichaela I placed this feedback into a spiral document.

Feel free to adjust directly in this branch if you want to add or change anything.

iMichaela commented 1 year ago

@Compton-NIST - I wanted us to discuss it first and see if it makes sense to you too. I used those comments to document my thoughts in one place to seed the discussion and share the references. Here is a document for Logical relations amongst sentences. Will use the spiral document and include the latest link there too.

usnistgov / OSCAL-DEFINE

Spiral: Change Request creation for a provenance assembly in mapping. #28
