Closed jason-callaway closed 3 years ago
After comparing the results from the parser, it appears we are missing the following controls (35) in the graph database:
Going to look into the ETL parser as to why this is occurring.
You know, those controls looked weird to me. So I did a quick search for them.
MATCH (c) WHERE (c:control OR c:family) AND c.name IN ["AP-1", "AP-2", "AR-1", "AR-2", "AR-3", "AR-4", "AR-5", "AR-6", "AR-7", "AR-8", "DI-1", "DI-1(1)", "DI-1(2)", "DI-2", "DI-2(1)", "DM-1", "DM-2", "DM-2(1)", "DM-3", "DM-3(1)", "IP-1", "IP-1(1)", "IP-2", "IP-3", "IP-4", "IP-4(1)", "SE-1", "SE-2", "TR-1", "TR-1(1)", "TR-2", "TR-2(1)", "TR-3", "UL-1", "UL-2"] RETURN c.name
0 results. So now we know why they're not in the graph. Great!
New question: what the heck are they? Maybe deprecated controls? Those should still be in 800-53, though.
Ooooh. We're missing the 800-53 appendices.
https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-53r4.pdf
So how should we graph these? As a family under 800-53? Or its own regime? I feel like the former will be easier to use.
Yeah, I agree, a family under 800-53 makes the most sense.
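For a sense of which families that would add, here is a minimal Python sketch that groups the 35 names from the earlier query by their two-letter prefix. The regex and the helper name `group_by_family` are assumptions for illustration, not part of the existing ETL parser.

```python
import re
from collections import defaultdict

# Control names copied from the search query earlier in this thread.
MISSING = [
    "AP-1", "AP-2", "AR-1", "AR-2", "AR-3", "AR-4", "AR-5", "AR-6", "AR-7",
    "AR-8", "DI-1", "DI-1(1)", "DI-1(2)", "DI-2", "DI-2(1)", "DM-1", "DM-2",
    "DM-2(1)", "DM-3", "DM-3(1)", "IP-1", "IP-1(1)", "IP-2", "IP-3", "IP-4",
    "IP-4(1)", "SE-1", "SE-2", "TR-1", "TR-1(1)", "TR-2", "TR-2(1)", "TR-3",
    "UL-1", "UL-2",
]

# Assumed naming scheme: FAMILY-NUMBER with an optional (ENHANCEMENT) suffix.
NAME_RE = re.compile(r"^([A-Z]{2})-(\d+)(?:\((\d+)\))?$")


def group_by_family(names):
    """Group control names by their two-letter family prefix."""
    groups = defaultdict(list)
    for name in names:
        match = NAME_RE.match(name)
        if match is None:
            raise ValueError(f"unexpected control name: {name!r}")
        groups[match.group(1)].append(name)
    return dict(groups)


if __name__ == "__main__":
    for family, controls in sorted(group_by_family(MISSING).items()):
        print(family, len(controls), controls)
```

Each prefix would become one family node under the regime, with the enhancements such as DI-1(1) kept under the same family as their base control.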
The odd part is that these controls are listed in the tsv extract for the CNSSI 1253 Privacy Overlay but aren't resident in the graph, as you noted above.
MATCH (r:regime {name: "CNSSI 1253"})-[:HAS*]->(b:baseline {name: "Privacy"}) WITH b MATCH (b)-[:REQUIRES]->(c:control) RETURN b,c.name
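The comparison itself could be sketched roughly like this in Python, assuming the overlay's control names have already been read from the tsv extract and the graph's names have been collected from a query. The function name `missing_controls` and the sample names in the usage block are hypothetical; how the two lists get loaded is left out.

```python
def missing_controls(expected, in_graph):
    """Return names present in `expected` but absent from `in_graph`.

    Order follows `expected`; duplicates are reported once.
    """
    present = set(in_graph)
    seen = set()
    missing = []
    for name in expected:
        if name not in present and name not in seen:
            seen.add(name)
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Illustrative sample data only, not the real extract.
    from_tsv = ["AC-1", "AP-1", "AP-2", "AP-1"]
    from_graph = ["AC-1"]
    print(missing_controls(from_tsv, from_graph))
```

Keeping the result in tsv order makes it easier to eyeball against the overlay document.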
Right. We build 800-53 from the tsv at https://nvd.nist.gov/800-53. So what's missing are the appendices, which aren't in the tsv, but are in the pdf. It looks like Appendix J might be the only one containing new controls.
But this makes me worried that we're missing controls from other 800-53 related docs like 800-53A.
Hm, so I'm thinking about where these should live in the graph. One option would be:
MATCH (r:regime {name: 'NIST 800-53'})
MERGE (r)-[:HAS]->(b:baseline {name: 'Appendix J'}) WITH b
MERGE (b)-[:HAS]->(f:family {name: 'AP', description: 'Authority and Purpose'}) WITH f
MERGE (f)-[:HAS]->(c:control {name: 'AP-1', description: 'Authority to Collect'}) WITH f
MERGE (f)-[:HAS]->(c:control {name: 'AP-2', description: 'Purpose Specification'})
But I could also see putting these families under the 800-53 regime itself. If people expect them to be (regime)->(family)->(control) this would be confusing.
This is one of the handful of grievances we have with NIST: they "present" the data rather than "supply" it to us.
The controls mentioned above are the privacy controls for that respective office. I read somewhere that OSCAL is supposed to include these controls from other appendices, but I remain skeptical, as I was with SCAP.
My general rule of thumb is, if NIST doesn't provide the controls then we're not missing them. It's too much of a hassle to massage the data when the data custodian is responsible for that. But if we do include them, they should go into their own regime.
Actually, now that I think about it more, I think it should look more like this.
MATCH (r:regime {name: 'NIST 800-53'})
MERGE (r)-[:HAS]->(f:family {name: 'AP', description: 'Authority and Purpose'}) WITH f
MERGE (f)-[:HAS]->(c:control {name: 'AP-1', description: 'Authority to Collect'}) WITH f
MERGE (f)-[:HAS]->(c:control {name: 'AP-2', description: 'Purpose Specification'});
MATCH (r:regime {name: 'NIST 800-53'})
MERGE (r)-[:HAS]->(b:baseline {name: 'Appendix J'});
MATCH (r:regime {name: 'NIST 800-53'})-[:HAS]->(b:baseline {name: 'Appendix J'}) WITH b
MATCH (r:regime {name: 'NIST 800-53'})-[*..3]->(c:control {name: 'AP-1'}) WITH b, c
MERGE (b)-[:DEFINES]->(c);
MATCH (r:regime {name: 'NIST 800-53'})-[:HAS]->(b:baseline {name: 'Appendix J'}) WITH b
MATCH (r:regime {name: 'NIST 800-53'})-[*..3]->(c:control {name: 'AP-2'}) WITH b, c
MERGE (b)-[:DEFINES]->(c);
This way we still capture the fact that Appendix J is defining these controls, but they'll live where people expect in the (regime)->(family)->(control) hierarchy.
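If it helps the ETL discussion, here is a rough Python sketch that generates statements in the shape above from a small table, so the other Appendix J families could be added without hand-writing each query. Everything here is an assumption for illustration: the `build_statements` and `q` helpers are hypothetical, only the AP entries from this thread are filled in, and a real pipeline would probably pass values as query parameters instead of building strings.

```python
def q(value):
    """Quote a string as a single-quoted Cypher literal (illustrative helper)."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def build_statements(regime, baseline, families):
    """Build Cypher statements: families and controls, the baseline, then DEFINES edges.

    `families` maps a family name to (description, [(control, description), ...]).
    """
    match_regime = f"MATCH (r:regime {{name: {q(regime)}}})"
    statements = []

    # One statement per family, chaining a MERGE for each of its controls.
    for fam_name, (fam_desc, controls) in families.items():
        parts = [
            match_regime,
            f"MERGE (r)-[:HAS]->(f:family {{name: {q(fam_name)}, description: {q(fam_desc)}}})",
        ]
        for ctl_name, ctl_desc in controls:
            parts.append("WITH f")
            parts.append(
                f"MERGE (f)-[:HAS]->(:control {{name: {q(ctl_name)}, description: {q(ctl_desc)}}})"
            )
        statements.append("\n".join(parts) + ";")

    # The baseline node that records where the controls are defined.
    statements.append(
        f"{match_regime}\nMERGE (r)-[:HAS]->(b:baseline {{name: {q(baseline)}}});"
    )

    # One DEFINES edge per control, from the baseline to the existing control node.
    for _fam_name, (_fam_desc, controls) in families.items():
        for ctl_name, _ctl_desc in controls:
            statements.append(
                f"{match_regime}-[:HAS]->(b:baseline {{name: {q(baseline)}}}) WITH b\n"
                f"MATCH (:regime {{name: {q(regime)}}})-[*..3]->(c:control {{name: {q(ctl_name)}}}) WITH b, c\n"
                "MERGE (b)-[:DEFINES]->(c);"
            )
    return statements


# Only the entries quoted in this thread; the remaining families are left out.
APPENDIX_J = {
    "AP": (
        "Authority and Purpose",
        [("AP-1", "Authority to Collect"), ("AP-2", "Purpose Specification")],
    ),
}

if __name__ == "__main__":
    for statement in build_statements("NIST 800-53", "Appendix J", APPENDIX_J):
        print(statement)
        print()
```

The output is one family statement, one baseline statement, and one DEFINES statement per control, matching the order of the hand-written version.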
@elavenrac want to take a crack at the ETL for this? If you want to pair-program it, we could do that too.
@jason-callaway I can take an initial stab at the ETL pipeline. It would be helpful to chat briefly first, if you can, just to make sure we're on the same page.
Resolved by PR #23