redteam-project / sckg

Security Control Knowledge Graph
GNU General Public License v3.0

CNSSI 1253 Privacy Overlay unittest fails #16

Closed jason-callaway closed 3 years ago

jason-callaway commented 4 years ago
======================================================================
FAIL: test_control_count_cnssi_pii (test_regime_etl.TestConfigYaml)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/jasoncallaway/PycharmProjects/sckg/tests/test_regime_etl.py", line 81, in test_control_count_cnssi_pii
    self.assertEquals(r[0], r[1])
AssertionError: 231 != 129

----------------------------------------------------------------------
elavenrac commented 4 years ago

After comparing the results from the parser, it appears we are missing the following 35 controls in the graph database:

I'm going to look into the ETL parser to see why this is occurring.
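
The comparison boils down to a set difference between the control names the parser extracts and the names the graph returns. A minimal sketch, assuming a hypothetical helper `missing_controls` and toy data rather than the real overlay contents:

```python
def missing_controls(parsed, in_graph):
    """Return control names present in the parser output but absent from the graph."""
    return sorted(set(parsed) - set(in_graph))


# Toy data for illustration only; not the real overlay or graph contents.
parsed = ["AC-1", "AC-2", "AP-1", "AP-2"]
in_graph = ["AC-1", "AC-2"]

for name in missing_controls(parsed, in_graph):
    print(name)
```

In practice `parsed` would come from the overlay extract and `in_graph` from a query against the database.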

jason-callaway commented 4 years ago

You know, those controls looked weird to me. So I did a quick search for them.

MATCH (c) WHERE (c:control OR c:family) AND c.name IN ["AP-1", "AP-2", "AR-1", "AR-2", "AR-3", "AR-4", "AR-5", "AR-6", "AR-7", "AR-8", "DI-1", "DI-1(1)", "DI-1(2)", "DI-2", "DI-2(1)", "DM-1", "DM-2", "DM-2(1)", "DM-3", "DM-3(1)", "IP-1", "IP-1(1)", "IP-2", "IP-3", "IP-4", "IP-4(1)", "SE-1", "SE-2", "TR-1", "TR-1(1)", "TR-2", "TR-2(1)", "TR-3", "UL-1", "UL-2"] RETURN c.name

0 results. So now we know why they're not in the graph. Great!

New question: what the heck are they? Maybe deprecated controls? Those should still be in 800-53, though.
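
One quick way to see what these names have in common is to group them by family prefix. A small sketch using the names from the query above; `family_prefix` is a hypothetical helper, not something in the repo:

```python
from collections import Counter

# Control names copied from the Cypher query above.
MISSING = [
    "AP-1", "AP-2",
    "AR-1", "AR-2", "AR-3", "AR-4", "AR-5", "AR-6", "AR-7", "AR-8",
    "DI-1", "DI-1(1)", "DI-1(2)", "DI-2", "DI-2(1)",
    "DM-1", "DM-2", "DM-2(1)", "DM-3", "DM-3(1)",
    "IP-1", "IP-1(1)", "IP-2", "IP-3", "IP-4", "IP-4(1)",
    "SE-1", "SE-2",
    "TR-1", "TR-1(1)", "TR-2", "TR-2(1)", "TR-3",
    "UL-1", "UL-2",
]


def family_prefix(control_name):
    """Return the family part of a control name, e.g. 'DI' for 'DI-1(2)'."""
    return control_name.split("-", 1)[0]


# Count how many of the missing controls fall under each family prefix.
counts = Counter(family_prefix(name) for name in MISSING)
for family, n in sorted(counts.items()):
    print(family, n)
```

All 35 names fall under eight two-letter prefixes: AP, AR, DI, DM, IP, SE, TR and UL.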

jason-callaway commented 4 years ago

Ooooh. We're missing the 800-53 appendices.

https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-53r4.pdf

Screen Shot 2020-06-29 at 8 08 23 PM

Screen Shot 2020-06-29 at 8 08 28 PM

So how should we graph these? As a family under 800-53? Or its own regime? I feel like the former will be easier to use.

elavenrac commented 4 years ago

Yeah, I agree, a family under 800-53 makes the most sense.

elavenrac commented 4 years ago

The odd part is that these controls are listed in the tsv extract for the CNSSI 1253 Privacy Overlay but aren't present in the graph, as you noted above.

MATCH (r:regime {name: "CNSSI 1253"})-[:HAS*]->(b:baseline {name: "Privacy"}) WITH b MATCH (b)-[:REQUIRES]->(c:control) RETURN b,c.name

jason-callaway commented 4 years ago

Right. We build 800-53 from the tsv at https://nvd.nist.gov/800-53. So what's missing are the appendices, which aren't in the tsv, but are in the pdf. It looks like Appendix J might be the only one containing new controls.

But this makes me worried that we're missing controls from other 800-53 related docs like 800-53A.

jason-callaway commented 4 years ago

Hm, so I'm thinking about where these should live in the graph. One option would be:

MATCH (r:regime {name: 'NIST 800-53'})
MERGE (r)-[:HAS]->(b:baseline {name: 'Appendix J'}) WITH b
MERGE (b)-[:HAS]->(f:family {name: 'AP', description: 'Authority and Purpose'}) WITH f
MERGE (f)-[:HAS]->(c:control {name: 'AP-1', description: 'Authority to Collect'}) WITH f
MERGE (f)-[:HAS]->(c:control {name: 'AP-2', description: 'Purpose Specification'})

Screen Shot 2020-06-29 at 8 44 52 PM

But I could also see putting these families directly under the 800-53 regime itself. If people expect the hierarchy to be (regime)->(family)->(control), the extra baseline level would be confusing.

trevorbryant commented 4 years ago

This is one of a handful of grievances we have with NIST: they tend to "present" the data rather than "supply" it to us.

The controls mentioned above are the privacy controls for that respective office. I read somewhere that OSCAL is supposed to include these controls from other appendices, but I remain skeptical, as I was with SCAP.

My general rule of thumb is: if NIST doesn't provide the controls, then we're not missing them. It's too much of a hassle to massage the data when the data custodian is responsible for that. But if we do include them, they should go into their own regime.

jason-callaway commented 4 years ago

Actually, now that I think about it more, I think it should look more like this.

MATCH (r:regime {name: 'NIST 800-53'})
MERGE (r)-[:HAS]->(f:family {name: 'AP', description: 'Authority and Purpose'}) WITH f
MERGE (f)-[:HAS]->(c:control {name: 'AP-1', description: 'Authority to Collect'}) WITH f
MERGE (f)-[:HAS]->(c:control {name: 'AP-2', description: 'Purpose Specification'});

MATCH (r:regime {name: 'NIST 800-53'})
MERGE (r)-[:HAS]->(b:baseline {name: 'Appendix J'});

MATCH (r:regime {name: 'NIST 800-53'})-[:HAS]->(b:baseline {name: 'Appendix J'}) WITH b
MATCH (r:regime {name: 'NIST 800-53'})-[*..3]->(c:control {name: 'AP-1'}) WITH b, c
MERGE (b)-[:DEFINES]->(c);

MATCH (r:regime {name: 'NIST 800-53'})-[:HAS]->(b:baseline {name: 'Appendix J'}) WITH b
MATCH (r:regime {name: 'NIST 800-53'})-[*..3]->(c:control {name: 'AP-2'}) WITH b, c
MERGE (b)-[:DEFINES]->(c);

Screen Shot 2020-07-01 at 5 59 33 PM

This way we still capture the fact that Appendix J is defining these controls, but they'll live where people expect in the (regime)->(family)->(control) hierarchy.
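
For the ETL, statements like these could be generated from a small table of families and controls rather than written by hand. A rough sketch only: `build_statements` is a hypothetical helper, the `$`-style query parameters are an assumption about how the ETL would run its queries, and the per-control statements are folded into one query each instead of the separate steps shown above. Only AP-1 and AP-2 from the example are filled in.

```python
def build_statements(regime, baseline, families):
    """Return a list of (cypher, params) pairs for the proposed model.

    families: iterable of (family_name, family_description, controls),
    where controls is an iterable of (control_name, control_description).
    """
    statements = [(
        "MATCH (r:regime {name: $regime}) "
        "MERGE (r)-[:HAS]->(b:baseline {name: $baseline})",
        {"regime": regime, "baseline": baseline},
    )]
    for fam_name, fam_desc, controls in families:
        # The family hangs directly off the regime: (regime)->(family).
        statements.append((
            "MATCH (r:regime {name: $regime}) "
            "MERGE (r)-[:HAS]->(f:family {name: $family, description: $family_desc})",
            {"regime": regime, "family": fam_name, "family_desc": fam_desc},
        ))
        for ctl_name, ctl_desc in controls:
            # The control lives under its family, and the baseline DEFINES it.
            statements.append((
                "MATCH (r:regime {name: $regime})-[:HAS]->(f:family {name: $family}) "
                "MATCH (r)-[:HAS]->(b:baseline {name: $baseline}) "
                "MERGE (f)-[:HAS]->(c:control {name: $control, description: $control_desc}) "
                "MERGE (b)-[:DEFINES]->(c)",
                {"regime": regime, "baseline": baseline, "family": fam_name,
                 "control": ctl_name, "control_desc": ctl_desc},
            ))
    return statements


# Sample input limited to the two controls used in the example above.
FAMILIES = [
    ("AP", "Authority and Purpose", [
        ("AP-1", "Authority to Collect"),
        ("AP-2", "Purpose Specification"),
    ]),
]

for cypher, params in build_statements("NIST 800-53", "Appendix J", FAMILIES):
    print(cypher)
    print(params)
```

Each pair would then be handed to whatever driver the ETL already uses. Keeping names in parameters rather than in the query text avoids quoting problems with names like DI-1(1).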

@elavenrac want to take a crack at the ETL for this? If you want to pair-program it, we could do that too.

elavenrac commented 3 years ago

@jason-callaway I can take an initial stab at the ETL pipeline. It would be helpful to chat briefly first, if you can, just to make sure we're on the same page.

jason-callaway commented 3 years ago

Resolved by PR #23