muchdogesec / arango_cti_processor

A small script that creates relationships between common CTI knowledge-bases in STIX 2.1 format.
https://www.dogesec.com/
Apache License 2.0
3 stars 0 forks source link

Aging out of _is_ref objects is not correct #20

Closed himynamesdave closed 1 month ago

himynamesdave commented 3 months ago

TEST 10.0: Test IGNORE_EMBEDDED_RELATIONSHIPS = false

Uses cve-cpe test data (same import as test 6.0)

python3 -m unittest tests/test_10_0_ignore_embedded_relationships_f.py

TEST 10.1: Test update to objects where IGNORE_EMBEDDED_RELATIONSHIPS = false

This time uses 6.1 test import.

python3 -m unittest tests/test_10_1_ignore_embedded_relationships_f.py

TEST 10.2: Test removed objects where IGNORE_EMBEDDED_RELATIONSHIPS = false

This time uses 6.2 test import.

python3 -m unittest tests/test_10_2_ignore_embedded_relationships_f.py

Issue

in test 10, one of the SROs generated by arango_cti_processor is marked as _is_latest = false

Thus, the _is_ref SROs for the relationship should also be aged out

Debug

This is actually a core stix2arango issue

https://github.com/muchdogesec/stix2arango/issues/22

fqrious commented 2 months ago

I think this is already fixed in s2a, I'm just waiting for it to be merged so I can adjust on this side...

himynamesdave commented 2 months ago

I just merged your fixes in s2a to main branch.

https://github.com/muchdogesec/stix2arango/pull/24

himynamesdave commented 1 month ago

blocked by #29

himynamesdave commented 1 month ago

@fqrious i've updated the tests for this, but now 10 is failing to generate any _is_ref relationships

python3 -m unittest tests/test_10_0_ignore_embedded_relationships_f.py
fqrious commented 1 month ago

This is because of #26 where you're using the wrong filter and you said it's not wrong so I had to modify the code such that embedded refs no longer use the same _arango_cti_processor_note as the ref object.. the query should not filter by cve-cpe but "embedded-ref+cve-cpe" instead

As for the first test, it'll still fail because 20 relationships are generated and 20*3 is 60

himynamesdave commented 1 month ago

This is because of https://github.com/muchdogesec/arango_cti_processor/issues/26 where you're using the wrong filter and you said it's not wrong so I had to modify the code such that embedded refs no longer use the same _arango_cti_processor_note as the ref object.. the query should not filter by cve-cpe but "embedded-ref+cve-cpe" instead

Can you walk me through this. I still dont understand what this has to do with #26

https://github.com/muchdogesec/arango_cti_processor/issues/26#issuecomment-2376719092

- [4848]
+ [1212] 

Test gave 4848 whilst file only had 1212 objects (vertex).

But this had nothing to do with embedded refs. This was just a test counting vertex objects.

Can you walk me through exactly what is going wrong here, and what I am misunderstanding?

I had to modify the code such that embedded refs no longer use the same _arango_cti_processor_note as the ref object.. the query should not filter by cve-cpe but "embedded-ref+cve-cpe" instead

can you explain why this was needed? I want to understand why we need it. Maybe the answer to the above will clear that up

As for the first test, it'll still fail because 20 relationships are generated and 20*3 is 60

can you explain this?

The test is specifically filtering is refs, arango cti processed makes 2 cwe-capec SROs, inside these SROs are a total of 3 embedded relationships (each has 1 created_by_ref and 2 object_marking_refs) so expect 6 total

Note, the test ignores any SROs is_ref created earlier by stix2arango (on import)

python3 -m unittest tests/test_10_0_ignore_embedded_relationships_f.py
    def test_01_count_is_ref(self):
        query = """
        RETURN COUNT(
          FOR doc IN mitre_cwe_edge_collection
            FILTER doc._arango_cti_processor_note == "cwe-capec"
            AND doc._is_ref == true
            RETURN doc
        )

I can't see how 60 results would occur

fqrious commented 1 month ago

But this had nothing to do with embedded refs. This was just a test counting vertex objects.

No, this is assertion causing failure in the issue, it's clearly not counting vertex objects... Why would it even be counting vertex objects when ACP doesn't create vertex objects at all?

can you explain why this was needed? I want to understand why we need it. Maybe the answer to the above will clear that up

Because when I explained the issue here and here you refused to update the test

can you explain this?

The test is counting embedded relationships, it wrongfully assumed there are only 6 relationships generated when there are actually 20

himynamesdave commented 1 month ago

But this had nothing to do with embedded refs. This was just a test counting vertex objects.

No, this is assertion causing failure in the issue, it's clearly not counting vertex objects... Why would it even be counting vertex objects when ACP doesn't create vertex objects at all?

OK this is actually my error. The test is wrong. Basically I've misread the query. I am going to fix this.

I am going to review all tests and come back once I've triple checked my logic

himynamesdave commented 1 month ago

OK so I realised my error -- I didnt include _is_ref = false in the tests not concerned with embedded refs. Now added.

https://github.com/muchdogesec/arango_cti_processor/commit/bc5d499d7f40e9f25c4eda727bc1541b34a3c668

This means

I had to modify the code such that embedded refs no longer use the same _arango_cti_processor_note as the ref object.. the query should not filter by cve-cpe but "embedded-ref+cve-cpe" instead

We no longer need to do this, and can go back to using the old _arango_cti_processor_note value (to match to source object) for embedded refs.

BUT...

I am not sure this explain test 10 / 11

The test is counting embedded relationships, it wrongfully assumed there are only 6 relationships generated when there are actually 20

Lets use test 10.0 as an example...

python3 -m unittest tests/test_10_0_ignore_embedded_relationships_f.py

It imports tests/files/cwe-condensed.json

It creates 2 SROs for cwe-capec

                {
                    "source_name": "capec",
                    "url": "https://capec.mitre.org/data/definitions/1.html",
                    "external_id": "CAPEC-1"
                },
                {
                    "source_name": "capec",
                    "url": "https://capec.mitre.org/data/definitions/180.html",
                    "external_id": "CAPEC-180"
                }

So we have 2 SROs for this link that are _is_ref=false (each of these has 3 embedded refs, inside these SROs are a total of 3 embedded relationships (each has 1 created_by_ref and 2 object_marking_refs) so expect 6 total that are _is_ref == true and _arango_cti_processor_note == cwe-capec ).

Where do the other 14 objects come from? (I might be missing something key here, but it's really not clear to me @fqrious ).

fqrious commented 1 month ago

Where do the other 14 objects come from? (I might be missing something key here, but it's really not clear to me @fqrious ).

Don't worry about it, I was using an old version of tests

himynamesdave commented 1 month ago

@fqrious then I think this ticket should be resolved (that is, tests will work) when we revert to making _arango_cti_processor_note for refs inherit the same value as the object it was generated from.

himynamesdave commented 1 month ago

@fqrious now all test but 10.2 are failing

it appears the aging out of is_ref is not working correctly

TEST 10.2: Test removed objects where IGNORE_EMBEDDED_RELATIONSHIPS = false

Removes the added capec in 10.1, so now 2 capecs (and 6 embedded refs)

python3 -m unittest tests/test_10_2_ignore_embedded_relationships_f.py