CERT-Polska / karton

Distributed malware processing framework based on Python, Redis and S3.
https://karton-core.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License
381 stars 45 forks source link

Bug: Matching a negated filter value makes Task.matches_filters fail regardless of other filters #246

Closed ups1decyber closed 5 months ago

ups1decyber commented 6 months ago

Hi,

I think there is a problem with this line in the Task's matches_filters function: https://github.com/CERT-Polska/karton/blob/master/karton/core/task.py#L238

I think the idea behind this line is that filters such as

        filters = [
            {
                "type": "sample",
                "platform": "!macos"
            },
            {
                "type": "sample",
                "platform": "!win*"
            }
        ]

don't make the Task's matches_filters function match for any arbitrary value of platform.

But I think this is a problem. Let's look at these example filters:

        filters = [
            {
                "type": "sample",
                "platform": "win32"
            },
            {
                "type": "different",
                "platform": "!win32"
            }
        ]

For the following three tasks, I would expect that the first one matches, the second one does not, and the third one does. But the first one does not:

task_sample = Task(headers={
    "type": "sample",
    "platform": "win32"
})

task_different_win32 = Task(headers={
    "type": "different",
    "platform": "win32"
})

task_different_win64 = Task(headers={
    "type": "different",
    "platform": "win64"
})

The problem is that the match of platform: win32 causes the referenced line to return False regardless of other filters.

But I think the core problem here is that the current task header matching does not allow for exclusion of multiple values for a specified field using a single filter. If this is addressed, then there would be no need for the return statement in line 238. This could be one of the issues with header matching that could be addressed by #245.

Please consider adding the following test case to tests/test_task_filters.py:

    def test_negated_filter2(self):  # maybe find a better function name
        filters = [
            {
                "type": "sample",
                "platform": "win32"
            },
            {
                "type": "different",
                "platform": "!win32"
            }
        ]

        task_sample = Task(headers={
            "type": "sample",
            "platform": "win32"
        })
        self.assertTrue(task_sample.matches_filters(filters))

        task_different_win32 = Task(headers={
            "type": "different",
            "platform": "win32"
        })
        self.assertFalse(task_different_win32.matches_filters(filters))

        task_different_win64 = Task(headers={
            "type": "different",
            "platform": "win64"
        })
        self.assertTrue(task_different_win64.matches_filters(filters))
psrok1 commented 6 months ago

Hi! Good catch, thanks for that finding: negated filters introduced some inconsistencies in matching logic.

I've tried to rethink the matches_filters logic to match the expected semantics and have come with this PR: https://github.com/CERT-Polska/karton/pull/247 I also included your test cases.