sophos / Sophos-Central-SIEM-Integration

Simple integration script for 3rd party systems such as SIEMs. Offers command line, file or syslog output in CEF, JSON or key-value pair formats.

Thousands of duplicate events #50

Closed: apreheim closed this 3 years ago

apreheim commented 3 years ago

When I run this script it continually pulls events from the last 12 hours which creates thousands of duplicate events in Splunk. It appears that it's not checking the state file since it shows "No datetime found, defaulting to last 12 hours for results" every time I run it.

I'm running Python 3.8, but I've tried Python 3.6 with the same results.

I'm using the instructions provided here to get this data into Splunk: https://splunkbase.splunk.com/app/4647/#/details

apreheim commented 3 years ago

This is the output of siem.py while running:

Config endpoint=/siem/v1/events, filename='/logs/sophos-central-events.txt' and format='json'
No datetime found, defaulting to last 12 hours for results
Fetching the tenants/customers list by calling the Sophos Cental API
fetching access_token from sophos
body :: {'grant_type': 'client_credentials', 'scope': 'token', 'client_id': 'xxxxxx', 'client_secret': 'xxxxxx'}
response :: {'access_token': 'xxxxxx', 'errorCode': 'success', 'expires_in': 3600, 'message': 'OK', 'refresh_token': 'xxxxxx', 'token_type': 'bearer', 'trackingId': 'xxxxxx'}
fetching whoami data
Whoami response: b'{"id":"xxxxxx","idType":"tenant","apiHosts":{"global":"https://api.central.sophos.com","dataRegion":"https://api-us03.central.sophos.com"}}'
URL: https://api-us03.central.sophos.com/siem/v1/events?limit=1000&cursor=xxxxxx
URL: https://api-us03.central.sophos.com/siem/v1/events?limit=1000&cursor=xxxxxx
URL: https://api-us03.central.sophos.com/siem/v1/events?limit=1000&cursor=xxxxxx
URL: https://api-us03.central.sophos.com/siem/v1/events?limit=1000&cursor=xxxxxx
URL: https://api-us03.central.sophos.com/siem/v1/events?limit=1000&cursor=xxxxxx
URL: https://api-us03.central.sophos.com/siem/v1/events?limit=1000&cursor=xxxxxx

Every time I run, I see the message "No datetime found, defaulting to last 12 hours for results" even though the file state/siem_sophos.json exists.

The URLs are exactly the same, and they keep appearing (about twice per second) until I kill siem.py. My log file (sophos-central-events.txt) grew to 12 GB the first time I ran it because I didn't realize it was duplicating events.

ramksophos commented 3 years ago

Thanks @apreheim. We are taking a look at this. Can you please paste the contents of the state JSON file with any sensitive information redacted?

apreheim commented 3 years ago

@rkamat Sure thing:

siem_sophos.json

{
    "account": {
        "xxxxxx": {
            "jwt": "xxxxxx",
            "jwtExpiresAt": 1625756976.969127,
            "whoami": {
                "id": "xxxxxx",
                "idType": "tenant",
                "apiHosts": {
                    "global": "https://api.central.sophos.com",
                    "dataRegion": "https://api-us03.central.sophos.com"
                }
            }
        }
    },
    "tenants": {
        "xxxxxx": {
            "eventsLastFetched": "xxxxxx",
            "dataRegionUrl": "https://api-us03.central.sophos.com",
            "lastRunAt": 1625754298.8566716
        }
    }
}
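
A quick way to confirm that a state file like the one above is readable and actually stores a value per tenant is to load it and inspect the "tenants" section. The following is a minimal sketch, not part of siem.py: the helper names are hypothetical, the default path is taken from the report above, and it only assumes the key layout shown in the pasted JSON.

```python
import json


def load_state(path="state/siem_sophos.json"):
    """Load the state file; the default path is the one mentioned in this thread."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def tenants_with_cursor(state):
    """Return {tenant_id: True/False}, True when a non-empty
    "eventsLastFetched" value is stored for that tenant.
    (Hypothetical helper; assumes the layout pasted above.)"""
    tenants = state.get("tenants", {})
    return {tid: bool(info.get("eventsLastFetched")) for tid, info in tenants.items()}


if __name__ == "__main__":
    # Inline sample mirroring the structure of the pasted file.
    sample = {"tenants": {"tenant-a": {"eventsLastFetched": "abc"}, "tenant-b": {}}}
    print(tenants_with_cursor(sample))
```

If every tenant reports True, the state is being written, which suggests the repetition comes from how the request loop uses it rather than from a missing file.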
apreheim commented 3 years ago

Notes:

apreheim commented 3 years ago

@rkamat Any other info I can grab for you on this issue?

ramksophos commented 3 years ago

Thanks @apreheim. At this point, we are looking at reproducing and debugging the issue. I'll let you know if we need more info from you. Thanks for your patience and cooperation on this!

reubensammut commented 3 years ago

This issue can be fixed by moving the line args = self.get_alerts_or_events_req_args(params) inside the while loop in both make_credentials_request and make_token_request. The params variable is updated with the new cursor on each iteration, but args is built once before the loop and never refreshed. As a result, call_endpoint keeps receiving the old arguments and repeats the same request. If you like, I can create a pull request for this.
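
The stale-arguments problem described in this comment can be reproduced in isolation. The sketch below is not the siem.py code: PAGES, build_args, and this call_endpoint are hypothetical stand-ins for the paginated API and the request-argument helper, and the loop is capped so the buggy variant terminates. It only illustrates why building the arguments once, before the loop, repeats the first page.

```python
# Hypothetical fake API: maps a cursor to (items, next_cursor).
PAGES = {
    None: (["e1", "e2"], "c1"),
    "c1": (["e3", "e4"], "c2"),
    "c2": (["e5"], None),
}


def build_args(params):
    """Stand-in for get_alerts_or_events_req_args: snapshot params as a query string."""
    return "&".join(f"{key}={value}" for key, value in params.items())


def call_endpoint(args):
    """Stand-in for the HTTP call: parse the cursor out of args and return that page."""
    cursor = None
    for part in args.split("&"):
        key, _, value = part.partition("=")
        if key == "cursor":
            cursor = value
    items, next_cursor = PAGES[cursor]
    return {"items": items, "next_cursor": next_cursor, "has_more": next_cursor is not None}


def fetch_buggy(max_requests=5):
    params = {"limit": 1000}
    args = build_args(params)  # built once, outside the loop
    events = []
    for _ in range(max_requests):  # capped; the real loop would not stop
        response = call_endpoint(args)
        events.extend(response["items"])
        if not response["has_more"]:
            break
        params["cursor"] = response["next_cursor"]  # params changes, args does not
    return events


def fetch_fixed(max_requests=5):
    params = {"limit": 1000}
    events = []
    for _ in range(max_requests):
        args = build_args(params)  # rebuilt each iteration, so the new cursor is used
        response = call_endpoint(args)
        events.extend(response["items"])
        if not response["has_more"]:
            break
        params["cursor"] = response["next_cursor"]
    return events


if __name__ == "__main__":
    print(fetch_buggy())
    print(fetch_fixed())
```

In the buggy variant every request asks for the first page again, which matches the identical URLs repeating in the output above. The fixed variant advances through the pages and stops when no more are reported.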