akamai / uls

Unified Log Streamer (ULS)
Apache License 2.0

[BUG] ULS stops collecting #51

Closed lwhitworth closed 5 months ago

lwhitworth commented 10 months ago

Describe the bug
We're seeing the ULS collector (running in a Docker container) stop processing events. Restarting the container makes it start again. When it stops, it logs the following:

2023-11-02 15:00:56,332 ULS E UlsInputCli - CLI process [20] was found stale - Reason:  "2023-11-02 15:00:51,170 cli-etp MainThread E API call failed with HTTP/400: b'{
  "type": "https://problems.luna.akamaiapis.net/-/pep-authn/request-error",
  "title": "Bad request",
  "status": 400,
  "detail": "Invalid timestamp",
  "instance": "https://xxxx.luna.akamaiapis.net/etp-report/v3/configs/xxxx/threat-events/details",
  "method": "POST",
  "serverIp": "xxx.xxx.xxx.250",
  "clientIp": "xxx.xxx.xxx.215",
  "requestId": "xxxx",
  "requestTime": "2023-11-02T15:00:51Z"
}'

Traceback (most recent call last):
  File "/opt/akamai-uls/uls/ext/cli-etp/bin/akamai-etp", line 738, in <module>
    main()
  File "/opt/akamai-uls/uls/ext/cli-etp/bin/akamai-etp", line 669, in main
    fetch_events_concurrent(config, out)
  File "/opt/akamai-uls/uls/ext/cli-etp/bin/akamai-etp", line 376, in fetch_events_concurrent
    fetch_event_page(start, end, 1, concurrent_fetch, pool_futures, stats, output)
  File "/opt/akamai-uls/uls/ext/cli-etp/bin/akamai-etp", line 304, in fetch_event_page
    total_records = response_data["pageInfo"]["totalRecords"]
                    ~~~~~~~~~~~~~^^^^^^^^^^^^
" 
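For anyone reading the traceback: the markers under line 304 point at the `["pageInfo"]` lookup, which suggests the CLI indexed into the HTTP 400 problem document as if it were a normal page of events. The exception type is cut off in the log, so this is an inference. Below is a minimal sketch of a defensive check, not the actual cli-etp code; `ApiErrorResponse` and `extract_total_records` are hypothetical names used only for illustration.

```python
import json


class ApiErrorResponse(Exception):
    """Hypothetical error type: the API returned a problem document, not event data."""


def extract_total_records(response_data: dict) -> int:
    """Hypothetical helper: read pageInfo.totalRecords, failing with a clear message."""
    page_info = response_data.get("pageInfo")
    if not isinstance(page_info, dict) or "totalRecords" not in page_info:
        # Surface the fields of the problem document instead of a bare lookup error.
        status = response_data.get("status", "unknown")
        detail = response_data.get("detail", "no detail provided")
        raise ApiErrorResponse(f"API returned HTTP {status}: {detail}")
    return int(page_info["totalRecords"])


# Error body shaped like the one in the log above (fields abbreviated).
error_body = json.loads(
    '{"title": "Bad request", "status": 400, "detail": "Invalid timestamp"}'
)

try:
    extract_total_records(error_body)
except ApiErrorResponse as exc:
    # A caller could log this and retry on the next polling cycle.
    print(f"recoverable error, retry later: {exc}")
```

With a check like this, the caller can decide whether to retry or exit with a meaningful message, rather than the process dying on a missing key.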

To Reproduce
Run container and wait

Expected behavior
Container runs without dying, or at least self heals

ULS Version output

Akamai Unified Log Streamer Version information
ULS Version     1.7.1

EAA Version     0.6.3
ETP Version     0.4.5
MFA Version     0.1.1
GC Version      0.0.1-beta
LINODE Version      0.0.1-aplpha

OS Plattform        Linux-5.4.0-166-generic-x86_64-with-glibc2.36
OS Version      5.4.0-166-generic
Python Version      3.12.0
Container Status    True
RootPath        /opt/akamai-uls/uls
TimeZone (UTC OFST)     UTC (0.0)
Installation ID     RFVMT1RGLTIwMjMxMDEyLTEuNy4x

Additional context
It has been running fine since we implemented it a number of months ago. I don't know whether the latest container version introduced the bug; we are going to try downgrading to see whether that stabilises things for now.

MikeSchiessl commented 10 months ago

Hi @lwhitworth, thanks for raising this bug. We've already spotted this one and created a ticket on the underlying CLI. You can track the CLI fix over here:

https://github.com/akamai/cli-etp/issues/14

Once the CLI has been fixed, you should be able to pull the fixed version into the ext directory and this bug should be gone. I'll leave this ticket open until we have a confirmed fix.

MikeSchiessl commented 5 months ago

@bitonio, please confirm, but I think this bug will be fixed by the latest ETP CLI version, which will ship in ULS v1.7.3. (Feel free to grab the current development branch, which should already contain the fix.)

bitonio commented 5 months ago

@MikeSchiessl correct, this has been fixed in cli-etp 0.4.6

MikeSchiessl commented 5 months ago

Hi @lwhitworth,

Sorry the fix took so long. We just released ULS v1.7.3, which contains the fix for this bug. If you're NOT using the dockerized version, please make sure you also update CLI-ETP to 0.4.6.
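For non-dockerized installs, one rough way to check whether the bundled ETP CLI already includes the fix is to parse the version output posted earlier in this thread. This is only a sketch: it assumes the `ETP Version     x.y.z` line format shown above, and `etp_version_from_output` and `has_etp_fix` are hypothetical helper names, not part of ULS.

```python
import re

# cli-etp 0.4.6 is the release confirmed in this thread to contain the fix.
FIXED_ETP_VERSION = (0, 4, 6)


def etp_version_from_output(version_output: str):
    """Return the ETP CLI version as a tuple of ints, or None if not found."""
    match = re.search(
        r"^ETP Version\s+(\d+)\.(\d+)\.(\d+)", version_output, re.MULTILINE
    )
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def has_etp_fix(version_output: str) -> bool:
    """True when the reported ETP CLI version is 0.4.6 or newer."""
    version = etp_version_from_output(version_output)
    return version is not None and version >= FIXED_ETP_VERSION


# Excerpt of the version output from the original report.
sample = """ULS Version     1.7.1

EAA Version     0.6.3
ETP Version     0.4.5
"""

print(has_etp_fix(sample))  # False: 0.4.5 predates the fix
```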