akamai / uls

Unified Log Streamer (ULS)
Apache License 2.0

Log Ingestion volumes decreased #52

Closed chris-bristow-maersk closed 9 months ago

chris-bristow-maersk commented 12 months ago

Our log ingestion has dropped significantly since 10am yesterday (14th Nov). It stopped completely yesterday afternoon, with only a few logs arriving early this morning.

Currently our VM shows an event rate of zero:

Nov 15 09:43:16 akamaietpvm1 python3[6600]: {"dt": "2023-11-15T09:43:16.207996", "uls_product": "ETP", "uls_feed": "DNS", "uls_output": "UDP", "uls_version": "1.6.6", "uls_runtime": 59700, "event_count": 144172, "event_count_interval": 0, "event_ingested_interval": 0, "event_bytes_interval": 0, "event_rate": 0.0, "mon_interval": 300}

Previously, the event rate was around 2000:

Nov 14 10:24:00 akamaietpvm1 python3[2015]: {"dt": "2023-11-14T10:24:00.190979", "uls_product": "ETP", "uls_feed": "DNS", "uls_output": "UDP", "uls_version": "1.6.6", "uls_runtime": 342601, "event_count": 388998178, "event_count_interval": 550000, "event_ingested_interval": 550000, "event_bytes_interval": 843302048, "event_rate": 1833.33, "mon_interval": 300}

Nov 14 10:29:00 akamaietpvm1 python3[2015]: {"dt": "2023-11-14T10:29:00.191224", "uls_product": "ETP", "uls_feed": "DNS", "uls_output": "UDP", "uls_version": "1.6.6", "uls_runtime": 342901, "event_count": 389708178, "event_count_interval": 710000, "event_ingested_interval": 710000, "event_bytes_interval": 1087779589, "event_rate": 2366.67, "mon_interval": 300}
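For anyone comparing these monitoring lines, here is a minimal Python sketch that pulls the JSON payload out of a syslog line and flags a stalled interval. In the lines posted here, `event_rate` matches `event_count_interval / mon_interval` (550000 / 300 ≈ 1833.33). That relationship is inferred from these numbers only, not taken from the ULS documentation. The helper names (`parse_monitoring_line`, `derived_rate`, `is_stalled`) are hypothetical and not part of ULS.

```python
import json


def parse_monitoring_line(line):
    """Return the JSON payload of a syslog line as a dict (hypothetical helper)."""
    # The JSON object starts at the first "{" after the syslog prefix.
    start = line.index("{")
    return json.loads(line[start:])


def derived_rate(record):
    """Events per second over the interval, rounded to two decimals.

    Assumption: event_rate = event_count_interval / mon_interval,
    which is consistent with the lines posted in this issue.
    """
    return round(record["event_count_interval"] / record["mon_interval"], 2)


def is_stalled(record):
    """True when no events were counted during the monitoring interval."""
    return record["event_count_interval"] == 0


sample = (
    'Nov 14 10:24:00 akamaietpvm1 python3[2015]: {"dt": "2023-11-14T10:24:00.190979", '
    '"uls_product": "ETP", "uls_feed": "DNS", "uls_output": "UDP", "uls_version": "1.6.6", '
    '"uls_runtime": 342601, "event_count": 388998178, "event_count_interval": 550000, '
    '"event_ingested_interval": 550000, "event_bytes_interval": 843302048, '
    '"event_rate": 1833.33, "mon_interval": 300}'
)

record = parse_monitoring_line(sample)
print(derived_rate(record), is_stalled(record))  # 1833.33 False
```

Running the same check against the Nov 15 line would report a stalled interval, since its `event_count_interval` is 0.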

We have checked that the uls service on the VM is up and running. The VM was rebooted at 5pm yesterday, so the service was stopped and restarted as part of the reboot.

Expected behavior
Log ingestion was much higher previously, as shown in the attached screenshot.

Screenshots
Log ingestion volumes over last few days as a comparison

ULS Version output
[root@akamaietpvm1 logcollector]# /root/uls/bin/uls.py --version
Akamai Unified Log Streamer Version information
ULS Version             1.6.6

EAA Version             n/a
ETP Version             0.4.2
MFA Version             n/a
GC Version              n/a
LINODE Version          n/a

OS Plattform            Linux-4.18.0-305.88.1.el8_4.x86_64-x86_64-with-redhat-8.4-Ootpa
OS Version              4.18.0-305.88.1.el8_4.x86_64
Python Version          3.6.8
Container Status        False
RootPath                /root/uls
TimeZone (UTC OFST)     UTC (0.0)
Installation ID         VFo5RlMxLTIwMjMwOTExLTEuNi42



![Picture1](https://github.com/akamai/uls/assets/101122906/8a1771d3-eb9f-4b76-b045-e8aee788808d)
MikeSchiessl commented 12 months ago

Hi @chris-bristow-maersk, we are investigating this case. It is probably linked to an incident that is currently ongoing.

I'll leave this open until we know for sure.

chris-bristow-maersk commented 11 months ago

Logs started flowing around 23:00 on 15/11/23. Checking at 14:15 on the 16th, we are pulling logs from 5 hours ago, so this is a similar issue to #51, where we cannot pull logs in real time.

Nov 16 14:18:32 akamaietpvm1.internal.cloudapp.net {"id": "xxxx", "configId": "xxxxx", "hitCount": 1, "alexaRanking": 1000, "query": {"time": "2023-11-16T09:18:52Z", "
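The delay can be estimated from the line above by comparing the syslog timestamp with the event's `query.time`. This is a rough Python sketch. It assumes the syslog timestamp is in UTC and belongs to 2023, since the syslog prefix carries neither a year nor a timezone. `ingestion_lag` is a hypothetical helper.

```python
from datetime import datetime, timezone


def ingestion_lag(received_at, event_time):
    """Return how long after the event the log line was received."""
    return received_at - event_time


# Syslog prefix "Nov 16 14:18:32"; year 2023 and UTC are assumptions.
received = datetime(2023, 11, 16, 14, 18, 32, tzinfo=timezone.utc)

# "time" field from the event's "query" object (the trailing Z means UTC).
event = datetime.strptime("2023-11-16T09:18:52Z", "%Y-%m-%dT%H:%M:%SZ").replace(
    tzinfo=timezone.utc
)

lag = ingestion_lag(received, event)
print(lag)  # 4:59:40
```

Under those assumptions the lag is just under five hours, which matches the "5 hours ago" observation.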

MikeSchiessl commented 11 months ago

Hi @chris-bristow-maersk, the SI should be fully resolved by now. Are you still seeing delays or issues?

Best regards Mike

MikeSchiessl commented 11 months ago

Hi @chris-bristow-maersk, just another ping - is this solved for you?

chris-bristow-maersk commented 11 months ago

Hi Mike

Things are looking OK, thank you for asking. Over the last 14 days the logs have been stable, and the current delay is 6 minutes, so all good.

Thanks rgds

MikeSchiessl commented 9 months ago

Thx @chris-bristow-maersk - closing down the ticket ;)