imperva / incapsula-logs-downloader

A Python script for downloading log files from Incapsula
MIT License

Fixing the format of the TCP messages #30

Closed CodingFree closed 9 months ago

CodingFree commented 3 years ago

Hello,

There seems to be a very rare bug in the Python logging library.

The module has supported Syslog for a long time, so it has had to follow several versions of the specification. As those specifications evolved, one detail was never widely supported: appending a line break to the end of each Syslog message.

This only happens with TCP; it is not noticeable with UDP.

Without a line break, the Syslog service concatenates each received message onto the previous one. The date and hostname are added at the beginning of the first message only; every later message is appended after it. The result looks similar to this:

[date] [host] [cef1] [cef2] ... [cefN]

Instead of:

[date] [host] [cef1]
[date] [host] [cef2]
[date] [host] [cefN]
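To illustrate why the missing line break matters, here is a minimal sketch of a receiver that frames a TCP stream on a delimiter. The `split_frames` helper and the sample payloads are hypothetical; this is not the parser used by any particular Syslog service or by Azure Sentinel.

```python
from typing import List


def split_frames(stream: bytes, delimiter: bytes = b"\n") -> List[bytes]:
    """Split a TCP byte stream into messages on the delimiter (hypothetical receiver)."""
    # Empty pieces (for example after a trailing delimiter) are dropped.
    return [frame for frame in stream.split(delimiter) if frame]


# Two messages sent back to back over TCP arrive as one continuous byte stream.
unframed = b"<14>CEF:0|one" + b"<14>CEF:0|two"
framed = b"<14>CEF:0|one\n" + b"<14>CEF:0|two\n"

print(split_frames(unframed))  # a single frame holding both messages
print(split_frames(framed))    # one frame per message
```

With no delimiter in the stream, the receiver has nothing to split on, so both messages end up in one record.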

This breaks the regular expression that Microsoft uses to ingest messages in Azure Sentinel, so most of the messages are discarded and lost.

This bug can only be detected by inspecting network traffic. There is no easy way to debug it in the code, because print(msg) adds its own line break, which hides whether the message itself ends with one.
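As a partial workaround when reading the code, printing the `repr()` of a string makes its trailing characters visible. This is only a sketch: the `visible` helper and the sample strings are made up for illustration, and it shows what a string contains, not what is actually sent on the wire.

```python
def visible(msg: str) -> str:
    """Return msg with control characters shown as escape sequences."""
    return repr(msg)


# Hypothetical payloads: no terminator, a line break, and a trailing NUL.
for candidate in ("CEF:0|demo", "CEF:0|demo\n", "CEF:0|demo\000"):
    print(visible(candidate))
```

Inspecting the network traffic remains the reliable check, since the handler may still modify the message after it has been formatted.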

Some services may interpret the trailing \000 as the end of a Syslog message, but Microsoft does not. That's why I changed the value of syslog.append_nul.
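A minimal sketch of the idea, assuming `syslog` refers to a `logging.handlers.SysLogHandler` instance. The `build_tcp_syslog_logger` helper, the logger name, and the choice to add the line break through the formatter are assumptions made for illustration; the actual change in this pull request may differ.

```python
import logging
import logging.handlers
import socket


def build_tcp_syslog_logger(host, port, name="cef_demo"):
    """Return (logger, handler) sending newline-terminated Syslog messages over TCP."""
    handler = logging.handlers.SysLogHandler(
        address=(host, port), socktype=socket.SOCK_STREAM
    )
    # Do not append the trailing "\000" to each message.
    handler.append_nul = False
    # Assumption: end every message with a line break via the format string.
    handler.setFormatter(logging.Formatter("%(message)s\n"))

    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.addHandler(handler)
    return logger, handler
```

Usage would look like `logger, handler = build_tcp_syslog_logger("127.0.0.1", 514)` followed by `logger.info("CEF:0|...")`. With TCP, the handler connects when it is created, so a listener must already be running on the given address.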

References: