Open itkovian opened 2 months ago
Hi @itkovian ! Thanks for opening this. I might be missing something, but it looks like your input is actually RFC5424 and not RFC3164:
<13>1 2019-02-13T19:48:34+00:00 74794bfb6795 root 8449 - [empty] qwerty
You can tell because it has the version
specifier (1
) after the severity/priority (also the timestamp is RFC5424 format and not RFC3164 format, and it has procid, msgid, etc.). In this case [empty]
is interpreted as structured data per RFC5424. If it isn't structured data, and instead part of the msg
, there should be an extra -
preceding it to indicate that the sd-data is nil.
Yes, the example comes from the tests in your code :)
The problem I am facing is that several applications are logging lines like
2024-09-19T15:39:45.469435+02:00 node3521 slurmstepd[548422]: [65684352.batch] done with job
And these lack a -
.
{
"65684352.batch": {},
"appname": "slurmstepd",
"hostname": "node3521",
"message": "done with job",
"procid": 548422,
"timestamp": "2024-09-19T13:39:45.469435Z"
}
This is correct according to the RFC 5424 as I understand, but I'd like to be able to force it to RFC 3164 instead of 5424.
Yes, the example comes from the tests in your code :)
The problem I am facing is that several applications are logging lines like
2024-09-19T15:39:45.469435+02:00 node3521 slurmstepd[548422]: [65684352.batch] done with job
And these lack a
-
.{ "65684352.batch": {}, "appname": "slurmstepd", "hostname": "node3521", "message": "done with job", "procid": 548422, "timestamp": "2024-09-19T13:39:45.469435Z" }
This is correct according to the RFC 5424 as I understand, but I'd like to be able to force it to RFC 3164 instead of 5424.
Can you post exactly what you would like the resulting message to look like. I think it should be fairly easy to add an extra parameter, but need to make sure it would do what you want.
Sure. I'd like it to match the RFC 3164, resulting in
{
"appname": "slurmstepd",
"hostname": "node3521",
"message": "[65684352.batch] done with job",
"procid": 548422,
"timestamp": "2024-09-19T13:39:45.469435Z"
}
If you can tell me where in the code I should start looking, I can have a try myself.
Cool. The relevant call is here. Currently it passes Variant::Either
. You will need to update it so it can also pass Variant::RFC3164
or Variant::RFC5424
.
You will need to add an optional parameter here. Make it default to either
.
Have a look at the level
parameter for the log
function to get an idea of how to setup an enum parameter that can only accept a limited set of values.
Any questions give me a shout.
Hmm, I see that syslog_loose also tries to get some structured data for RFC3164, even though this should not be supported (calling structured_data_optional(false) here)
I opened an issue for this: https://github.com/StephenWakely/syslog-loose/issues/37
We're getting a bit of junk when processing syslog lines from several applications that implement the old syslog format (RFC 3164) but are actually putting something in
[]
without key-value pairs.I see in the tests that
I am wondering if this is actually the expected behaviour. What would be the rationale behind having empty structured data? Is it not more logical that these kinds of messages actually follow RFC 3164 instead of RFC 5424?
An empty structured message with information that varies wildly (e.g., a job ID (from slurm), some GC time (from Kafka)) means that the data that is shipped off to e.g. Elasticsearch will pollute the fields, as the object key may be interpreted as a fieldname with dynamic mapping turned on.
Is there a way to control this behaviour?