splunk / splunk-connect-for-syslog

Splunk Connect for Syslog
Apache License 2.0

app-syslog-dell_switch_n catching non powerswitch events #2180

Closed jrehm-mmm closed 9 months ago

jrehm-mmm commented 10 months ago

I have seen this with several different vendor products. Most recently I found it with Aruba Clearpass, but I have also seen it happen with vmware_vsphere, isc_bind, and cisco_ios. If an event makes it to app-syslog-dell_switch_n[sc4s-syslog] AND is from a program that contains a capital letter, the parser truncates the digits from the end of the hostname and prefixes them to the message.

Per Ryan F - "I think the parser design is wrong in hindsight and needs to change to vps instead of syslog"

What is the sc4s version? Tried with 2.23 and 3.4.3

Is there a pcap available? No - I have not been able to capture a faulty message. I can get millions of events without any getting caught by app-syslog-dell_switch_n[sc4s-syslog].

Example event

<135>2023-09-27 06:04:24,445 10.1.2.3 CPPM_System_Event 6 1 0 event_source=TACACS+ Server,level=ERROR,category=Request,description=TACACS+ request from unknown NAD=10.198.2.200,action_key=Failed,timestamp=2023-09-24 12:20:28.402-05

**dns lookup result**

10.1.2.3=myhost01

**Splunk Event**

host=myhost (missing the 01)

_raw=01 CPPM_System_Event 6 1 0 event_source=TACACS+ Server,level=ERROR,category=Request,description=TACACS+ request from unknown NAD=10.198.2.200,action_key=Failed,timestamp=2023-09-24 12:20:28.402-05

**sc4s_tags on the above faulty event**

[wireformat:rfc|wireformat:rfc3164|source_identified|wireformat:rfc3164_isodate|.app.app-almost-syslogz-isodate|.app.app-syslog-dell_switch_n|ns_vendor:aruba|ns_product:clearpass|.source.s_ARUBA_CLEARPASS|.app.app-aruba_clearpass-post|vps]

**sc4s_tags on an event from the same sc4s container and the same host, for the same CPPM_System_Event type**

[wireformat:rfc|wireformat:rfc3164|source_identified|wireformat:rfc3164_isodate|.app.app-almost-syslogz-isodate|ns_vendor:aruba|ns_product:clearpass|.source.s_ARUBA_CLEARPASS|.app.app-netsource-aruba_clearpass|.app.app-aruba_clearpass-post|vps]
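The two tag chains above are long enough that the difference is easy to miss by eye. As a minimal sketch (the helper name `diff_tags` is made up for illustration and is not part of SC4S), splitting both chains on `|` and comparing them shows which tags are unique to each event:

```python
# Tag chains copied verbatim from the two events in the report.
faulty = (
    "wireformat:rfc|wireformat:rfc3164|source_identified|wireformat:rfc3164_isodate|"
    ".app.app-almost-syslogz-isodate|.app.app-syslog-dell_switch_n|ns_vendor:aruba|"
    "ns_product:clearpass|.source.s_ARUBA_CLEARPASS|.app.app-aruba_clearpass-post|vps"
)
healthy = (
    "wireformat:rfc|wireformat:rfc3164|source_identified|wireformat:rfc3164_isodate|"
    ".app.app-almost-syslogz-isodate|ns_vendor:aruba|ns_product:clearpass|"
    ".source.s_ARUBA_CLEARPASS|.app.app-netsource-aruba_clearpass|"
    ".app.app-aruba_clearpass-post|vps"
)


def diff_tags(a, b):
    """Return (tags only in a, tags only in b), keeping the original order.

    Hypothetical helper for this illustration only.
    """
    a_tags, b_tags = a.split("|"), b.split("|")
    only_a = [t for t in a_tags if t not in b_tags]
    only_b = [t for t in b_tags if t not in a_tags]
    return only_a, only_b


only_faulty, only_healthy = diff_tags(faulty, healthy)
print("only on faulty event: ", only_faulty)
print("only on healthy event:", only_healthy)
```

On these two chains the faulty event carries `.app.app-syslog-dell_switch_n` and lacks `.app.app-netsource-aruba_clearpass`; every other tag is shared.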

mstopa-splunk commented 9 months ago

Hi @jrehm-mmm. I understand that if an event makes it to app-syslog-dell_switch_n and is from a program that starts with a capital letter, it will be incorrectly split to host and message.

Can you explain the part regarding the pcap and the example event? You’ve observed this happening with Aruba Clearpass, vmware and others, but you have not been able to capture faulty messages in a pcap, right? What about the example event that you provided? I manually sent it to SC4S, but I didn’t get the same faulty tags and truncated host as you; it worked fine for me:

sc4s_tags=wireformat:rfc|wireformat:rfc3164|source_identified|wireformat:rfc3164_wrongver|.app.app-almost-syslogz-bsd-wrongver|.app.app-netsource-aruba_clearpass|.source.s_DEFAULT

jrehm-mmm commented 9 months ago

pcap has been tough to capture since we can go more than a day without any events getting tagged with app-syslog-dell_switch_n.

You can recreate it if you send an example message with a program that won't get caught by any other syslog-pgm or syslog filters.

example <135>2023-09-27 06:04:24,445 myhost-01 NOTCPPM_System_Event 6 1 0 event_source=TACACS+ Server,level=ERROR,category=Request,description=TACACS+ request from unknown NAD=10.198.2.200,action_key=Failed,timestamp=2023-09-24 12:20:28.402-05

edit to add: hyphen in host header for sample event

jrehm-mmm commented 9 months ago

Also, sorry if that was all unclear. I have compared the sc4s_tags chains and message formats and am unable to find a reason why the same message format from the same host sometimes gets caught and other times doesn't. Maybe the pcap would show a difference, but post sc4s at least, I cannot find one.

I will try to find a different vendor program where it happens more frequently, so that I can capture it.

mstopa-splunk commented 9 months ago

@jrehm-mmm yes please keep an eye on that, it would really help us to catch this kind of an issue as we probably will never be able to observe this in dev conditions.

Let's keep this thread open for 10 days in case this happens again, if that's okay for you.

The second example unfortunately didn't reproduce the bug for me, I received host myhost01 in splunk

jrehm-mmm commented 9 months ago

@mstopa-splunk

I just rechecked the app-syslog-dell_switch_n parser and realized I had neglected to add a hyphen to the hostname in my example. I should have given the hostname as myhost-01

regexp-parser(
    prefix(".tmp.")
    template("$HOST")
    patterns('(?<host>.*)-(?<prefix>\d+)$')
);
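To see why the hyphen matters, here is a minimal Python sketch of what that pattern does to a hostname. It only exercises the regular expression, not syslog-ng itself. Python writes named groups as `(?P<name>...)` rather than the `(?<name>...)` used in the snippet above, and `split_host` is a made-up helper name for illustration.

```python
import re

# Same pattern as in the parser snippet, rewritten with Python's (?P<name>...) syntax.
DELL_HOST_SPLIT = re.compile(r"(?P<host>.*)-(?P<prefix>\d+)$")


def split_host(host):
    """Return (host, prefix) if the pattern matches, else None.

    Hypothetical helper; it only mimics the regex, not the surrounding parser.
    """
    m = DELL_HOST_SPLIT.search(host)
    if m is None:
        return None
    return m.group("host"), m.group("prefix")


for name in ("myhost-01", "myhost01", "core-sw-12"):
    print(name, "->", split_host(name))
```

A hostname ending in a hyphen followed by digits (`myhost-01`) is split into `myhost` and `01`, while `myhost01` does not match at all, which is consistent with the earlier example without a hyphen not reproducing the problem. `core-sw-12` is an invented hostname that shows the greedy `.*` splitting only at the last hyphen.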

Did your event at least get picked up by the app-syslog-dell_switch_n parser?

Regardless, a filter for program('[A-Z]+') is certainly going to match events from unintended apps, and it will keep those events from reaching our sc4s-syslog and syslog-netsource parsers whenever program contains a capital letter.
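A rough illustration of how broad that pattern is, as a Python sketch. It assumes the filter behaves like an unanchored regex search (matching anywhere in the program name), which is how the behaviour is described in this thread; it is not a reproduction of syslog-ng's own matching code. The lowercase program names and `Vpxd` are invented for contrast.

```python
import re

# Pattern quoted in the comment above.
CAPITAL = re.compile(r"[A-Z]+")


def matches_program_filter(program):
    """Assumption: the filter matches if the pattern is found anywhere in the name."""
    return CAPITAL.search(program) is not None


# The first two names come from the thread; the rest are illustrative only.
programs = ["CPPM_System_Event", "NOTCPPM_System_Event", "named", "sshd", "Vpxd"]
results = {p: matches_program_filter(p) for p in programs}

for program, hit in results.items():
    print(f"{program}: {'matches' if hit else 'no match'}")
```

Under that assumption, any program name with at least one uppercase letter qualifies, so the filter says nothing about whether the event actually came from a Dell switch.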

mstopa-splunk commented 9 months ago

@jrehm-mmm right, with the hyphen I could reproduce the trimmed host and the tags wireformat:rfc|wireformat:rfc3164|source_identified|wireformat:rfc3164_wrongver|.app.app-almost-syslogz-bsd-wrongver|.app.app-syslog-dell_switch_n|.source.s_DEFAULT. Thank you, I will work on that.

mstopa-splunk commented 9 months ago

@jrehm-mmm thank you for reporting this, solved: https://github.com/splunk/splunk-connect-for-syslog/pull/2235/files