splunk / splunk-connect-for-syslog

Splunk Connect for Syslog
Apache License 2.0

SC4S does not correctly recognize VMware vSphere logs #2390

Closed ivanfr90 closed 1 week ago

ivanfr90 commented 3 months ago

Was the issue replicated by support? No

What is the sc4s version ? sc4s version=3.22.2

Is there a pcap available? No

Is the issue related to the environment of the customer or Software related issue? Unknown

Is it related to Data loss, please explain ? There seems to be no data loss, but the data is only partially classified. The unclassified data is being indexed under sc4s:fallback

Last chance index/Fallback index? Unclassified data is being sent to fallback

Is the issue related to local customization? No

Do we have all the default indexes created? Yes

Describe the bug We are integrating logs from VMware vSphere. Some logs are correctly sent to the default index: index=infraops sourcetype=vmware:esxlog:

Configuration files:

env_file

# AUTOMATIC PARSER CONFIGURATION
#REF: https://splunk.github.io/splunk-connect-for-syslog/main/sources/vendor/VMWare/vsphere/ 
#Do not enable with a SNAT load balancer
SC4S_USE_NAME_CACHE=yes
#Combine known split events into a single event for Splunk
SC4S_SOURCE_VMWARE_VSPHERE_GROUPMSG=yes
#Learn vendor product from recognized events and apply to generic events
#for example after the first vpxd event sshd will utilize vps "vmware_vsphere_nix_syslog" rather than "nix_syslog"
SC4S_USE_VPS_CACHE=yes
# Enable a TLS port for this specific vendor product using a comma-separated list of port numbers
SC4S_LISTEN_VMWARE_VSPHERE_TLS_PORT=514

app_parser

application app_vps_vmware[sc4s-vps] {
    filter {
        host("172.16.12.3") or host("172.16.12.4")
    };

    parser {
        p_set_netsource_fields(
            vendor('vmware')
            product('vsphere')
        );
        };    
};

Screenshot of categorized VMWare events under vmware:esxlog:

image

Screenshot of uncategorized VMWare events:

image

mstopa-splunk commented 3 months ago

@ivanfr90 can you copy and paste one MESSAGE=* sample from fallback and one from vmware:syslog for me to check? It looks like framed events are being sent to 514[tcp|udp]; I want to check 601/tcp on my side

for context please see examples in this test for another vendor: https://github.com/splunk/splunk-connect-for-syslog/blob/428d0d6e3c897310e25e14ad50773966bfc022d1/tests/test_netwrix_epp.py

ikheifets-splunk commented 3 months ago

Hello, @ivanfr90 !

First of all, the problem is here: SC4S_LISTEN_VMWARE_VSPHERE_TLS_PORT=514

You need to provide a non-default port for VMWARE_VSPHERE. The port must be unique so that it identifies your logs as VMWARE_VSPHERE: every message arriving on this port will be identified as VMWARE_VSPHERE. But you are using 514, the default syslog port.

Second thing:

application app_vps_vmware[sc4s-vps] {
    filter {
        host("172.16.12.3") or host("172.16.12.4")
    };

    parser {
        p_set_netsource_fields(
            vendor('vmware')
            product('vsphere')
        );
        };    
};

Please check the host filter: I think it shouldn't be an IP address, it should be a hostname. The docs show that you can use either host or netmask. If you are filtering by IP, netmask is probably the relevant one for you

ivanfr90 commented 3 months ago

Hello @mstopa-splunk

You answered so quickly that while I was preparing some extracts you answered again :P

Anyway, here are some messages extracted from fallback:

MESSAGE=352<14>1 2024-04-05T12:11:17.663758+02:00 host-xx vpxd 53432 - - Event [7521282] [1-1] [2024-04-05T10:11:09.551951Z] [vim.event.UserLogoutSessionEvent] [info] [svcmonitoringagent] [PLE54] [7521282] [User svcmonitoringagent@172.17.0.40 logged out (login time: Friday, 05 April, 2024 10:11:08 AM, number of API invocations: 0, user agent: VI Perl)]

MESSAGE=241 <78>1 2024-04-05T12:11:01.404402+02:00 host-xx CROND 18423 - - (root) CMD (. /etc/profile.d/VMware-visl-integration.sh; /usr/lib/applmgmt/backup_restore/scripts/xxx.py >>/var/log/vmware/applmgmt/xxx.log 2>&1)

MESSAGE=172 <30>1 2024-04-05T12:10:55.167107+02:00 host-xx vmcad - - - t@140422747780864: VMCACheckAccessKrb: Authenticated user xxx.local@vsphere.local

MESSAGE=161 <134>1 2024-04-05T12:10:44.180979+02:00 host-xx dnsmasq - - - Apr 5 12:10:44 dnsmasq[1835]: forwarded xxx.local to 172.16.1.2

MESSAGE=306 <134>1 2024-04-05T12:10:43.151825+02:00 host-xx vsan-health-main - - - 2024-04-05T12:10:43.151+02:00 info vsanvcmgmtd[09806] [vSAN@6876 sub=AdapterServer opId=sps-Main-954309-883-311843-6a18] Invoking 'getAlarm' on 'vsanvp-notification-manager' session '52084687-0896-67ef-2sd3-4ecccafb8820' active 1

First thing: OK, I will talk with the customer about moving VMware to a different, dedicated port.

Second thing: I also tried the following:

netmask(172.16.12.3/32) or netmask(172.16.12.4/32)

but it seems to be ignored as well.
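
For readers comparing the two filter styles: in syslog-ng, host() matches against the HOST field of the message, while netmask() matches the sender's IP address against a subnet. The sketch below is not SC4S or syslog-ng code; it only illustrates the subnet test with Python's standard ipaddress module. The helper name source_matches is hypothetical. Note that the address tested is the one the collector sees on the connection, which can differ from the device's own address behind NAT or a load balancer.

```python
import ipaddress
from typing import Iterable


def source_matches(source_ip: str, cidr_blocks: Iterable[str]) -> bool:
    """Hypothetical helper: True if source_ip falls inside any CIDR block.

    This mirrors the idea behind a netmask() filter; it is an
    illustration, not the syslog-ng implementation.
    """
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(block) for block in cidr_blocks)


# The two /32 blocks from the filter above each cover exactly one address.
blocks = ["172.16.12.3/32", "172.16.12.4/32"]
print(source_matches("172.16.12.3", blocks))
print(source_matches("172.16.12.5", blocks))
```

A /32 block matches a single address, so the netmask form above is equivalent to listing the two sender IPs explicitly.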

mstopa-splunk commented 3 months ago

You actually had double support: my colleague and I noticed your issue at the same time, so you got two responses at once :) All right, let me check

mstopa-splunk commented 3 months ago

@ivanfr90 take a look:

not framed - port 514/[tcp|udp]

log_messages = [
    "<14>1 2024-04-05T13:05:00.663758+02:00 host-xx vpxd 53432 - -  Event [7521282] [1-1] [2024-04-05T10:11:09.551951Z] [vim.event.UserLogoutSessionEvent] [info] [svcmonitoringagent] [PLE54] [7521282] [User svcmonitoringagent@172.17.0.40 logged out (login time: Friday, 05 April, 2024 10:11:08 AM, number of API invocations: 0, user agent: VI Perl)]",
    "<78>1 2024-04-05T13:05:00.404402+02:00 host-xx CROND 18423 - -  (root) CMD (. /etc/profile.d/VMware-visl-integration.sh; /usr/lib/applmgmt/backup_restore/scripts/xxx.py >>/var/log/vmware/applmgmt/xxx.log 2>&1)",
    "<30>1 2024-04-05T13:05:00.167107+02:00 host-xx vmcad - - -  t@140422747780864: VMCACheckAccessKrb: Authenticated user xxx.local@vsphere.local",
    "<134>1 2024-04-05T13:05:00.180979+02:00 host-xx dnsmasq - - - Apr  5 12:10:44 dnsmasq[1835]: forwarded xxx.local to 172.16.1.2",
    "<134>1 2024-04-05T13:05:00.151825+02:00 host-xx vsan-health-main - - - 2024-04-05T12:10:43.151+02:00 info vsanvcmgmtd[09806] [vSAN@6876 sub=AdapterServer opId=sps-Main-954309-883-311843-6a18] Invoking 'getAlarm' on 'vsanvp-notification-manager' session '52084687-0896-67ef-2sd3-4ecccafb8820' active 1"
]

import socket

# change IP to your SC4S instance
server_address = ("0.0.0.0", 514)

for log_message in log_messages:
    print(log_message)

    # one TCP connection per message, sent as-is (no octet-count prefix)
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client_socket:
        client_socket.connect(server_address)
        client_socket.sendall(log_message.encode())

Output:
<14>1 2024-04-05T13:05:00.663758+02:00 host-xx vpxd 53432 - -  Event [7521282] [1-1] [2024-04-05T10:11:09.551951Z] [vim.event.UserLogoutSessionEvent] [info] [svcmonitoringagent] [PLE54] [7521282] [User svcmonitoringagent@172.17.0.40 logged out (login time: Friday, 05 April, 2024 10:11:08 AM, number of API invocations: 0, user agent: VI Perl)]
<78>1 2024-04-05T13:05:00.404402+02:00 host-xx CROND 18423 - -  (root) CMD (. /etc/profile.d/VMware-visl-integration.sh; /usr/lib/applmgmt/backup_restore/scripts/xxx.py >>/var/log/vmware/applmgmt/xxx.log 2>&1)
<30>1 2024-04-05T13:05:00.167107+02:00 host-xx vmcad - - -  t@140422747780864: VMCACheckAccessKrb: Authenticated user xxx.local@vsphere.local
<134>1 2024-04-05T13:05:00.180979+02:00 host-xx dnsmasq - - - Apr  5 12:10:44 dnsmasq[1835]: forwarded xxx.local to 172.16.1.2
<134>1 2024-04-05T13:05:00.151825+02:00 host-xx vsan-health-main - - - 2024-04-05T12:10:43.151+02:00 info vsanvcmgmtd[09806] [vSAN@6876 sub=AdapterServer opId=sps-Main-954309-883-311843-6a18] Invoking 'getAlarm' on 'vsanvp-notification-manager' session '52084687-0896-67ef-2sd3-4ecccafb8820' active 1

image

framed -- port 601/tcp

# change IP to your SC4S instance
server_address = ("0.0.0.0", 601)

for log_message in log_messages:
    # octet counting: prefix the message with its length in bytes and a space
    log_message = f"{len(log_message.encode())} {log_message}"
    print(log_message)

    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client_socket:
        client_socket.connect(server_address)
        client_socket.sendall(log_message.encode())

Output:
344 <14>1 2024-04-05T13:05:00.663758+02:00 host-xx vpxd 53432 - -  Event [7521282] [1-1] [2024-04-05T10:11:09.551951Z] [vim.event.UserLogoutSessionEvent] [info] [svcmonitoringagent] [PLE54] [7521282] [User svcmonitoringagent@172.17.0.40 logged out (login time: Friday, 05 April, 2024 10:11:08 AM, number of API invocations: 0, user agent: VI Perl)]
209 <78>1 2024-04-05T13:05:00.404402+02:00 host-xx CROND 18423 - -  (root) CMD (. /etc/profile.d/VMware-visl-integration.sh; /usr/lib/applmgmt/backup_restore/scripts/xxx.py >>/var/log/vmware/applmgmt/xxx.log 2>&1)
141 <30>1 2024-04-05T13:05:00.167107+02:00 host-xx vmcad - - -  t@140422747780864: VMCACheckAccessKrb: Authenticated user xxx.local@vsphere.local
126 <134>1 2024-04-05T13:05:00.180979+02:00 host-xx dnsmasq - - - Apr  5 12:10:44 dnsmasq[1835]: forwarded xxx.local to 172.16.1.2
300 <134>1 2024-04-05T13:05:00.151825+02:00 host-xx vsan-health-main - - - 2024-04-05T12:10:43.151+02:00 info vsanvcmgmtd[09806] [vSAN@6876 sub=AdapterServer opId=sps-Main-954309-883-311843-6a18] Invoking 'getAlarm' on 'vsanvp-notification-manager' session '52084687-0896-67ef-2sd3-4ecccafb8820' active 1

image

See how the messages and sourcetypes changed compared to the ones in your screenshot. The vps filter also worked: image
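
The only difference between the two runs is the framing. With octet counting (RFC 6587), each message is prefixed with its length in bytes followed by a single space, so a receiver can split a TCP stream without relying on newlines. A minimal sketch of both directions, assuming well-formed input; the function names frame and deframe and the short sample messages are hypothetical and not taken from SC4S:

```python
def frame(message: bytes) -> bytes:
    """Prefix a message with its byte length and a space (octet counting)."""
    return str(len(message)).encode("ascii") + b" " + message


def deframe(stream: bytes) -> list:
    """Split a buffer of concatenated octet-counted frames into messages.

    Assumes the buffer holds only complete, well-formed frames; a real
    receiver would also have to handle partial reads and bad prefixes.
    """
    messages = []
    pos = 0
    while pos < len(stream):
        space = stream.index(b" ", pos)      # end of the length prefix
        length = int(stream[pos:space])      # number of bytes that follow
        start = space + 1
        messages.append(stream[start:start + length])
        pos = start + length
    return messages


# Shortened, made-up messages used only to exercise the helpers.
first = b"<14>1 test"
second = b"<30>1 another test message"
buffer = frame(first) + frame(second)
print(buffer)
print(deframe(buffer))
```

If such a stream reaches a listener that does not expect octet counting, the length prefix stays attached to the message text, which is consistent with the MESSAGE=352<14>1 ... samples posted earlier in this thread.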

mstopa-splunk commented 3 months ago

I'm closing this since we solved the fallback issue. For support, please open a Splunk support ticket; for bugs or enhancements, please open a new issue. In both cases feel free to refer to this issue

ivanfr90 commented 3 months ago

Hi @mstopa-splunk

I see that your test uses the standard ports 514 and 601. In the current SC4S setup, logs are sent over TLS to port 514, but following @ikheifets-splunk's recommendation the customer is now trying to switch to a different, non-standard port. Could this configuration affect the classification of events in some way?

thanks!!

ivanfr90 commented 2 months ago

Hi @mstopa-splunk, would it be possible to share the complete SC4S config used in your automated test? In our system some of the messages are still not classified and are being indexed in fallback, and honestly I cannot find where the problem is (I have other sources working fine).

My current config:

env_file

SC4S_DEST_SPLUNK_HEC_DEFAULT_URL=https://splunk.instance.com
SC4S_DEST_SPLUNK_HEC_DEFAULT_TOKEN=435345345-344c-42ca-836c-32323434
# Uncomment the following line if using untrusted SSL certificates
#SC4S_DEST_SPLUNK_HEC_DEFAULT_TLS_VERIFY=no
SC4S_SOURCE_TLS_ENABLE=yes
#SC4S_ARCHIVE_GLOBAL=yes

# AUTOMATIC PARSER CONFIGURATION VSPHERE | REF: https://splunk.github.io/splunk-connect-for-syslog/main/sources/vendor/VMWare/vsphere/ 
#Do not enable with a SNAT load balancer
SC4S_USE_NAME_CACHE=yes
# Combine known split events into a single event for Splunk
SC4S_SOURCE_VMWARE_VSPHERE_GROUPMSG=yes
# Learn vendor product from recognized events and apply to generic events
#for example after the first vpxd event sshd will utilize vps "vmware_vsphere_nix_syslog" rather than "nix_syslog"
SC4S_USE_VPS_CACHE=yes
# Enable a TLS port for this specific vendor product using a comma-separated list of port numbers
SC4S_LISTEN_VMWARE_VSPHERE_TLS_PORT=4514

_local/config/app_parsers/app_vps_vmwarev2.conf

application app_vps_vmware[sc4s-vps] {
    filter {
        (netmask(172.16.12.3/32) or netmask(172.16.12.4/32))
    };

    parser {
        p_set_netsource_fields(
            vendor('vmware')
            product('vsphere')
        );
        };    
};

local/context/splunk_metadata.csv

vmware_vsphere_nix_syslog,index,infraops

mstopa-splunk commented 2 months ago

hi @ivanfr90 I used a basic configuration: SC4S_DEST_SPLUNK_HEC_DEFAULT_URL, SC4S_DEST_SPLUNK_HEC_DEFAULT_TOKEN, SC4S_DEST_SPLUNK_HEC_DEFAULT_TLS_VERIFY=no. Please open a Splunk support ticket, because it's not an SC4S bug or enhancement request

mstopa-splunk commented 1 month ago

Hi @ivanfr90 thank you for your help here, in fact SC4S should have been extended with new programs: https://github.com/splunk/splunk-connect-for-syslog/pull/2462/files .

To process octet-counted events, please create a port dedicated to VMware vSphere, but as an RFC 6587 port:

SC4S_LISTEN_VMWARE_VSPHERE_RFC6587_PORT=4514

Please use it only for framed events. They should no longer go to fallback. Some of them will still be classified as nix:syslog in the released SC4S versions. Please try the PR image instead: ghcr.io/splunk/splunk-connect-for-syslog/container3lite:pr-2462 which should classify all vSphere logs correctly:

image

Please let me know if you find any missing cases, else we will include it in the release in 2 weeks.

Also, it should be possible to turn off octet counting on the source side
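
Before pointing a source at the RFC 6587 port, it can help to confirm from captured samples that the source really sends framed events. The sketch below is an assumed heuristic, not SC4S logic: it looks for a decimal count directly in front of the syslog <PRI> field, with or without a separating space, as seen in the fallback samples earlier in this thread. The name split_octet_count is hypothetical and the sample strings are shortened copies of those messages.

```python
import re

# Assumed pattern: digits, an optional space, then a <PRI> field of 1-3 digits.
FRAMED = re.compile(r"^(\d+) ?<(\d{1,3})>")


def split_octet_count(raw: str):
    """Return (count, rest) if raw starts with a length prefix, else (None, raw)."""
    match = FRAMED.match(raw)
    if match is None:
        return None, raw
    # match.start(2) - 1 is the index of the '<' that opens the PRI field.
    return int(match.group(1)), raw[match.start(2) - 1:]


# Shortened samples based on the messages posted in this thread.
samples = [
    "352<14>1 2024-04-05T12:11:17.663758+02:00 host-xx vpxd 53432 - - Event",
    "241 <78>1 2024-04-05T12:11:01.404402+02:00 host-xx CROND 18423 - - (root) CMD",
    "<14>1 2024-04-05T13:05:00.663758+02:00 host-xx vpxd 53432 - - Event",
]
for sample in samples:
    count, rest = split_octet_count(sample)
    print(count, rest[:20])
```

A count of None suggests the sample was not framed, in which case the regular (non-RFC 6587) port would be the better fit.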

rjha-splunk commented 1 week ago

It will be released with the next feature release.