splunk / splunk-ansible

Ansible playbooks for configuring and managing Splunk Enterprise and Universal Forwarder deployments

Problem creating tcp inputs since 9.0.0 #699

Open hanswurscht opened 2 years ago

hanswurscht commented 2 years ago

We are using the Splunk Docker image and are loading additional apps during startup via SPLUNK_APPS_URL.

With Splunk 9.0.0, creating a TCP input in an app doesn't work anymore:

In splunkd.log we get during the ansible run:

08-30-2022 13:03:40.078 +0000 WARN  AdminHandler:TCP [4456 TcpChannelThread] - TCP Server configs not changed. Skipping TCP Server reload. To force reload add requireServerRestart=true arg for endpoints /data/inputs/tcp/raw/_reload, /data/inputs/tcp/cooked/_reload, servicesNS/admin/ssl/_reload and data/inputs/tcp/ssl/_reload. Example /services/data/inputs/tcp/raw/_reload?requireServerRestart=true. Reload SSL certificates without terminating existing connections, 'services/data/inputs/tcp/ssl/_reload?requireServerRestart=true&terminateConnections=false'
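
The warning above names a forced-reload endpoint. A minimal sketch of building that URL is shown here; the host, the management port 8089, and the `reload_url` helper name are assumptions for illustration, and whether calling the endpoint actually opens the port in this scenario has not been verified in this thread.

```shell
#!/bin/sh
# reload_url HOST PORT KIND: print the forced-reload URL quoted in the
# splunkd.log warning. KIND is one of raw, cooked or ssl.
# The helper name and its parameters are hypothetical.
reload_url() {
    printf 'https://%s:%s/services/data/inputs/tcp/%s/_reload?requireServerRestart=true\n' \
        "$1" "$2" "$3"
}

# Example call, left commented out because it needs a running instance and
# admin credentials (assumed here, not taken from the report):
# curl -k -u "admin:<password>" -X POST "$(reload_url localhost 8089 raw)"
```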

hanswurscht commented 1 year ago

Any news on this? We tried with 9.0.4 and this is still broken. We can't update to 9.0 because of this bug.

adityapinglesf commented 1 year ago

@hanswurscht can you share some additional information? which app are you referring to via SPLUNK_APPS_URL? Can you also share the entire docker command used to initiate the Splunk instance?

hanswurscht commented 1 year ago

Steps to reproduce:

  1. Create a simple app named network_input with the following content in default/inputs.conf:
    [tcp://1234]
    index=main
    sourcetype=test
  2. Package that app as a tar.gz and serve it from a web server:
    tar -cf network_input.tar network_input
    gzip network_input.tar
    python3 -m http.server --bind 0.0.0.0 9000
  3. Start Splunk 9.x with that app:
    docker run -p 8000:8000 -e "SPLUNK_PASSWORD=<password>" -e "SPLUNK_START_ARGS=--accept-license" -e "SPLUNK_APPS_URL=http://<put-your-ip-here>:9000/network_input.tar.gz" -it splunk/splunk:9.0.4.1
  4. Log into that container and check open ports:
    docker exec -it a71079ff0c1d /bin/bash
    [ansible@a71079ff0c1d splunk]$ netstat -ntlp
    netstat: can't scan /proc - are you root?
    Active Internet connections (only servers)
    Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
    tcp        0      0 0.0.0.0:9997            0.0.0.0:*               LISTEN      -
    tcp        0      0 127.0.0.1:8065          0.0.0.0:*               LISTEN      -
    tcp        0      0 0.0.0.0:8000            0.0.0.0:*               LISTEN      -
    tcp        0      0 0.0.0.0:8088            0.0.0.0:*               LISTEN      -
    tcp        0      0 0.0.0.0:8089            0.0.0.0:*               LISTEN      -
    tcp        0      0 0.0.0.0:8191            0.0.0.0:*               LISTEN      -

    As you can see, TCP port 1234 is missing from that output.
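
The manual check in step 4 can be scripted. The following is a rough sketch that reads `netstat -ntl` style output on stdin and reports whether a given port is in LISTEN state; the `port_listening` name is hypothetical, and the column positions are assumed to match the output shown above.

```shell
#!/bin/sh
# port_listening PORT: succeed (exit 0) if stdin contains a socket in
# LISTEN state whose local address ends in :PORT, otherwise exit 1.
# Assumes the column layout shown in the netstat output above
# (field 4 = local address, field 6 = state).
port_listening() {
    awk -v port="$1" '
        $6 == "LISTEN" {
            n = split($4, parts, ":")      # last piece is the port number
            if (parts[n] == port) found = 1
        }
        END { exit !found }
    '
}
```

A possible use, with the container ID as a placeholder: `docker exec <container> netstat -ntl | port_listening 1234 && echo "1234 is open"`.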

But when running version 8.2, the network input on TCP port 1234 is open:

docker run -p 8000:8000 -e "SPLUNK_PASSWORD=<password>" -e "SPLUNK_START_ARGS=--accept-license" -e "SPLUNK_APPS_URL=http://<your-ip-here>:9000/network_input.tar.gz" -it splunk/splunk:8.2.10
[...]

docker exec -it a00251f88c57 /bin/bash
[ansible@a00251f88c57 splunk]$ netstat -ntlp
netstat: can't scan /proc - are you root?
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.1:8065          0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:1234            0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:9997            0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:8088            0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:8089            0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:8191            0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:8000            0.0.0.0:*               LISTEN      -

adityapinglesf commented 1 year ago

@hanswurscht We’ve moved away from SPLUNK_APPS_URL in favor of the App Framework we’ve built for k8s. I will get back to you with the workaround.

hanswurscht commented 1 year ago

Any news on this?

dfederschmidt commented 1 year ago

@hanswurscht

I looked into this and was able to reproduce the issue described here on Linux (a recent Ubuntu). Interestingly, I was unable to reproduce it on Docker for Mac (Apple Silicon) with Rosetta x86 emulation. I don't know exactly what's going on there, so I focused on Linux.

A potential workaround, and probably something the provisioning will need at some point, is a restart before provisioning completes, to ensure the port is open.

When I set SPLUNK_ANSIBLE_POST_TASKS to the following value, it triggers a restart at the end of provisioning. Once provisioning is complete, Splunk is restarted and the port is open.

docker run -p 8000:8000 -p 8089:8089 -e "SPLUNK_PASSWORD=<password>" -e "SPLUNK_START_ARGS=--accept-license" -e "SPLUNK_APPS_URL=http://<your-ip-here>:9000/network_input.tar.gz" -e SPLUNK_ANSIBLE_POST_TASKS=file:///opt/ansible/roles/splunk_common/handlers/restart_splunk.yml -it splunk/splunk:9.0.4.1 

Is that potentially something that could work for your scenario?
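
Because the port only opens after the final restart, a caller may want to poll for it rather than check once. Below is a small sketch of a generic retry helper; the `retry` and `check_port` names, the attempt count, and the delay are all assumptions for illustration and are not part of splunk-ansible.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds, sleeping DELAY
# seconds between tries. Returns 1 after ATTEMPTS failed runs.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while :; do
        "$@" && return 0
        [ "$i" -ge "$attempts" ] && return 1
        i=$((i + 1))
        sleep "$delay"
    done
}

# Hypothetical check: is PORT listening inside CONTAINER?
check_port() {
    docker exec "$1" netstat -ntl 2>/dev/null | grep -q ":$2 .*LISTEN"
}

# Example call (commented out; the container ID is a placeholder):
# retry 30 10 check_port <container> 1234
```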

hanswurscht commented 1 year ago

Thank you! I tried the workaround and it worked!

We would still like to see a proper solution directly in the ansible project here.