josephnoctum opened this issue 3 years ago
Found a workaround in one of the earlier tickets: https://github.com/splunk/docker-splunk/issues/209. Tested the chmod change on splunk/etc/auth and splunk/etc/auth/splunk.secret and that worked, so it seems there is still a hiccup with the user changing when updating a container.
This has actually led to a lot of problems. Before the upgrade, the owner was polkitd and the group was input for all the objects in the var and etc volumes. With the new container, the owner is glee and the group is 41812. The new container is running, but over half of the inputs and configuration pages I go to in the web interface just show a "loading" screen with a spinning wheel. So functionality is now broken for over half the apps I have installed, and I can't go back to the old image because it begins to start and then goes to a "Starting Unhealthy" status with nothing in the logs to indicate why.
For this I think you'll need to shell into the container and set the file ownership for everything related to Splunk. Once shelled in, check which user the splunk process runs as and make sure that user owns all the files, i.e. `chown -R splunk:splunk /opt/splunk`. Most Kubernetes admins running Splunk have an init container that just does this every time; persistent volumes and user permissions/ownership are always a pain when moving stuff around.
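A minimal sketch of the chown advice above, assuming a Linux host with GNU coreutils. The real paths would be under /opt/splunk; the demo below uses a scratch directory with numeric IDs so it runs without root and works even when the user name is missing from /etc/passwd:

```shell
# Inside the real container you would first check who runs splunkd:
#   ps -o user= -p "$(pgrep -x splunkd | head -n1)"
# and then fix ownership to match:
#   chown -R splunk:splunk /opt/splunk

# Demonstration of the same idea on a scratch directory:
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/etc/auth"
touch "$demo_dir/etc/auth/splunk.secret"
# Numeric uid:gid works even when no matching name exists in /etc/passwd.
chown -R "$(id -u):$(id -g)" "$demo_dir"
stat -c '%u:%g' "$demo_dir/etc/auth/splunk.secret"
```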
The Dockerfile for Splunk does set the gid/uid explicitly to 41812 (https://github.com/splunk/docker-splunk/blob/develop/splunk/common-files/Dockerfile#L53), but that user may already be in use in your environment?
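A quick way to check that UID-collision theory on the host is standard `getent` (nothing Splunk-specific here):

```shell
# If another account already owns UID/GID 41812 on the host, files written by
# the container will appear to belong to that account when viewed from the
# host (e.g. polkitd or glee in the report above). getent prints the matching
# entry and exits non-zero when the ID is unused.
getent passwd 41812 || echo "uid 41812 is unused on this host"
getent group 41812  || echo "gid 41812 is unused on this host"
```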
The chmod 777 suggestion in https://github.com/splunk/docker-splunk/issues/209 really should not be done. That makes the Splunk secret world-readable and world-writable. The problem is the owner, not the actual permissions on the files.
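Instead of `chmod 777`, fixing ownership plus restrictive mode bits keeps the secret private. A sketch on a scratch file (the real path is /opt/splunk/etc/auth/splunk.secret):

```shell
# Keep splunk.secret readable and writable only by its owner.
tmp=$(mktemp -d)
touch "$tmp/splunk.secret"
chmod 600 "$tmp/splunk.secret"     # rw for owner, nothing for group/other
stat -c '%a' "$tmp/splunk.secret"  # prints 600
```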
At this point I don't even see the reason to use the Docker container. It's supposed to make administration more manageable, but every time I've tried to update to the latest version of the container I get the same problem: constant restarts. We have a UID of 41812, so can I directly change this in the image? No, of course not, it's Docker. So I write a YAML file and `compose up`; doesn't help. I try setting the user in the variables when building the container; doesn't help. I've even completely wiped the persistent volumes, giving up on the idea that I could keep any data around, and it still never actually starts: it runs for about a minute and then reboots. The logs are completely useless as well.
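For reference, the standard Compose way to pin the runtime user is the `user:` key rather than a build-time variable. This is only a sketch: the volume names are assumptions, and whether the splunk/splunk image's ansible entrypoint tolerates an overridden user is exactly what this thread is about.

```yaml
services:
  splunk:
    image: splunk/splunk:latest
    user: "41812:41812"            # match the uid/gid baked into the image
    environment:
      SPLUNK_START_ARGS: "--accept-license"
    volumes:
      - splunk-etc:/opt/splunk/etc
      - splunk-var:/opt/splunk/var

volumes:
  splunk-etc:
  splunk-var:
```

If the container still restarts with ownership mismatches on the named volumes, a one-off `chown -R 41812:41812` on the volume contents (as suggested above) is usually still needed.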
```
TASK [splunk_common : Check if /opt/splunk/var/lib/splunk/kvstore/mongo/splunk.key exists] ***
ok: [localhost]
Tuesday 27 September 2022  00:10:09 +0000 (0:00:00.328)       0:00:10.821 ***
FAILED - RETRYING: Start Splunk via CLI (5 retries left).
FAILED - RETRYING: Start Splunk via CLI (4 retries left).
```
I've never actually seen a solution for anyone having this issue in any of the threads, and I'm tired of dealing with it. It's way more hassle than it's worth. I'm just going to use Cribl for heavy forwarders now.
I have an old Splunk image running 7.2.5 as a heavy forwarder that I'm trying to upgrade to the latest image. Docker is using Swarm as the orchestrator. After starting, the new image never stays running; it is constantly in a cycle of restarting. Looking at the tail of the logs I see:
This looks similar to another issue I found here, where the problem was with mounting the existing volumes and the solution was to make some corrections in the Kubernetes YAML. I'm not running Kubernetes though, and haven't found where to correct this issue yet.