splunk / splunk-operator

Splunk Operator for Kubernetes

App Framework: Applications not being reinstalled if node with ebs was deleted #1067

Open yaroslav-nakonechnikov opened 1 year ago

yaroslav-nakonechnikov commented 1 year ago

Please select the type of request

Enhancement

Tell us more

Describe the request: After a pod is terminated (for whatever reason), apps are not reinstalled until a new version is published.

Expected behavior: When any pod starts, it checks for apps and installs them if needed.

Splunk setup on K8S: Kubernetes on EKS

Reproduction/Testing steps:
1. Install an application through the App Framework.
2. Delete the pod together with its storage.
3. Wait until the pod starts again.

Proposed changes (optional): It would be useful if this worked like defining a list in defaults.yml, where you can specify what should be installed. Perhaps add a flag such as reinstall_apps_on_creation: <bool>, which forces app installation when true and leaves the current behavior unchanged when false.
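The requested semantics of the flag could look roughly like the sketch below. This only illustrates the proposal and is not the operator's actual logic: `should_install` and its parameters are hypothetical names, and only `reinstall_apps_on_creation` comes from the request above.

```python
def should_install(already_installed: bool,
                   package_updated: bool,
                   pod_just_created: bool,
                   reinstall_apps_on_creation: bool) -> bool:
    """Hypothetical decision rule for the proposed flag."""
    # Behavior as described in this issue: install when the app was never
    # installed, or when a new package version has been published.
    if not already_installed or package_updated:
        return True
    # Proposed addition: a freshly created pod forces a reinstall when the
    # flag is true; with the flag false, nothing changes.
    return pod_just_created and reinstall_apps_on_creation


# A recreated pod whose app is already marked as installed:
print(should_install(True, False, True, reinstall_apps_on_creation=True))
print(should_install(True, False, True, reinstall_apps_on_creation=False))
```

With the flag set to true the recreated pod gets its apps again; with false the call falls back to the update-only trigger.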

yaroslav-nakonechnikov commented 1 year ago

also,

kubectl patch cm/splunk-splunk-operator-manual-app-update --type merge -p '{"data":{"ClusterManager":"status: on\nrefCount: 1"}}'

doesn't do the trick: apps are still not being installed.
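For anyone retrying this, the merge-patch payload from the command above can be generated so that the embedded newline is escaped correctly. This is a minimal sketch using only the Python standard library: the ConfigMap name and the `ClusterManager` key are taken from the command above, while `build_manual_update_patch` is a hypothetical helper name. It only prints the command and does not run it, and it does not change the outcome reported here.

```python
import json
import shlex


def build_manual_update_patch(kind: str, ref_count: int = 1) -> str:
    """Return the JSON merge patch for the manual app update ConfigMap."""
    # The ConfigMap value is a two-line string; json.dumps escapes the
    # newline as \n, as in the hand-written command.
    value = f"status: on\nrefCount: {ref_count}"
    return json.dumps({"data": {kind: value}}, separators=(",", ":"))


patch = build_manual_update_patch("ClusterManager")
cmd = [
    "kubectl", "patch", "cm/splunk-splunk-operator-manual-app-update",
    "--type", "merge", "-p", patch,
]
# Print a shell-quoted command line for copy and paste.
print(shlex.join(cmd))
```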

yaroslav-nakonechnikov commented 1 year ago

Is there any command that can be run to force-install apps?

sgontla commented 1 year ago

@iaroslav-nakonechnikov, in the case of a Search Head Cluster or an Indexer Cluster, when a pod is terminated and a new pod is started, the new pod gets the app packages through the bundles. So the App Framework needs persistent storage for both the deployer and the Cluster Manager to cover the reset scenarios.

The same applies to standalone (assuming this is the case you are exploring?). For standalone, app install status is tracked per pod, so once a particular app is marked as installed, the next install is triggered only by an app package update. A persistent volume is therefore a prerequisite in this case as well.

yaroslav-nakonechnikov commented 1 year ago

@sgontla I understand what you are saying. It will work as long as there are no issues with persistent storage. But this persistent storage might break, for example because of human error.

So there needs to be a way to force a reinstall of all applications.

marcispauls commented 1 year ago

I agree with @iaroslav-nakonechnikov: infrastructure breaks from time to time, and there is no way to force a re-sync of apps.

yaroslav-nakonechnikov commented 1 year ago

How is this going? We need a way to reinstall apps.

DrMeosch commented 6 months ago

Hi guys,

I have the following use case. I run Splunk as a Distributed Clustered Deployment (Single Site) on AKS, using Azure Blob as SmartStore. When a node is gone and the pods are recreated, the app containing the SmartStore configuration is not installed. As a result, the indexes are missing entirely and the indexers throw "..Received event for unconfigured/disabled/deleted index.." errors. This happens despite having Persistent Storage for the CM.

Being able to set up the SmartStore config directly in the Helm chart, as well as having a way to reinstall the apps, would probably avoid this issue.

I hope to hear from you soon. Have a nice day!