Open gjanders opened 7 months ago
Agreed, it adds issues.
We set the startup probe timeout to 5 minutes. That postpones the checks until just after the search heads are online, so IPs are assigned and the deployer can work with them. failureThreshold is set to a really high number (50+, depending on the check period), which gives the deployer time to finish all its tasks.
The downside is that the logs will report a non-working deployer during that window, but we ignore those messages and check only the restart reasons.
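As a rough illustration of how much time such settings give the deployer: Kubernetes lets a container fail its startup probe up to failureThreshold times, one check every periodSeconds, after any initial delay. The sketch below assumes the 5 minutes above is an initial delay and uses a 10-second period (the Kubernetes default); neither value is confirmed by the comment, and the function name is made up for this example.

```python
def startup_budget_seconds(initial_delay_s: int, period_s: int, failure_threshold: int) -> int:
    """Approximate time a container has to pass its startup probe before a restart.

    Assumption: budget = initial delay + (period * failure threshold).
    """
    return initial_delay_s + period_s * failure_threshold


# Assumed values: 5 min delay, 10 s period, failureThreshold of 50.
budget = startup_budget_seconds(initial_delay_s=300, period_s=10, failure_threshold=50)
print(f"deployer has roughly {budget} s to become healthy")
```

A longer period or a higher threshold widens the window in the same proportion, which is why the threshold "depends on the period of the check".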
Now logged as CSPL-2594
Please select the type of request
Enhancement
Tell us more
Describe the request Currently the readiness probe used in a Splunk search head cluster only tests whether port 8089 is up: if it is, the instance is "ready"; if not, it is not ready. I'd like this to be customizable so that the probe excludes nodes that are in manual (or automatic) detention.
Expected behavior The probe should check the status of the member. For example, it could query the endpoint https://localhost:8089/services/shcluster/member/ready and treat a response without errors as successful.
A response such as:
Would be successful/search head is ready for traffic, a response such as:
Would result in that search head not receiving new traffic
Ideally this would be a switch/parameter in case someone wants to send traffic to members in detention.
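The behavior requested above could look roughly like the sketch below. It is not the operator's implementation. It assumes that the endpoint signals "not ready" with a non-2xx HTTP status, it leaves out authentication (which the management port will normally require), and it skips certificate verification for localhost. The names `fetch_status`, `is_ready`, and `respect_detention` are hypothetical.

```python
import ssl
import urllib.error
import urllib.request

# Endpoint taken from the request text above.
READY_URL = "https://localhost:8089/services/shcluster/member/ready"


def fetch_status(url: str = READY_URL, timeout: float = 5.0):
    """Return the HTTP status code, or None if the port cannot be reached."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False       # assumption: self-signed cert on localhost
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=ctx) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code              # server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None                  # nothing is answering on the port


def is_ready(status, respect_detention: bool = True) -> bool:
    """Decide readiness from the status code returned by fetch_status."""
    if status is None:
        return False                 # port down: never ready
    if not respect_detention:
        return True                  # switch off: port-only check, as today
    return 200 <= status < 300       # assumption: errors surface as non-2xx
```

With `respect_detention=False` the check falls back to "the port answered", which matches the current probe and covers the case where someone wants traffic sent to members in detention.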
Splunk setup on K8S Splunk search head clusters will have this feature, and only search head clusters...
Reproduction/Testing steps Any search head cluster has this feature; you can manually put a node in detention as per "Put a search head cluster member into detention".
K8s environment N/A
Proposed changes(optional) Provide either a flag or a new default for the SHC CRD so that the readiness probe checks the search head status and treats members in manual detention as "not ready".
K8s collector data(optional) N/A
Additional context(optional) I've raised the related issue https://github.com/splunk/splunk-operator/issues/1321