Open yaroslav-nakonechnikov opened 3 weeks ago
it was already reported before #1260, and then there were 2 calls, where i described why logic with dependency is broken for kubernetes deployment.
and now can test 2.6.1 and we still see, that part of platform can't be started just because of problematic logic. old case: 3448046
@yaroslav-nakonechnikov we will get back to you with regard to this issue.
so, sadly, this is extremely painful, as there maybe issues like that:
FAILED - RETRYING: [localhost]: Initialize SHC cluster config (2 retries left).
FAILED - RETRYING: [localhost]: Initialize SHC cluster config (1 retries left).
TASK [splunk_search_head : Initialize SHC cluster config] **********************
fatal: [localhost]: FAILED! =>
{ "attempts": 60, "changed": false, "cmd": [ "/opt/splunk/bin/splunk", "init", "shcluster-config", "-auth", "admin:j3Q9SWJlLBOlc3RWejMnUb6e", "-mgmt_uri", "https://splunk-e-43345-search-head-1.splunk-e-43345-search-head-headless.splunk-operator.svc.cluster.local:8089", "-replication_port", "9887", "-replication_factor", "3", "-conf_deploy_fetch_url", "https://splunk-e-43345-deployer-service:8089", "-secret", "RNr25biFMA4Z3SUbXB3VGwW6", "-shcluster_label", "she_cluster" ], "delta": "0:00:00.806237", "end": "2024-10-31 08:05:54.588881", "rc": 24, "start": "2024-10-31 08:05:53.782644" }
STDERR:
WARNING: Server Certificate Hostname Validation is disabled. Please see server.conf/[sslConfig]/cliVerifyServerName for details.
Login failed
MSG:
non-zero return code
PLAY RECAP *********************************************************************
localhost : ok=132 changed=11 unreachable=0 failed=1 skipped=68 rescued=0 ignored=0
problem in that section, as i understand: https://github.com/splunk/splunk-ansible/blob/53a9a70897896e279b43478583b13256e75894a2/roles/splunk_search_head/tasks/search_head_clustering.yml#L6
and search heads in infinitive loop, which leads to none of indexers are started.
it happened on splunk-operator 2.6.1 and splunk 9.1.6
extremly strange that standalone instance started without issues:
NAME READY STATUS RESTARTS AGE
splunk-43345-cluster-manager-0 1/1 Running 1 (70m ago) 79m
splunk-43345-license-manager-0 1/1 Running 0 79m
splunk-c-43345-standalone-0 1/1 Running 0 79m
splunk-e-43345-deployer-0 0/1 Running 0 66m
splunk-e-43345-search-head-0 0/1 Running 3 (14m ago) 65m
splunk-e-43345-search-head-1 0/1 Running 3 (14m ago) 65m
splunk-e-43345-search-head-2 0/1 Running 3 (14m ago) 65m
splunk-operator-controller-manager-5c684d667d-smgdq 2/2 Running 0 80m
and with this test i can confirm, that 9.1.6 is not working at all.
Please select the type of request
Bug
Tell us more
Describe the request
and then:
this is unbelievable, and extremely strange that still, in 2.6.1 there is a dependency check between splunk search-heads and indexers!!!!
Expected behavior Indexers should start without dependency of search-heads!