Open yaroslav-nakonechnikov opened 1 year ago
According to the documentation at https://splunk.github.io/docker-splunk/ADVANCED.html#:~:text=The%20purpose%20of%20the%20default.yml%20is%20to%20define,members%20of%20the%20cluster%20%28ex.%20keys%2C%20passwords%2C%20secrets%29. there is a nice example:
password: "{{ splunk_password | default(<password>) }}"
So I thought some Jinja functions should work, and tried something like:
fqdn : "{% if getenv("SPLUNK_ROLE") == "splunk_search_head" %}https://shc.${splunk_domain}{% else %}https://shc-deployer.${splunk_domain}{% endif %}"
But it does not work, failing with:
yaml.scanner.ScannerError: while scanning for the next token found character '%' that cannot start any token in "<unicode string>", line 27, column 21:
fqdn : {% if getenv("SPLUNK_ROLE") == "s ... ^
[WARNING]: * Failed to parse /opt/ansible/inventory/environ.py with ini plugin: /opt/ansible/inventory/environ.py:16: Expected key=value host variable assignment, got: __future__
[WARNING]: Unable to parse /opt/ansible/inventory/environ.py as an inventory source
I found a workaround:
fqdn : >
  {% if lookup('ansible.builtin.env', 'SPLUNK_ROLE') == "splunk_search_head" %}
  https://shc.${splunk_domain}
  {% else %}
  https://shc-deployer.${splunk_domain}
  {% endif %}
But I would like to know whether there is a better, official way to do that.
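One possible single-line variant of the workaround above, offered as a sketch rather than an official method: an inline Jinja conditional inside a double-quoted YAML scalar, using only single quotes inside the expression so they do not clash with the YAML quoting. It reuses the env lookup and the ${splunk_domain} placeholder from the snippets above, and assumes that placeholder is substituted before Ansible reads the file. The folded (>) form can leave spaces around the rendered URL, because YAML folds the line breaks into spaces; keeping everything on one line avoids that.

```yaml
# Sketch only, not confirmed as the officially supported approach.
# Assumes ${splunk_domain} is replaced by an external templating step
# before this file reaches Ansible.
fqdn: "https://{{ 'shc' if lookup('ansible.builtin.env', 'SPLUNK_ROLE') == 'splunk_search_head' else 'shc-deployer' }}.${splunk_domain}"
```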
CSPL-2152
This one has become critical.
Real case: when there is a list of apps to install, the deployer needs a long time to bring its pod into the Running state. Defining a StartupProbe with a large timeout also affects the search head nodes, so the SH nodes can't get an IP assigned until the startup probe starts to work.
Increasing the threshold leads to another problem: a real issue won't be detected fast enough.
Another thing we found:
When the deployer starts, it connects to the deployment server and downloads apps. Meanwhile the nodes proceed further, and then the deployer gets stuck on:
TASK [splunk_deployer : Wait for SHC to be ready] ******************************
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: Exception: SHC failure, setup not complete. online_peers:['05BB34E3-9B8F-4916-A60A-493D4534047F', 'B7CE74CB-138D-4E4F-9A6C-B4DB791C155D']
fatal: [localhost]: FAILED! => {
"attempts": 60,
"changed": false,
"rc": 1
}
MSG:
MODULE FAILURE
See stdout/stderr for the exact error
MODULE_STDERR:
Traceback (most recent call last):
File "/home/splunk/.ansible/tmp/ansible-tmp-1709656969.8714278-4953-235691734405253/AnsiballZ_shc_ready.py", line 100, in <module>
_ansiballz_main()
File "/home/splunk/.ansible/tmp/ansible-tmp-1709656969.8714278-4953-235691734405253/AnsiballZ_shc_ready.py", line 92, in _ansiballz_main
invoke_module(zipped_mod, temp_path, ANSIBALLZ_PARAMS)
File "/home/splunk/.ansible/tmp/ansible-tmp-1709656969.8714278-4953-235691734405253/AnsiballZ_shc_ready.py", line 41, in invoke_module
run_name='__main__', alter_sys=True)
File "/usr/lib/python3.7/runpy.py", line 205, in run_module
return _run_module_code(code, init_globals, run_name, mod_spec)
File "/usr/lib/python3.7/runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/tmp/ansible_shc_ready_payload_nh5z9sh5/ansible_shc_ready_payload.zip/ansible/modules/shc_ready.py", line 55, in <module>
File "/tmp/ansible_shc_ready_payload_nh5z9sh5/ansible_shc_ready_payload.zip/ansible/modules/shc_ready.py", line 50, in main
File "/tmp/ansible_shc_ready_payload_nh5z9sh5/ansible_shc_ready_payload.zip/ansible/modules/shc_ready.py", line 37, in run
Exception: SHC failure, setup not complete. online_peers:['05BB34E3-9B8F-4916-A60A-493D4534047F', 'B7CE74CB-138D-4E4F-9A6C-B4DB791C155D']
PLAY RECAP *********************************************************************
localhost : ok=137 changed=20 unreachable=0 failed=1 skipped=64 rescued=0 ignored=0
Running splunk resync shcluster-replicated-config manually on the deployer allows this check to pass.
Please select the type of request
Bug
Tell us more
Describe the request
At the moment the definition used to create search heads is written in a single CRD, which creates at least 4 pods.
We are using the PingID integration, which requires defining the fqdn setting, the address the browser is redirected to after a successful login. By default the hostname is used, which is not accessible from client PCs. If we provide a defaults.yml with the correct PingID config including fqdn, a problem arises: only one domain name can be defined.
Expected behavior
There should be a way to define a default config for the search head deployer and a custom config for the search head nodes.
Splunk setup on K8S EKS on AWS
Reproduction/Testing steps
With the following defaults.yml submitted to the search heads CRD, only one fqdn can be used. So the problem is that we need to know how to provide:
fqdn : https://shc-deployer.26981.cmp-prj-dev.internal.cmpgroup.cloud
- for the deployer
fqdn : https://shc.26981.cmp-prj-dev.internal.cmpgroup.cloud
- for the search heads