canonical / opensearch-snap

OpenSearch Snap
Apache License 2.0

Snap will not pick up reset kernel parameters after reboot. #39

Closed ethanmye-rs closed 3 months ago

ethanmye-rs commented 1 year ago

Steps to reproduce

  1. Run tutorial, using lxd backing cloud.
  2. Reboot
  3. Because the reboot reset the kernel parameters, set them again with sudo sysctl -w vm.max_map_count=262144 vm.swappiness=0 net.ipv4.tcp_retries2=5
  4. Verify kernel parameters actually changed: sudo sysctl -a | grep -E 'swappiness|max_map_count|tcp_retries2'
  5. Result: Juju stays stuck with the error message 'net.ipv4.tcp_retries2 should be 5'
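
The check in step 4 can also be scripted so that any mismatch is reported explicitly. This is a minimal sketch, not something from the tutorial or the charm: check_param is a hypothetical helper, and PROC_SYS is an assumed override that is only there so the function can be pointed at a test directory. It reads the live values from /proc/sys, which is where sysctl itself gets them.

```shell
#!/bin/sh
# check_param (hypothetical helper): compare one kernel parameter's live
# value, read from procfs, with the value we expect.
# PROC_SYS may be overridden for testing; it defaults to /proc/sys.
check_param() {
    key=$1
    expected=$2
    # sysctl keys map to paths by replacing dots with slashes, e.g.
    # net.ipv4.tcp_retries2 -> /proc/sys/net/ipv4/tcp_retries2
    path="${PROC_SYS:-/proc/sys}/$(printf '%s' "$key" | tr '.' '/')"
    if ! actual=$(cat "$path" 2>/dev/null); then
        echo "UNREADABLE $key ($path)"
        return 2
    fi
    if [ "$actual" = "$expected" ]; then
        echo "OK $key=$actual"
    else
        echo "MISMATCH $key: expected $expected, got $actual"
        return 1
    fi
}

# Check the three parameters from step 3; remember whether any check failed.
status=0
check_param vm.max_map_count 262144 || status=1
check_param vm.swappiness 0 || status=1
check_param net.ipv4.tcp_retries2 5 || status=1
```

Running it inside the container as well as on the host shows which of the two still has the old value.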

I've exposed the opensearch service, but I don't think that matters here.

Expected behavior

The tutorial (and my experience) says this should work OK. The other parameters are successfully set. A reboot of the container results in the same error message.

Actual behavior

Juju does not recognize net.ipv4.tcp_retries2 as being reset. I've tried rebooting the containers without any luck. I suspect this is more of a juju problem than an opensearch problem, but it would be nice to figure out how to unstick juju.

Versions

opensearch: status active, scale 1, charm opensearch, channel 2/edge, revision 26, exposed yes
Operating system: 22.04

Happy to provide any logs required.

github-actions[bot] commented 1 year ago

https://warthogs.atlassian.net/browse/DPE-2170

Mehdi-Bendriss commented 3 months ago

Hi @ethanmye-rs, thanks for reporting, and apologies for the delay in responding. We are only now getting around to cleaning up old issues.

Kernel parameters set with sysctl -w are ephemeral and are reset on reboot. To persist them across reboots, set them in /etc/sysctl.conf, for example:

On the host:

sudo tee -a /etc/sysctl.conf > /dev/null <<EOT
vm.max_map_count=262144
vm.swappiness=0
net.ipv4.tcp_retries2=5
fs.file-max=1048576
EOT

sudo sysctl -p
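
One thing to watch: tee -a appends every time it runs, so repeating the command leaves duplicate lines in /etc/sysctl.conf. Below is a minimal sketch of an idempotent variant. The ensure_line helper and the SYSCTL_CONF variable are hypothetical names introduced for illustration, not part of the README or the charm. By default the sketch writes to a scratch file so it can be tried safely.

```shell
#!/bin/sh
# ensure_line (hypothetical helper): append a line to a file only if that
# exact line is not already present.
ensure_line() {
    file=$1
    line=$2
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Target file (assumption): defaults to a scratch file in the current directory.
# Set SYSCTL_CONF=/etc/sysctl.conf and run as root to change the real file.
SYSCTL_CONF=${SYSCTL_CONF:-./sysctl.conf.test}

for setting in \
    vm.max_map_count=262144 \
    vm.swappiness=0 \
    net.ipv4.tcp_retries2=5 \
    fs.file-max=1048576
do
    ensure_line "$SYSCTL_CONF" "$setting"
done
```

The helper only matches whole lines, so it does not replace an existing entry that sets the same key to a different value. After changing the real file, sudo sysctl -p is still needed to load the values.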

For the juju units when creating the model:

cat <<EOF > cloudinit-userdata.yaml
cloudinit-userdata: |
  postruncmd:
    # String form, so that a shell interprets the >> redirection.
    - echo 'vm.max_map_count=262144' >> /etc/sysctl.conf
    - echo 'vm.swappiness=0' >> /etc/sysctl.conf
    - echo 'net.ipv4.tcp_retries2=5' >> /etc/sysctl.conf
    - echo 'fs.file-max=1048576' >> /etc/sysctl.conf
    - [ 'sysctl', '-p' ]
EOF

juju model-config --file=./cloudinit-userdata.yaml

We had already updated the README, but it looks like we missed the Discourse docs, which I have now updated.

Please reopen this issue if it persists.