opensearch-project / ansible-playbook

🤖 A community repository for Ansible Playbook of OpenSearch Project.
https://opensearch.org/
Apache License 2.0

[BUG] upgrading opensearch cluster (eg. from 2.2.1 to 2.3.0) #95

Open erikrs opened 2 years ago

erikrs commented 2 years ago

Describe the bug

When upgrading OpenSearch, e.g. from 2.2.1 to 2.3.0, OpenSearch fails to restart with a "jar hell" error.

To Reproduce

  1. A full (multi-node) OpenSearch installation was previously deployed with the playbook, e.g. with os_version: "2.2.1"

  2. Change os_version in all.yml to "2.3.0"

  3. Run the playbook

  4. See the error on the server:

journalctl -u opensearch
Sep 15 11:06:29 [redacted] systemd[1]: Stopping opensearch...
Sep 15 11:06:29 [redacted] systemd[1]: opensearch.service: Deactivated successfully.
Sep 15 11:06:29 [redacted] systemd[1]: Stopped opensearch.
Sep 15 11:06:29 [redacted] systemd[1]: opensearch.service: Consumed 28.873s CPU time.
Sep 15 11:06:29 [redacted] systemd[1]: Started opensearch.
Sep 15 11:06:32 [redacted] opensearch[28098]: WARNING: A terminally deprecated method in java.lang.System has been called
Sep 15 11:06:32 [redacted] opensearch[28098]: WARNING: System::setSecurityManager has been called by org.opensearch.bootstrap.OpenSearch (file:/usr/share/opensearch>
Sep 15 11:06:32 [redacted] opensearch[28098]: WARNING: Please consider reporting this to the maintainers of org.opensearch.bootstrap.OpenSearch
Sep 15 11:06:32 [redacted] opensearch[28098]: WARNING: System::setSecurityManager will be removed in a future release
Sep 15 11:06:35 [redacted] opensearch[28098]: uncaught exception in thread [main]
Sep 15 11:06:35 [redacted] opensearch[28098]: java.lang.IllegalStateException: jar hell!
Sep 15 11:06:35 [redacted] opensearch[28098]: class: org.opensearch.tools.launchers.JvmErgonomics
Sep 15 11:06:35 [redacted] opensearch[28098]: jar1: /usr/share/opensearch/lib/opensearch-launchers-2.3.0.jar
Sep 15 11:06:35 [redacted] opensearch[28098]: jar2: /usr/share/opensearch/lib/opensearch-launchers-2.2.1.jar
Sep 15 11:06:35 [redacted] opensearch[28098]:         at org.opensearch.bootstrap.JarHell.checkClass(JarHell.java:314)
Sep 15 11:06:35 [redacted] opensearch[28098]:         at org.opensearch.bootstrap.JarHell.checkJarHell(JarHell.java:213)
Sep 15 11:06:35 [redacted] opensearch[28098]:         at org.opensearch.bootstrap.JarHell.checkJarHell(JarHell.java:100)
Sep 15 11:06:35 [redacted] opensearch[28098]:         at org.opensearch.bootstrap.Bootstrap.setup(Bootstrap.java:227)
Sep 15 11:06:35 [redacted] opensearch[28098]:         at org.opensearch.bootstrap.Bootstrap.init(Bootstrap.java:404)
Sep 15 11:06:35 [redacted] opensearch[28098]:         at org.opensearch.bootstrap.OpenSearch.init(OpenSearch.java:180)
Sep 15 11:06:35 [redacted] opensearch[28098]:         at org.opensearch.bootstrap.OpenSearch.execute(OpenSearch.java:171)
Sep 15 11:06:35 [redacted] opensearch[28098]:         at org.opensearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:104)
Sep 15 11:06:35 [redacted] opensearch[28098]:         at org.opensearch.cli.Command.mainWithoutErrorHandling(Command.java:138)
Sep 15 11:06:35 [redacted] opensearch[28098]:         at org.opensearch.cli.Command.main(Command.java:101)
Sep 15 11:06:35 [redacted] opensearch[28098]:         at org.opensearch.bootstrap.OpenSearch.main(OpenSearch.java:137)
Sep 15 11:06:35 [redacted] opensearch[28098]:         at org.opensearch.bootstrap.OpenSearch.main(OpenSearch.java:103)
Sep 15 11:06:35 [redacted] opensearch[28098]: For complete error details, refer to the log at /usr/share/opensearch/logs/aiv-cluster.log

Playbook Name Specify the Playbook which is affected?

Screenshots N/A

Host/Environment (please complete the following information):

Additional context

This is probably because the files from the new tarball are extracted into the existing os_home directory and land next to the older files, which causes the "jar hell" problem.
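
One way to check this hypothesis before the service is restarted is to look for more than one launcher jar in the lib directory. The following is a minimal sketch, assuming the install path shown in the log above (/usr/share/opensearch). The task names and the launcher_jars variable are made up for illustration and are not part of the playbook.

```yaml
# Sketch only: the path comes from the journalctl output above;
# task names and the registered variable are hypothetical.
- name: Find launcher jars in the OpenSearch lib directory
  ansible.builtin.find:
    paths: /usr/share/opensearch/lib
    patterns: "opensearch-launchers-*.jar"
  register: launcher_jars

- name: Fail early if launcher jars from more than one version are present
  ansible.builtin.assert:
    that:
      - launcher_jars.files | length <= 1
    fail_msg: >-
      Multiple launcher jars found:
      {{ launcher_jars.files | map(attribute='path') | join(', ') }}
```

If the assertion fails, the listed paths should show both the old and the new version side by side, as in the jar1/jar2 lines of the stack trace.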

prudhvigodithi commented 1 year ago

Adding @rodolfovillordo. Can you add your thoughts here?

prudhvigodithi commented 1 year ago

Hey @erikrs do you continue to see this error?

Kampfmoehre commented 1 year ago

I just encountered the same problem when I tried to upgrade an old install, and I think it is related to #25.

prudhvigodithi commented 1 year ago

Hey @Kampfmoehre thanks, can you please contribute a fix? @peterzhuamazon @bbarani

Kampfmoehre commented 1 year ago

I don't have a fix, unfortunately. I opened the ticket a year ago when we wanted to replace Elasticsearch with OpenSearch, but we had difficulties with it, so we stuck with ES until now, when I am trying once again to make OpenSearch work. This time I simply erased the whole OpenSearch directory before installing it again. I don't think that is feasible for this playbook, but I also don't know enough about OpenSearch to say which files should be cleaned up before updating it. I remember from last time that clearing the plugin directory alone was not enough, though I don't remember exactly which other directories were the problem.

dezzzm commented 11 months ago

Hello. I ran into a similar problem when upgrading from 2.8.0 to 2.10.0. Removing three directories before the upgrade fixed it for me.

You can add a task to the beginning of main.yml:

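# Remove the directories that hold version-specific jars, so that files from
# the previous release cannot collide with those from the new tarball.
- name: Clear opensearch directories
  ansible.builtin.file:
    state: absent
    path: "{{ item }}"
  loop:
    - /usr/share/opensearch/lib/
    - /usr/share/opensearch/modules/
    - /usr/share/opensearch/plugins/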
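
To avoid wiping these directories on every run, the cleanup could be skipped when the target version is already installed. The following is a minimal sketch, assuming the playbook's os_version variable and the jar naming seen in the log above (opensearch-launchers-<version>.jar). The registered variable name is hypothetical, and the sketch presumes that the playbook extracts the tarball again after these tasks.

```yaml
# Sketch only: relies on the launcher jar name seen in the stack trace;
# target_launcher_jar is a hypothetical variable name.
- name: Check whether the launcher jar for the target version exists
  ansible.builtin.stat:
    path: "/usr/share/opensearch/lib/opensearch-launchers-{{ os_version }}.jar"
  register: target_launcher_jar

- name: Clear version-specific directories only when the version changes
  ansible.builtin.file:
    state: absent
    path: "{{ item }}"
  loop:
    - /usr/share/opensearch/lib/
    - /usr/share/opensearch/modules/
    - /usr/share/opensearch/plugins/
  when: not target_launcher_jar.stat.exists
```

Note that removing the plugins directory also deletes any plugins that were installed there separately, so those would need to be reinstalled after the upgrade.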