opensearch-project / performance-analyzer-rca

The Performance Analyzer RCA is a framework that builds on the Performance Analyzer engine to support root cause analysis (RCA) of performance and reliability problems for OpenSearch instances.
https://opensearch.org/docs/latest/monitoring-plugins/pa/rca/index/
Apache License 2.0

[BUG] Performance Analyzer webserver on port 9600 not responding to any API calls (caused by JDK upgrade?) #545

Open borutlukic opened 5 months ago

borutlukic commented 5 months ago

What is the bug? The OpenSearch Performance Analyzer app throws an exception on startup and does not work.

How can one reproduce the bug? Steps to reproduce the behavior:

  1. Run OpenSearch from the Docker image opensearchproject/opensearch:2.12.0 and expose port 9600
  2. GET localhost:9600/_plugins/_performanceanalyzer/metrics
  3. The request hangs indefinitely (i.e., no response is ever returned)
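A request that never returns is awkward to script against, so the reproduction can be wrapped in a client with an explicit timeout. The following is only a minimal sketch using the standard java.net.http client; the class name MetricsProbe, the probe helper, and the 5-second timeout are illustrative assumptions, not part of Performance Analyzer.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper: turns "hangs indefinitely" into a definite result.
final class MetricsProbe {

    /** Returns the HTTP status code, or -1 if no response arrived within the timeout. */
    static int probe(URI uri, Duration timeout) throws InterruptedException {
        HttpClient client = HttpClient.newBuilder().connectTimeout(timeout).build();
        HttpRequest request = HttpRequest.newBuilder(uri).timeout(timeout).GET().build();
        try {
            return client.send(request, HttpResponse.BodyHandlers.discarding()).statusCode();
        } catch (IOException e) {
            // Timeouts and connection failures both surface as IOException subclasses.
            return -1;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // URL taken from step 2 of the reproduction; the timeout value is an assumption.
        URI uri = URI.create("http://localhost:9600/_plugins/_performanceanalyzer/metrics");
        int status = probe(uri, Duration.ofSeconds(5));
        System.out.println(status < 0 ? "no response within timeout" : "HTTP " + status);
    }
}
```

Run against an affected container, a probe like this should give up after the timeout instead of blocking forever.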

What is the expected behavior? Expected is that the webserver API on port 9600 returns a response.

What is your host/environment?

Do you have any additional context? See logs/PerformanceAnalyzer.log and notice:

Exception in thread "main" java.lang.IllegalArgumentException: cannot add context to list
        at jdk.httpserver/sun.net.httpserver.ContextList.add(ContextList.java:37)
        at jdk.httpserver/sun.net.httpserver.ServerImpl.createContext(ServerImpl.java:276)
        at jdk.httpserver/sun.net.httpserver.HttpServerImpl.createContext(HttpServerImpl.java:74)
        at jdk.httpserver/sun.net.httpserver.HttpServerImpl.createContext(HttpServerImpl.java:39)
        at org.opensearch.performanceanalyzer.PerformanceAnalyzerApp.createClientServers(PerformanceAnalyzerApp.java:354)
        at org.opensearch.performanceanalyzer.PerformanceAnalyzerApp.createClientServers(PerformanceAnalyzerApp.java:319)
        at org.opensearch.performanceanalyzer.PerformanceAnalyzerApp.main(PerformanceAnalyzerApp.java:112)

Checking the source code, it is clear that the same handler is registered twice under the same path, so the createContext function throws an exception on the second attempt. Code from PerformanceAnalyzerApp.java:

            httpServer.createContext(Util.METRICS_QUERY_URL, queryMetricsRequestHandler);
            httpServer.createContext(
                    Util.LEGACY_OPENDISTRO_METRICS_QUERY_URL, queryMetricsRequestHandler);

The Util.METRICS_QUERY_URL and Util.LEGACY_OPENDISTRO_METRICS_QUERY_URL are equal.
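One way to make such a registration tolerant of equal paths is to remember which paths have already been registered and skip repeats. This is a hedged sketch only, not the project's actual fix: ContextRegistrar and its methods are hypothetical names, and the two path constants in main are stand-ins that mirror the situation described above (both equal to the path from the reproduction steps).

```java
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical wrapper that registers each context path at most once.
final class ContextRegistrar {
    private final HttpServer server;
    private final Set<String> registeredPaths = new LinkedHashSet<>();

    ContextRegistrar(HttpServer server) {
        this.server = server;
    }

    /** Creates a context for the path unless one was already registered; returns true if created. */
    boolean register(String path, HttpHandler handler) {
        if (!registeredPaths.add(path)) {
            return false; // duplicate path: skip instead of letting createContext throw
        }
        server.createContext(path, handler);
        return true;
    }

    Set<String> paths() {
        return registeredPaths;
    }

    public static void main(String[] args) throws IOException {
        // Stand-ins for Util.METRICS_QUERY_URL and Util.LEGACY_OPENDISTRO_METRICS_QUERY_URL (assumed equal).
        String metricsUrl = "/_plugins/_performanceanalyzer/metrics";
        String legacyUrl = "/_plugins/_performanceanalyzer/metrics";

        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.start();
        ContextRegistrar registrar = new ContextRegistrar(server);
        HttpHandler handler = exchange -> exchange.close();
        System.out.println("first registration created: " + registrar.register(metricsUrl, handler));
        System.out.println("second registration created: " + registrar.register(legacyUrl, handler));
        server.stop(0);
    }
}
```

The guard keeps startup independent of whether the underlying JDK rejects duplicate paths, at the cost of silently ignoring the second registration.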

tophercullen commented 5 months ago

Ran into this issue with the official docker containers. Tried 2.12 and 2.13, same problem. PA isn't usable. 2.11.1 appears to work fine.

I feel like this is a config issue, but I don't know where. It's hard to imagine that no one has even tried to use PA in two whole releases.

borutlukic commented 5 months ago

It is not a configuration issue; the code is broken. I think it started showing up because the JRE was upgraded in the 2.12 container. The old JRE had a bug in HttpServer: it did not behave as documented, so the above code worked. The latest JRE fixes that, and adding the same context twice now throws an exception on the second attempt. The two variables that define the context are hardcoded to the same value in the code and are not configurable.
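Whether a given JRE rejects a duplicate context can be checked directly, without OpenSearch. The sketch below registers the same path twice on a throwaway com.sun.net.httpserver.HttpServer and reports whether the second call throws; DuplicateContextProbe and the "/metrics" path are illustrative names only. No particular result is claimed here for any specific JDK build; the thread's reports suggest the outcome differs between the JDK 17 and JDK 21 images.

```java
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;

// Hypothetical probe: does this JRE's HttpServer reject a second context on the same path?
final class DuplicateContextProbe {

    static boolean duplicateContextThrows() throws IOException {
        // Port 0 asks the OS for any free port, so the probe does not clash with port 9600.
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.start();
        HttpHandler handler = exchange -> exchange.close();
        try {
            server.createContext("/metrics", handler);
            server.createContext("/metrics", handler); // same path again
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Java runtime: " + Runtime.version());
        System.out.println("duplicate createContext throws: " + duplicateContextThrows());
    }
}
```

Running this with the java binary inside the container shows which behavior the bundled JRE has.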

ronnybremer commented 4 months ago

Same issue in OpenSearch 2.13 with Docker Compose.

tophercullen commented 4 months ago

@ronnybremer @borutlukic As remediation for the JDK 21 issue with opensearch, we downgraded the JDK version used in the docker image to 17. Performance Analyzer now appears to work....?

Example Dockerfile

FROM docker.io/opensearchproject/opensearch:2.12.0

USER 0
RUN dnf upgrade -y --refresh && dnf install -y java-17-amazon-corretto
USER 1000
ENV JAVA_HOME=/usr

dblock commented 2 months ago

Catch All Triage - 1 2 3 4 5 6

kranthikirang commented 2 weeks ago

This seems to be an issue in 2.16.0 as well. Are there any workarounds, or when will a fix be available in the OSS stream? Appreciate your help on this.

rishabh6788 commented 6 days ago

Having the exact same issue in OpenSearch 2.15; resolved by switching Java to JDK 17.