OpenLiberty / open-liberty

Open Liberty is a highly composable, fast to start, dynamic application server runtime environment
https://openliberty.io
Eclipse Public License 2.0

Jaeger pod showing significant memory increase when executing app load against AcmeAir using mpTelemetry-1.1, microProfile-4.1 and webProfile-8.0 features. #26977

hanczaryk opened this issue 1 year ago

hanczaryk commented 1 year ago


Describe the bug
SVT is testing a 24 hour app load stress run against AcmeAir using mpTelemetry-1.1, microProfile-4.1 and webProfile-8.0 features.

I'm testing with a recent daily Open Liberty build (wlp-1.0.84.cl231220231115-1101), in which mpTelemetry-1.1 now tolerates older MP and EE versions.

The jaeger pod's memory grows steadily until it exhausts all memory available on the node. To combat this, I edit the jaeger deployment to specify an 8Gi memory limit to keep the entire node from going down:

        resources:
          limits:
            cpu: 768m
            memory: 8Gi
          requests:
            cpu: 768m
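
For reference, the same limit can be applied in one step with oc set resources (a sketch; the jaeger-all-in-one-inmemory deployment name is taken from the Jaeger instance name used later in these steps and may differ in your cluster):

oc -n acmeair set resources deployment jaeger-all-in-one-inmemory \
  --limits=cpu=768m,memory=8Gi --requests=cpu=768m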

With this 8Gi memory limit set, the jaeger pod exceeds the limit in just under an hour of AcmeAir app load.

Here is a screenshot from the OCP console showing the Metrics for my jaeger pod, with significant memory growth over a 30 minute timeframe.

[screenshot: OCP console metrics showing jaeger pod memory growth over 30 minutes]

Steps to Reproduce
On an OpenShift cluster:

The following instructions work for MicroProfile with mpTelemetry.

Deploy OpenTelemetry Collector Operator

To deploy the OpenTelemetry Collector Operator (optional), use the OpenShift console to install the 'Community OpenTelemetry Operator' with its defaults. The following is a screenshot.

[screenshot: installing the Community OpenTelemetry Operator from the OpenShift console]

After the operator is installed, create instances for OpenTelemetry Instrumentation and OpenTelemetry Collector.
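
If you do create a Collector instance, the following is a minimal sketch of an OpenTelemetryCollector resource, assuming the operator's v1alpha1 CRD; the name, namespace, and Jaeger endpoint are illustrative and need to match your environment:

apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: otel
  namespace: acmeair
spec:
  mode: deployment
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    exporters:
      otlp:
        # forward received spans to the Jaeger collector's OTLP gRPC port
        endpoint: jaeger-all-in-one-inmemory-collector:4317
        tls:
          insecure: true
    service:
      pipelines:
        traces:
          receivers: [otlp]
          exporters: [otlp]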

Deploy AcmeAir microservice applications running mpTelemetry-1.1 on Liberty

The 5 AcmeAir microservice repos are located at https://github.com/blueperf/ (choose the microprofile-4.1 branch).

Follow the instructions in the README.md at https://github.com/blueperf/acmeair-mainservice-java to install AcmeAir on OpenShift. To enable verbose GC for this stress test, edit the jvm.options for each of the 5 microservices to include your desired values, such as:

[XXXX logs]# cat ../acmeair-authservice-java/src/main/liberty/config/jvm.options 
# HTTP client keep-alive and per-host connection cap
-Dhttp.keepalive=true
-Dhttp.maxConnections=100
# OpenJ9 options: verbose GC, the heap dump agent, aggressive performance
# settings, and verbose GC logs rotated across 5 files of 300000 GC cycles each
-verbose:gc
-Xdump:heap
-Xaggressive
-Xverbosegclog:/logs/verbosegc.%seq.log,5,300000

Deploy Jaeger

To deploy the Community Jaeger Operator, use the OpenShift console to install the operator with its defaults. The following is a screenshot.

[screenshot: installing the Community Jaeger Operator from the OpenShift console]

After the Community Jaeger Operator is installed, use the OpenShift console to create a Jaeger instance. Before creating it, switch to the 'acmeair' namespace and use 'jaeger-all-in-one-inmemory' for the name.
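
Creating the instance through the console form is equivalent to applying a Jaeger resource like this minimal sketch (assuming the operator's jaegertracing.io/v1 CRD; allInOne with in-memory storage is the operator default):

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: jaeger-all-in-one-inmemory
  namespace: acmeair
spec:
  strategy: allInOne
  storage:
    type: memory    # spans are held in the pod's memory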

Run the applications for a long period of time

Use jmeter to drive AcmeAir application load for 24 hours. While the load is running, access the Jaeger UI using the jaeger route shown in the OpenShift console, and ensure spans are being received on Jaeger for the services.
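
As a sketch, a non-GUI jmeter invocation for a 24 hour run might look like the following; the .jmx script name and the -J property names are illustrative assumptions and should be taken from the AcmeAir driver instructions:

# -n non-GUI mode, -t test plan; -J sets properties the test plan reads
jmeter -n -t acmeair.jmx \
  -JHOST=<acmeair-route-host> -JPORT=80 \
  -JTHREADS=10 -JDURATION=86400    # 86400 seconds = 24 hours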

The following instruction modifications work for MicroProfile 4.1 with mpTelemetry.

For each of the 5 microservice Dockerfiles, I set the following env info:

ENV OTEL_TRACES_EXPORTER=otlp
ENV OTEL_EXPORTER_OTLP_ENDPOINT=http://jaeger-all-in-one-inmemory-collector:4317
ENV OTEL_SERVICE_NAME=XXXservice
ENV OTEL_SDK_DISABLED=false
ENV OTEL_METRICS_EXPORTER=none
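
For reference, mpTelemetry can also pick up these settings as otel.* properties through MicroProfile Config, so an alternative to the Dockerfile ENV lines would be entries like the following in a service's META-INF/microprofile-config.properties (the same values as above):

otel.traces.exporter=otlp
otel.exporter.otlp.endpoint=http://jaeger-all-in-one-inmemory-collector:4317
otel.service.name=XXXservice
otel.sdk.disabled=false
otel.metrics.exporter=none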

Expected behavior
I expect the jaeger pod memory consumption to plateau and stabilize at some point.

Diagnostic information:

I ran previous AcmeAir app load tests on earlier builds in October using mpTelemetry-1.1 with microProfile-6.1. While the jaeger pod showed some memory growth then, it wasn't nearly as significant as in this test using mpTelemetry-1.1 with microProfile-4.1 and webProfile-8.0.

hanczaryk commented 1 year ago

After the 24 hour run with MP 4.1 and mpTelemetry-1.1 completed, I started a new run with MP 6.1. Using this daily openliberty build from 11/15, the jaeger pod still shows significant growth, but MP 4.1 appears to show about double the growth of this recent MP 6.1 run. Here is an OCP console metrics screenshot from the recent MP 6.1 run.

[screenshot: OCP console metrics for the jaeger pod during the MP 6.1 run]

Azquelt commented 1 year ago

How are you deploying Jaeger? In the default configuration, exported spans are stored in memory, and in that case we would expect large, continual memory growth. I'm not sure if these are the right docs for the Jaeger operator you're using, but if so then we need to use the production deployment type, which stores the span data in Elasticsearch. (While the default allInOne deployment type is generally used for testing, it's not suitable for a long-running test like this, which will generate lots of trace data.)
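
For illustration, a production-strategy Jaeger resource would look roughly like this sketch, based on the Jaeger operator's documented Elasticsearch options; the name and server URL are placeholders:

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: jaeger-production
spec:
  strategy: production    # separate collector/query pods, external storage
  storage:
    type: elasticsearch
    options:
      es:
        server-urls: https://elasticsearch.example.com:9200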

It is, however, unexpected that you're getting more growth on MP 4.1 vs. MP 6.1. I would suspect that either we're generating more spans, or they have more data in them. I assume the throughput in both configurations is roughly the same? If so, then to look further into this we would need more information about the spans produced in each configuration.

Lastly, in a heavily loaded production environment, it would be common to use a sampler to export spans for only a small percentage of requests, rather than for every request. For example, to configure tracing of 1% of requests, you would set the following environment variables:

OTEL_TRACES_SAMPLER=parentbased_traceidratio
OTEL_TRACES_SAMPLER_ARG=0.01

This would result in a much lower rate of span production (and consequently lower memory growth if Jaeger is configured to store spans in memory).
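
In the Dockerfile-based setup from the reproduction steps above, these would be two additional ENV lines per microservice:

ENV OTEL_TRACES_SAMPLER=parentbased_traceidratio
ENV OTEL_TRACES_SAMPLER_ARG=0.01

with OTEL_TRACES_SAMPLER_ARG giving the fraction of new traces to sample.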

hanczaryk commented 1 year ago

In the issue description, I showed the steps I used to deploy jaeger, as detailed in the AcmeAir documentation. I can easily attempt another run using the sampler settings you pasted above.

I will look at the jaeger production deployment type, but I'm currently unaware of how to set that up, as I was just following the documented AcmeAir instructions. Based on your statements, it sounds like SVT's stress runs should be set up in this manner going forward.