grpc / grpc-java

The Java gRPC implementation. HTTP/2 based RPC
https://grpc.io/docs/languages/java/
Apache License 2.0

Netty 4.1.87 to 4.1.94 regressed memory usage #10428

Open ejona86 opened 1 year ago

ejona86 commented 1 year ago

#10401 is fixed, but we're seeing ~1.8x higher memory usage in a benchmark that compares proxyless with Envoy (look at Memory_java).

The only notable change at the time of the increase was the Netty upgrade. However, memory usage didn't recover when we downgraded to 4.1.93.Final.

We don't yet have a theory on why proxyless was impacted but grpc+Envoy was not.

ejona86 commented 1 year ago

Trying to reproduce so I can use custom builds. ~Still not there as it didn't give me memory results.~

gcloud container clusters get-credentials psm-benchmarks-performance --zone us-central1-b --project grpc-testing
(cd ../test-infra && make all-tools)
../test-infra/bin/prepare_prebuilt_workers -l java:c5c37ac51c0bd96fb8514861d45db20f383c5de2 -p gcr.io/grpc-testing/e2etest/prebuilt/$USER -t latest -r ../test-infra/containers/pre_built_workers/
tools/run_tests/performance/loadtest_config.py -l java \
  --client_channels=8 --server_threads=16 --offered_loads 10000 \
  -t ./tools/run_tests/performance/templates/loadtest_template_psm_proxyless_prebuilt_all_languages.yaml \
  -s driver_pool=drivers-ci -s driver_image= \
  -s client_pool=workers-c2-8core-ci -s server_pool=workers-c2-8core-ci \
  -s big_query_table= -s timeout_seconds=900 \
  -s prebuilt_image_prefix=gcr.io/grpc-testing/e2etest/prebuilt/${USER} \
  -s prebuilt_image_tag=latest \
  -s psm_image_prefix=gcr.io/grpc-testing/e2etest/runtime -s psm_image_tag=v1.5.2 \
  --category=psm --allow_client_language=java --allow_server_language=java \
  -o proxyless.yaml
../test-infra/bin/runner -i proxyless.yaml -annotation-key queue -polling-interval 5s -delete-successful-tests -c :1 -o runner/sponge_log.xml
# if the run fails, need to delete it:
kubectl delete loadtests.e2etest.grpc.io $USER-java-protobuf-async-unary-5000rpcs-1kb-psm-8channels-16threads-10000load

The driver was updated, so now it includes the memory in the results:

            "main": {
                "cpuSeconds": 8.922577823000001,
                "memoryMean": 1068322816.0
            }
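
For comparing runs, a rough sketch of pulling every `memoryMean` out of a results file and printing it in MiB. This is not part of the benchmark tooling. It assumes `memoryMean` is in bytes and that the results are plain JSON; only the `main` fragment above is known, so the helper walks the whole document rather than assuming a layout. `find_memory_means` and `to_mib` are hypothetical names.

```python
import json

MIB = 1024 * 1024  # assumption: memoryMean is reported in bytes


def find_memory_means(node, path=()):
    """Yield (path, value) for every "memoryMean" key in parsed JSON."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "memoryMean" and isinstance(value, (int, float)):
                yield path, value
            else:
                yield from find_memory_means(value, path + (key,))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_memory_means(item, path + (str(index),))


def to_mib(num_bytes):
    """Convert a byte count to MiB."""
    return num_bytes / MIB


# The fragment from the results above, wrapped in braces so it parses on its own.
sample = json.loads(
    '{"main": {"cpuSeconds": 8.922577823000001, "memoryMean": 1068322816.0}}'
)

for path, value in find_memory_means(sample):
    print("/".join(path), f"{to_mib(value):.1f} MiB")  # prints "main 1018.8 MiB"
```

Running the same walk over the results of two builds and dividing the values gives the ratio to compare against the ~1.8x seen in the dashboard.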