open-telemetry / opentelemetry-java-instrumentation

OpenTelemetry auto-instrumentation and instrumentation libraries for Java
https://opentelemetry.io
Apache License 2.0
1.98k stars 865 forks source link

Agent adds large amount of native RAM to app #1294

Closed anuraaga closed 1 year ago

anuraaga commented 4 years ago

@wangzl is looking at memory usage of the Java agent. One interesting thing we found is that enabling the agent for a toy main function causes process memory to go up by 100MB, while heap size doesn't go up. That isn't what I would expect.

@wangzl is it possible to add the code for your function here and startup commands, etc?

wangzlei commented 4 years ago

We notice this issue when adding javaagent:opentelemetry-javaagent-all.jar in AWS lambda, the stats shows memory usage increase around 70MB than before. then I did some test on my Mac and see the same symptom.

App code is pretty simple, sends http request in while loop.

public class DemoAuto {

    public static void main(String[] args) throws Exception {

        while(true)
        {
            httpCall();
            printVM();
            Thread.sleep(1000);
        }
    }

    private static void httpCall() throws Exception {
        CloseableHttpClient httpclient = HttpClients.createDefault();
        try {
            HttpGet httpget = new HttpGet("http://httpbin.org/");

            System.out.println("Executing request " + httpget.getRequestLine());

            // Create a custom response handler
            ResponseHandler<String> responseHandler = new ResponseHandler<String>() {

                @Override
                public String handleResponse(
                        final HttpResponse response) throws ClientProtocolException, IOException {
                    int status = response.getStatusLine().getStatusCode();
                    if (status >= 200 && status < 300) {
                        HttpEntity entity = response.getEntity();
                        return entity != null ? EntityUtils.toString(entity) : null;
                    } else {
                        throw new ClientProtocolException("Unexpected response status: " + status);
                    }
                }

            };
        } finally {
            httpclient.close();
        }
    }

    private static void printVM() {
        int i = (int)Runtime.getRuntime().totalMemory()/1024/1024;
        System.out.println("total memory is "+i);
        int j = (int)Runtime.getRuntime().freeMemory()/1024/1024;
        System.out.println("free memory is "+j);
        System.out.println("max memory is "+Runtime.getRuntime().maxMemory()/1024/1024);
    }

}
  1. Launch app without auto instrumentation, memory usage is 48M
    
    ~/w/OTelDemo/build/libs  java -Xmx32m -Xms32m -jar OTelDemo-1.0-SNAPSHOT.jar
    total memory is 32
    free memory is 11
    max memory is 32

jinfo 43097 VM Flags: -XX:CICompilerCount=3 -XX:ConcGCThreads=1 -XX:G1ConcRefinementThreads=4 -XX:G1HeapRegionSize=1048576 -XX:GCDrainStackTargetSize=64 -XX:InitialHeapSize=33554432 -XX:MarkStackSize=4194304 -XX:MaxHeapSize=33554432 -XX:MaxNewSize=19922944 -XX:MinHeapDeltaBytes=1048576 -XX:NonNMethodCodeHeapSize=5830732 -XX:NonProfiledCodeHeapSize=122913754 -XX:ProfiledCodeHeapSize=122913754 -XX:ReservedCodeCacheSize=251658240 -XX:+SegmentedCodeCache -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseFastUnorderedTimeStamps -XX:+UseG1GC

VM Arguments: jvm_args: -Xmx32m -Xms32m java_command: OTelDemo-1.0-SNAPSHOT.jar java_class_path (initial): OTelDemo-1.0-SNAPSHOT.jar Launcher Type: SUN_STANDARD

PID COMMAND %CPU TIME #TH #WQ #PORTS MEM PURG CMPRS PGRP PPID STATE BOOSTS %CPU_ME %CPU_OTHRS UID FAULTS COW MSGSENT MSGRECV SYSBSD SYSMACH 43097 java 0.1 00:01.68 21 1 100 48M+


2. Launch app with auto instrumentation

~/w/OTelDemo/build/libs  java -Xmx32m -Xms32m -javaagent:opentelemetry-javaagent-all.jar -jar OTelDemo-1.0-SNAPSHOT.jar

jinfo 43501 VM Flags: -XX:CICompilerCount=3 -XX:ConcGCThreads=1 -XX:G1ConcRefinementThreads=4 -XX:G1HeapRegionSize=1048576 -XX:GCDrainStackTargetSize=64 -XX:InitialHeapSize=33554432 -XX:MarkStackSize=4194304 -XX:MaxHeapSize=33554432 -XX:MaxNewSize=19922944 -XX:MinHeapDeltaBytes=1048576 -XX:NonNMethodCodeHeapSize=5830732 -XX:NonProfiledCodeHeapSize=122913754 -XX:ProfiledCodeHeapSize=122913754 -XX:ReservedCodeCacheSize=251658240 -XX:+SegmentedCodeCache -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseFastUnorderedTimeStamps -XX:+UseG1GC

VM Arguments: jvm_args: -Xmx32m -Xms32m -javaagent:opentelemetry-javaagent-all.jar java_command: OTelDemo-1.0-SNAPSHOT.jar java_class_path (initial): OTelDemo-1.0-SNAPSHOT.jar Launcher Type: SUN_STANDARD

PID COMMAND %CPU TIME #TH #WQ #PORTS MEM PURG CMPRS PGRP PPID STATE BOOSTS %CPU_ME %CPU_OTHRS UID FAULTS COW MSGSENT MSGRECV SYSBSD SYSMACH 43501 java 0.6 00:14.16 28 1 119 110M+


The JVM heap usage seems ok.

total memory is 32 free memory is 16 max memory is 32



If don't set Xmx limit, the memory usage of auto instrumentation would be over 180MB, far larger than no auto instrumentation.
iNikem commented 4 years ago

@wangzlei Have you tried Native Memory Tracking to understand what caused this memory increase?

anuraaga commented 4 years ago

Ah I realized this must mostly be the span exporter taking up RAM. We use grpc-netty, and netty by default preallocates a lot of native memory

https://github.com/netty/netty/blob/4.1/buffer/src/main/java/io/netty/buffer/PooledByteBufAllocator.java

I wonder if it changes by specifying -Dio.netty.allocator.type=unpooled

https://github.com/netty/netty/blob/4.1/buffer/src/main/java/io/netty/buffer/ByteBufUtil.java#L77

Do you mind trying that?

wangzlei commented 4 years ago

Adding -Dio.netty.allocator.type=unpooled does not decrease the memory usage. I found the non-heap memory increases a lot when using instrumentation, non-heap used memory keeps over 60 MB comparing with 30 MB in no instrumentation.

Other hints: Metadata space and CodeHeap increase a lot, total classes loaded is 6500, double than no instrumentation.

bowsersenior commented 4 years ago

Could this be related to this other issue (https://github.com/open-telemetry/opentelemetry-java-instrumentation/issues/1547) where agent jar has duplicated classes?

scarytom commented 9 months ago

@trask was this issue "completed" or just closed because it was stale?

I'm interested because we believe we are seeing this "large off-heap memory allocation" issue with the java agent currently.

trask commented 9 months ago

hey @scarytom, ya, it was closed as stale. I'd recommend opening a new issue and you can reference this one if you think there's relevant info here. thx