cudothanh-Nhan opened 1 week ago
Increasing number of DirectByteBuffer objects.
That doesn't tell us much. And you only give us one data point.
Does your machine have many cores? #4317 and #5671 are about many threads. The screenshot shows details of an EpollEventLoop; we would expect there to be a cache there.
Our app runs on a machine with 48 cores. I can share the full heap dump here: https://drive.google.com/file/d/1ycFKIrlkxqTIVYuciAIw0j2RyZR4pupS/view?usp=sharing
Even granted that each EpollEventLoop is expected to have a cache, it seems like too much memory. From the heap dump, you can see that one event loop holds about 16 DirectByteBuffers in the small-subpage area, each with a capacity of 2 MB. That means each event loop occupies roughly 16 × 2 MB = 32 MB, and up to ~40 MB in practice.
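The arithmetic above can be checked with a quick sketch. The 16-buffer and 2 MB figures come from the heap dump; one event loop per core (48 loops) is an assumption based on the reported core count:

```java
public class EventLoopCacheEstimate {
    public static void main(String[] args) {
        int eventLoops = 48;                    // assumption: one event loop per core
        int buffersPerLoop = 16;                // small-subpage DirectByteBuffers seen per loop
        long bufferCapacity = 2L * 1024 * 1024; // 2 MiB each, as observed in the dump

        long perLoopBytes = buffersPerLoop * bufferCapacity;
        long totalBytes = eventLoops * perLoopBytes;

        System.out.println("Per event loop: " + (perLoopBytes >> 20) + " MiB");
        System.out.println("All event loops: " + (totalBytes >> 20) + " MiB");
    }
}
```

On these assumptions the per-loop figure matches the ~32 MB observed, and the fleet-wide worst case across 48 loops is about 1.5 GiB of native memory.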
Does it sound reasonable? @ejona86
I also wonder whether there is a limit on the number of DirectByteBuffers inside each subpage area.
gRPC reduces the subpage size to 2 MiB, to reduce memory. It also reduces the number of threads to the number of cores. I think what's hurting here is the number of threads. If we reduced the number of threads by half, would that get into a reasonable state, or are you hoping for even more memory usage reduction?
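Halving the event-loop thread count could be done by supplying a shared, smaller event loop group to the channel builder. This is a configuration sketch only, assuming grpc-netty on the classpath and a Linux/epoll transport; the group size, host, and port are illustrative, not from this thread:

```java
// Sketch: requires grpc-netty; not runnable standalone.
import io.grpc.ManagedChannel;
import io.grpc.netty.NettyChannelBuilder;
import io.netty.channel.epoll.EpollEventLoopGroup;
import io.netty.channel.epoll.EpollSocketChannel;

public class HalvedEventLoops {
    public static void main(String[] args) {
        // Share one smaller event loop group instead of the default one-thread-per-core.
        EpollEventLoopGroup group = new EpollEventLoopGroup(24); // half of 48 cores
        ManagedChannel channel = NettyChannelBuilder.forAddress("example.host", 443)
                .eventLoopGroup(group)
                .channelType(EpollSocketChannel.class) // required when supplying a group
                .build();
        // ... use the channel; on shutdown, close the channel and then the group.
    }
}
```

Since each event loop thread carries its own PoolThreadCache, fewer threads should roughly proportionally reduce the cached DirectByteBuffer footprint.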
I mean: while each subpage is only 2 MB, there is still potential memory pressure when there are many of them. Even if my server had only 1 core, one event loop could contain multiple subpages of 2 MB each. @ejona86
After diving deep into the Netty implementation: while the number of PoolChunk objects is stable (48 objects for 48 cores), I found many DirectByteBuffer objects referenced by PoolThreadCache (about 1,154 objects, as shown in the image below).
My gRPC client uses the default gRPC executor, which is, in turn, a cached thread pool executor. Is the native memory occupied by a DirectByteBuffer freed once its executor thread no longer exists? I think not, because I see a lot of DirectByteBuffer objects being held in PoolThreadCache.
It seems that one PoolThreadCache can contain many DirectByteBuffer objects, so if one PoolThreadCache contains 40 SmallSubPageDirectCaches, it can consume up to 2 MB × 40 = 80 MB of native memory. Am I right?
@ejona86 Hello, is there any progress on this issue?
What version of gRPC-Java are you using?
1.60.0
What is your environment?
jdk-18.0.2.1-x64 Linux 3.10.0-1160.76.1.el7.x86_64
Client initialization?
JVM properties?
/zserver/java/jdk-18.0.2.1/bin/java --add-opens=java.base/jdk.internal.misc=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED -Dio.netty.tryReflectionSetAccessible=true -Dzappname=kiki-asr-streaming-websocket -Dzappprof=production -Dzconfdir=conf -Dzconffiles=config.ini -Djzcommonx.version=LATEST -Dzicachex.version=LATEST -Dzlogconffile=log4j2.yaml -Dlog4j2.configurationFile=conf/production.log4j2.yaml -Dlog4j2.contextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector -Dlog4j2.immediateFlush=false -Djava.net.preferIPv4Stack=true -XX:+AlwaysPreTouch -XX:+UseTLAB -XX:+ResizeTLAB -XX:+PerfDisableSharedMem -Xms1G -Xmx2G -XX:+UseG1GC -XX:MaxGCPauseMillis=500 -XX:InitiatingHeapOccupancyPercent=70 -XX:ParallelGCThreads=24 -XX:ConcGCThreads=24 -XX:+ParallelRefProcEnabled -XX:-ResizePLAB -XX:G1RSetUpdatingPauseTimePercent=5 -Dspring.config.location=optional:file:./conf/production.spring.yaml -Dorg.springframework.boot.logging.LoggingSystem=none -jar /zserver/java-projects/kiki-asr-streaming-websocket/dist/kiki-asr-streaming-websocket-1.3.1.jar
What did you expect to see?
Stable number of DirectByteBuffer objects
What did you see instead?
Increasing number of DirectByteBuffer objects.
This is my OQL query listing the capacities of about 1,832 objects: ![image](https://github.com/grpc/grpc-java/assets/55862577/c7d46573-3a31-4647-8ea5-2655421a4ce5)
These are the GC-root references from a sample FastThreadLocalThread that holds a DirectByteBuffer with a capacity of about 2 MB; there are many objects like it: ![image](https://github.com/grpc/grpc-java/assets/55862577/ec6420de-c4fa-442c-8048-ac3f12cb8470)
Besides, I noticed that there are many DirectByteBuffer objects with a null cleaner. Is that an intentional implementation choice in Netty?
Steps to reproduce the bug