Performance issue: Memory leak: Docker container restarting periodically

Nagesh17 commented 3 years ago

Application: Spring Boot microservice
Deployment env: Docker container running on OpenStack Kubernetes
Heap size: 2GB

We are using the asciidoctorj library to convert an asciidoc file to a pdf file using the logic below. A scheduler thread runs every 5 minutes:

1. Check if there are any new updates in the api-documentation swagger schema.
2. If yes, then:
   i. Create a new thread for pdf-generation and submit it to executorService.
   ii. The pdf-generation thread performs 3 tasks:
       a. Create a new adoc file from a json schema, using the Swagger2Markup library.
       b. Convert this adoc file to pdf, using the asciidoctorj library.
       c. Shut down asciidoctor.
   iii. Get the byte[] of the generated pdf file and store it in an in-memory concurrent hashmap (a hashmap with max 2/3 keys).
   iv. Clean up the adoc file and the generated pdf files.
3. Wait for the pdf-generation thread to finish its execution.
4. Shut down executorService.

The heap memory usage keeps increasing after each run of the pdf-generation thread. The heap dumps of the application show that there are many live "jruby related" objects. I have attached heap dump snapshots from jprofiler.

When the heap memory is full the application fails and the docker container is automatically restarted.

Sample code snippet:

    @Scheduled(fixedDelay = 300000, initialDelay = 25000)
    public void generatePDFSchedule() {
        // if there are merge events
        if (!CollectionUtils.isEmpty(mergeEventQueue)) {
            mergeEventQueue.clear();
            executorService = Executors.newSingleThreadExecutor();
            Future<?> future = executorService.submit(() -> generatePDForDomain(domain));
            try {
                future.get();
                executorService.shutdown();
            } catch (InterruptedException e) {
                e.printStackTrace();
            } catch (ExecutionException e) {
                e.printStackTrace();
            }
        }
    }

    private void generatePDForDomain(String domain) {
        if (!StringUtils.isEmpty(domain)) {
            try {
                generateAdocFiles(domain);
                generatePDFFile(domain);
            } catch (IOException e) {
                e.printStackTrace();
            } finally {
                cleanUpFiles(domain);
            }
        }
    }

    private void generatePDFFile(String domain) throws IOException {
        Map<String, Object> options = OptionsBuilder.options()
                .inPlace(true)
                .backend("pdf")
                .safe(SafeMode.UNSAFE)
                .asMap();
        Asciidoctor asciidoctor = null;
        try {
            File adocFile = new File("/tmp/adocFile.adoc");
            asciidoctor = Asciidoctor.Factory.create();
            asciidoctor.convertFile(adocFile, options);
        } finally {
            if (asciidoctor != null) {
                asciidoctor.shutdown();
            }
        }
    }

[Attached screenshots: JRuby-Related-Biggest-Objects-28071932, JRuby-Related-Object-Instances-28071932]

abelsromero commented 3 years ago

Quick thought, could be there are no memory limits and the JVM just assumes the whole node resources? Remember that if you are using Java8 you'll also need to set limits to the JVM since it does not detect the ones from k8s.

It would help if you can provide an independent reproducer, the one provided is missing methods and seems to rely on Spring.

robertpanzer commented 3 years ago

The soft references should be removed on a GC, no? Another thing, please reuse the Asciidoctor instance instead of creating a new one on each iteration. Asciidoctor and Asciidoctor-PDF are thread-safe, and initialisation of these components takes relatively a lot of time. So unless you want to render with different extensions in place please use only one Asciidoctor instance.
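A minimal sketch of the "create once, reuse on every run" pattern suggested above. To keep it runnable without extra dependencies, `Converter` and `newConverter()` are hypothetical stand-ins for the Asciidoctor instance and its factory call; they are not AsciidoctorJ types. The executor is also kept alive across runs here, which is an assumption beyond what the comment asks for.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SharedConverterSketch {

    /** Hypothetical stand-in for the Asciidoctor instance (not an AsciidoctorJ type). */
    interface Converter extends AutoCloseable {
        byte[] convert(String source);

        @Override
        void close();
    }

    /** Counts how many times the (expensive) converter is constructed. */
    static final AtomicInteger CREATED = new AtomicInteger();

    /** Hypothetical factory; in the real service this is where the instance would be created. */
    static Converter newConverter() {
        CREATED.incrementAndGet();
        return new Converter() {
            @Override
            public byte[] convert(String source) {
                // Placeholder "conversion": just return the UTF-8 bytes of the input.
                return source.getBytes(StandardCharsets.UTF_8);
            }

            @Override
            public void close() {
                // Nothing to release in this stand-in.
            }
        };
    }

    // Both are created once and reused by every scheduled run.
    private final Converter converter = newConverter();
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    /** One scheduled run: submit the conversion and wait for the result. */
    byte[] runOnce(String source) throws Exception {
        Future<byte[]> future = executor.submit(() -> converter.convert(source));
        return future.get();
    }

    /** Called once, when the application stops. */
    void shutdown() throws InterruptedException {
        executor.shutdown();
        executor.awaitTermination(10, TimeUnit.SECONDS);
        converter.close();
    }

    public static void main(String[] args) throws Exception {
        SharedConverterSketch service = new SharedConverterSketch();
        for (int i = 0; i < 3; i++) {
            service.runOnce("= Title " + i);
        }
        service.shutdown();
        System.out.println("converters created: " + CREATED.get());
    }
}
```

With this shape, the expensive initialisation happens in the constructor only, and the per-run work is limited to the conversion itself.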

robertpanzer commented 3 years ago

Also I wouldn't say that the screenshots show anything extraordinary, there is one Ruby instance, which means that previous instances were cleaned up.

robertpanzer commented 3 years ago

> Quick thought, could be there are no memory limits and the JVM just assumes the whole node resources?

Yes, very good point. Please run kubectl describe pod .... It should show if the container was OOM-killed. If yes please make sure to limit the max memory size accordingly (not the heap size). At least Java 11 has the option -XX:+UseContainerSupport to do that automatically.
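One quick way to see which limits the JVM has actually picked up is to print them from inside the container. This is only a sketch using standard `java.lang.Runtime` calls; the class name `JvmLimits` is made up for illustration.

```java
public class JvmLimits {

    /** Maximum heap the JVM will attempt to use, in bytes. */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** Number of processors the JVM believes it may use. */
    static int cpus() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        System.out.printf("max heap: %d MiB%n", maxHeapBytes() / (1024 * 1024));
        System.out.printf("processors: %d%n", cpus());
    }
}
```

If the reported values reflect the node rather than the container limits, the JVM is not honouring the container settings.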

Nagesh17 commented 3 years ago

> Quick thought, could be there are no memory limits and the JVM just assumes the whole node resources? Remember that if you are using Java8 you'll also need to set limits to the JVM since it does not detect the ones from k8s.
>
> It would help if you can provide an independent reproducer, the one provided is missing methods and seems to rely on Spring.

@abelsromero We have already set the container resource limit to 2 GB using the resources section in deployment.yaml:

    spec:
      template:
        spec:
          containers:
          - name: api-pdf-generator
            resources:
              limits:
                memory: 2048Mi
                cpu: 400m
              requests:
                memory: 2048Mi
                cpu: 400m

Also, the jvm options -Xmx, -Xms are set to 2GB.

robertpanzer commented 3 years ago

Sure, but you also need to make sure that the JVM will not request more than 2GB, otherwise it will be OOM killed.
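The point is that the heap is only part of what the JVM process requests: metaspace, the code cache, and direct buffers sit outside `-Xmx`, so a 2GB heap inside a 2GB container leaves no headroom. A rough sketch for looking at that split from inside the application, using the standard `java.lang.management` beans (the class name `MemoryBreakdown` is made up; thread stacks and other native allocations are not covered by these numbers):

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class MemoryBreakdown {

    static long toMiB(long bytes) {
        return bytes / (1024 * 1024);
    }

    /** Builds a short report of heap, non-heap, and buffer pool usage. */
    static String describe() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        MemoryUsage nonHeap = memory.getNonHeapMemoryUsage();

        StringBuilder report = new StringBuilder();
        report.append(String.format("heap committed: %d MiB%n", toMiB(heap.getCommitted())));
        report.append(String.format("non-heap committed: %d MiB%n", toMiB(nonHeap.getCommitted())));

        // Direct and mapped buffers live outside the heap as well.
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            report.append(String.format("buffer pool %s: %d MiB%n",
                    pool.getName(), toMiB(pool.getMemoryUsed())));
        }
        return report.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

Comparing the sum of these figures against the container limit gives a first idea of how much headroom is left beyond the heap.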

Nagesh17 commented 3 years ago

> The soft references should be removed on a GC, no? Another thing, please reuse the Asciidoctor instance instead of creating a new one on each iteration. Asciidoctor and Asciidoctor-PDF are thread-safe, and initialisation of these components takes relatively a lot of time. So unless you want to render with different extensions in place please use only one Asciidoctor instance.

@robertpanzer I have already tried this. I created one single instance of asciidoctor in the constructor and re-used it in each run of the pdf-generation thread. No improvement was seen.

Nagesh17 commented 3 years ago

> Also I wouldn't say that the screenshots show anything extraordinary, there is one Ruby instance, which means that previous instances were cleaned up.

@robertpanzer This is the heap dump captured after the pdf-generation thread finished its execution and no thread was actively running.

Nagesh17 commented 3 years ago

> Sure, but you also need to make sure that the JVM will not request more than 2GB, otherwise it will be OOM killed.

I have done that. Edited the above comment

robertpanzer commented 3 years ago

And why is the container killed? If there is a memory leak then there should be a large number of no longer required objects on the heap. But I don't see that in your screen dump with only one instance of org.jruby.Ruby or one instance of ThreadContext.

robertpanzer commented 3 years ago

Please provide the output of kubectl describe pod.

asciidoctor / asciidoctorj

Performance issue: Memory leak: Docker container restarting periodically #1049