balbuenac / flying-saucer

Automatically exported from code.google.com/p/flying-saucer

Improvement: flushing pdf content as it goes #263

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
I'm using the flying-saucer library to export HTML content as a PDF manual.
The problem is that for a huge number of HTML documents my server fails with an
OOM error (please see the attached memory dump analysis,
memory_consumption_5000htmlpages.png). It seems the library first "paints" the
whole PDF in memory and only then writes it to the file.

So instead of laying out the whole HTML content and rendering it afterwards, I
updated (and tested) ITextRenderer to lay out and paint page by page
(https://github.com/Mak-Sym/flyingsaucer/commit/eef611d4dd2f7780478dcfde712dfa23be44bfd4).

In my fork I also wrapped the OutputStream to track streaming operations, and it
seems content is streamed to the file after each page is added
(pdf_export_page_by_page.log: I added a DocListener to listen for onPageAdded
events and correlate them with the data being streamed to the file). But it
seems the rendered page objects still remain in memory
(memory_consumption_after_each_page_rendering.png: the Y axis is heap
consumption in bytes, the X axis is the number of the page rendered, with a GC
before each measurement).
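
The stream tracking described above can be sketched roughly as follows. This is only an illustration, not the code from the fork: the class name TrackingOutputStream and its counters are hypothetical, and it assumes that counting written bytes and flushes is enough to correlate writes with page events.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical wrapper (not taken from the fork): counts the bytes and
// flushes that pass through to the underlying stream.
class TrackingOutputStream extends FilterOutputStream {
    private long bytesWritten;
    private long flushCount;

    TrackingOutputStream(OutputStream delegate) {
        super(delegate);
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);
        bytesWritten++;
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        // Delegate the whole range at once; FilterOutputStream's default
        // implementation would write it one byte at a time.
        out.write(b, off, len);
        bytesWritten += len;
    }

    @Override
    public void flush() throws IOException {
        out.flush();
        flushCount++;
    }

    // Total number of bytes handed to the underlying stream so far.
    long getBytesWritten() {
        return bytesWritten;
    }

    // Number of flush() calls so far. Note that close() also calls flush(),
    // so closing the stream increments this counter as well.
    long getFlushCount() {
        return flushCount;
    }
}
```

Logging getBytesWritten() from a page-added callback would show whether the byte count grows after every page, which is the kind of correlation the log above is meant to show.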

Is there a way to make the laid-out and painted objects eligible for GC
somehow, so that the library doesn't consume all the heap memory on huge PDF
exports?

Thank you in advance,
-Maks.

Original issue reported on code.google.com by Maksym.F...@gmail.com on 2 Jun 2015 at 8:45

Attachments: