modesty / pdf2json

converts binary PDF to JSON and text, for server-side PDF processing and command-line use.
https://github.com/modesty/pdf2json
Other
1.97k stars 376 forks

How to free memory used by pdf2json? #113

Open zihuyishi opened 7 years ago

zihuyishi commented 7 years ago

I have a server that runs four processes to handle PDFs uploaded by clients. I found that when a client sends a 10 MB PDF file, one process uses 400 MB of memory and never frees it. After running for a long time and receiving several large PDFs, the server uses about 4 GB of memory and brings other apps down. My test code looks like this:

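const http = require('http');
const PDFParser = require('pdf2json');

// The second argument (1) enables raw text output for getRawTextContent().
const pdfParser = new PDFParser(this, 1);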

pdfParser.on('pdfParser_dataReady', pdfData => {
    console.log('--------------raw-----------------');
    console.log(pdfParser.getRawTextContent());
});

pdfParser.loadPDF('/Users/saye/Downloads/Gradle_Recipes_for_Android.pdf');
let server = http.createServer((req, res) => {
    res.write('hello');
    res.end();
});
server.listen(2345);

So what should I do to free the memory?

ldenoue commented 7 years ago

I created a new pdftojson based on xpdf, written in C. Perhaps it will work better for you? https://github.com/ldenoue/pdftojson

(Sent by email in reply to saye's post of Feb 20, 2017, 3:24 AM.)

zihuyishi commented 7 years ago

@ldenoue thank you, I will try it.

BigWolf286 commented 4 years ago

You can try calling pdfParser.destroy() to free the memory once you are done with the parsed data.
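A minimal sketch of how that could look, assuming one parser is created per uploaded file instead of one long-lived parser per process. The helper name parseAndRelease is hypothetical and not part of pdf2json. It only relies on the 'pdfParser_dataReady' and 'pdfParser_dataError' events plus loadPDF(), getRawTextContent() and destroy(). Whether this fully solves the memory growth described above is not confirmed in this thread.

```javascript
// Hypothetical helper (not part of pdf2json): parse one PDF, return its raw
// text, and always release the parser afterwards.
// `parser` is assumed to be a pdf2json PDFParser created with
// `new PDFParser(null, 1)`, or any object with the same events and methods.
function parseAndRelease(parser, pdfPath) {
    return new Promise((resolve, reject) => {
        // Release the parser's internal data whether parsing succeeds or fails.
        const release = () => {
            if (typeof parser.destroy === 'function') {
                parser.destroy();
            }
        };

        parser.once('pdfParser_dataReady', () => {
            try {
                // Read the text before destroying the parser.
                resolve(parser.getRawTextContent());
            } catch (err) {
                reject(err);
            } finally {
                release();
            }
        });

        parser.once('pdfParser_dataError', errData => {
            release();
            reject(errData);
        });

        parser.loadPDF(pdfPath);
    });
}
```

In a request handler this could be called as `const text = await parseAndRelease(new PDFParser(null, 1), uploadedPath);` where uploadedPath is a placeholder for wherever the server stores the upload. After that, no reference to the parser is kept, so the garbage collector is free to reclaim it.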