Open tinganle opened 4 years ago
600 MB isn't a lot of memory. You'll probably need to increase that.
Just to be sure, add a call to api.deallocate() right after api.End(). Let me know if that doesn't fix anything though.
> 600 MB isn't a lot of memory. You'll probably need to increase that.
Yes. The 600 MB limit came from capping the heap size at 300M on my local machine to reproduce the problem faster. When we ran the same process on Linux with more memory, it threw the same error at around 8 GB.
Also, when we monitored the Java heap size and the JVM process memory size, there was a huge difference. On my local machine, with the max heap size set to 300M, the Java heap stayed below 250M, but the JVM process could use 2 GB of memory. We do parallel OCR processing; in each thread, only one image file is being OCR-ed at a given time. If the memory were cleaned up properly, the memory usage shouldn't grow.
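To compare the two numbers side by side, a small probe along these lines could be logged from each worker thread. This is only a sketch: MemoryProbe and its methods are hypothetical names, and the resident-set reading assumes Linux, where /proc/self/status carries a VmRSS line (elsewhere it returns -1). It uses only the standard library. If JavaCPP is on the classpath, Pointer.physicalBytes() should report the figure that the error message compares against maxPhysicalBytes.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of JavaCPP or the code in this issue.
class MemoryProbe {

    /** Bytes currently used on the Java heap, as reported by the JVM. */
    static long heapUsedBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    /**
     * Resident set size of this process in bytes, or -1 if it cannot be read.
     * Assumption: Linux, where /proc/self/status has a "VmRSS: <n> kB" line.
     */
    static long residentBytes() {
        Path status = Paths.get("/proc/self/status");
        if (!Files.isReadable(status)) {
            return -1;
        }
        try {
            for (String line : Files.readAllLines(status)) {
                if (line.startsWith("VmRSS:")) {
                    return parseVmRssLine(line);
                }
            }
        } catch (IOException | RuntimeException e) {
            // Fall through: treat any read or parse problem as "unavailable".
        }
        return -1;
    }

    /** Parses a line such as "VmRSS:    2048 kB" into bytes. */
    static long parseVmRssLine(String line) {
        String[] parts = line.trim().split("\\s+");
        return Long.parseLong(parts[1]) * 1024L;
    }

    public static void main(String[] args) {
        long heap = heapUsedBytes();
        long rss = residentBytes();
        System.out.println("heap used:    " + heap / (1024 * 1024) + " MB");
        System.out.println(rss < 0
                ? "resident set: unavailable on this platform"
                : "resident set: " + rss / (1024 * 1024) + " MB");
    }
}
```

A resident set that keeps climbing while the heap figure stays flat points at native allocations rather than Java objects, which matches what we saw.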
I will try api.deallocate(). Any other ideas are much appreciated. Thanks!
Hi @saudet ,
When I debug, both output and image have a null deallocator at the following statements. Calling output.deallocate() doesn't seem to do anything. Is this the desired behavior?
output.deallocate();
pixDestroy(image);
Thanks.
Yes, those are just pointers returned from native functions, so JavaCPP doesn't know how to deallocate them.
Update: calling TessDeleteText(output) after each OCR greatly helped with the memory issue (not fully resolved yet).
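Since these pointers have to be freed by hand, one way to make sure the release calls run on every path (including when OCR throws) is to pair each handle with its release function and let try-with-resources do the rest. The sketch below is an assumption, not something JavaCPP provides: NativeResource is a hypothetical wrapper, and the demo uses plain strings as stand-ins so that it runs without Tesseract. In real code the handles would be the results of pixRead(...) and api.GetUTF8Text(), and the releasers would call pixDestroy(image) and TessDeleteText(output) as above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical wrapper, not part of JavaCPP: pairs a native handle with the
// function that frees it, so try-with-resources releases it on every exit path.
final class NativeResource<T> implements AutoCloseable {
    private final T handle;
    private final Consumer<T> releaser;
    private boolean released;

    NativeResource(T handle, Consumer<T> releaser) {
        this.handle = handle;
        this.releaser = releaser;
    }

    T get() {
        if (released) {
            throw new IllegalStateException("already released");
        }
        return handle;
    }

    @Override
    public void close() {
        // Release at most once; skip handles that were never obtained.
        if (!released) {
            released = true;
            if (handle != null) {
                releaser.accept(handle);
            }
        }
    }

    // Stand-in demo: strings play the role of the PIX image and the OCR text.
    // Real releasers would be img -> pixDestroy(img) and txt -> TessDeleteText(txt).
    static List<String> demo(boolean failDuringOcr) {
        List<String> log = new ArrayList<>();
        try (NativeResource<String> image =
                     new NativeResource<>("image", h -> log.add("pixDestroy(" + h + ")"));
             NativeResource<String> output =
                     new NativeResource<>("output", h -> log.add("TessDeleteText(" + h + ")"))) {
            if (failDuringOcr) {
                throw new IllegalStateException("simulated OCR failure");
            }
            log.add("used " + output.get());
        } catch (IllegalStateException e) {
            log.add("caught " + e.getMessage());
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo(false));
        System.out.println(demo(true));
    }
}
```

Resources are closed in reverse order of declaration, so the text is released before the image, and both releases happen before the catch block runs.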
java.lang.OutOfMemoryError: Physical memory usage is too high: physicalBytes (665M) > maxPhysicalBytes (600M)
    at org.bytedeco.javacpp.Pointer.deallocator(Pointer.java:589)
    at org.bytedeco.javacpp.Pointer.init(Pointer.java:125)
    at org.bytedeco.tesseract.TessBaseAPI.allocate(Native Method)
    at org.bytedeco.tesseract.TessBaseAPI.<init>(TessBaseAPI.java:35)
If I set the heap size bigger, it still runs into this error eventually. We follow the basic example: we create a new instance of BytedecoOcrAPI and call init() for each 'document', which consists of multiple image files, and we call doOcr() for each image file.
public class BytedecoOcrAPI implements OcrAPI {
}