Closed: sany2k8 closed this issue 3 years ago
I was able to increase the heap space with the setting below.
You need to make sure the Tika server is not already running. If it is, you can kill it from the Task Manager.
import tika.tika
tika.tika.TikaJavaArgs = '-Xmx16g'
If you run the parser now, it will run with the argument above. 16g is the amount of heap space assigned (16 GB). You can raise that number if you need additional heap space.
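Here is a minimal sketch of that approach wrapped in a helper, in case it is useful. The names `heap_flag` and `parse_with_heap` are made up for illustration. It assumes tika-python is installed, that your version exposes the module-level `tika.tika.TikaJavaArgs` setting, and that no Tika server is running yet.

```python
import importlib


def heap_flag(gigabytes: int) -> str:
    """Build a JVM max-heap flag such as '-Xmx16g'."""
    if gigabytes < 1:
        raise ValueError("heap size must be at least 1 GB")
    return f"-Xmx{gigabytes}g"


def parse_with_heap(path: str, gigabytes: int = 16) -> dict:
    """Set the JVM heap flag, then parse `path` with tika-python.

    Call this before tika-python has started its server. A server that
    is already running keeps the heap size it was launched with.
    """
    # Assumption: tika-python is installed and exposes TikaJavaArgs.
    tika_module = importlib.import_module("tika.tika")
    tika_module.TikaJavaArgs = heap_flag(gigabytes)

    # Imported here so the flag is in place before the first parser call.
    from tika import parser
    return parser.from_file(path)
```

Calling `parse_with_heap("big.pdf", 16)` should then start the server with `-Xmx16g`, provided the machine actually has that much memory to give.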
Make sure you are running 64-bit Java; otherwise the maximum heap space on a 32-bit JVM is less than 2 GB.
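One rough way to check which Java you have is to look at the `java -version` banner. The sketch below assumes a HotSpot/OpenJDK-style banner that contains the text "64-Bit" on 64-bit builds; other JVMs may word it differently. The helper names `is_64bit_banner` and `java_is_64bit` are hypothetical.

```python
import subprocess


def is_64bit_banner(version_text: str) -> bool:
    """Return True if `java -version` output mentions a 64-bit VM."""
    return "64-bit" in version_text.lower()


def java_is_64bit(java: str = "java") -> bool:
    """Run `java -version` and inspect its banner."""
    # `java -version` writes its banner to stderr rather than stdout,
    # so both streams are captured and checked together.
    result = subprocess.run([java, "-version"], capture_output=True, text=True)
    return is_64bit_banner(result.stderr + result.stdout)
```

If `java_is_64bit()` returns False, a large `-Xmx` value is unlikely to be accepted by that JVM.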
Thank you @gooseillo!
I am parsing 200 MB to 500 MB PDF files using the python-tika jar and it works, but when I try a 1.3 GB file the Tika server is not able to handle it. Looking through tika-server.log, I found this error:
java.lang.OutOfMemoryError: Java heap space
So my question is: how do I set the heap/memory size when running Tika, or is there anything I need to set from Python code to increase it?
What about this environment variable? Would it fix the issue when parsing large PDFs?
TIKA_JAVA_ARGS - set java runtime arguments, e.g, -Xmx4g
I've tried this configuration, but no luck.
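For the TIKA_JAVA_ARGS route, one thing worth checking is timing. As far as I can tell, tika-python reads the variable when `tika.tika` is first imported and passes it to the JVM only when it launches the server itself, so a value set later, or a server that is already running, would not pick it up. That is an assumption about the library's behavior, not a confirmed diagnosis of the case above. A minimal sketch, with hypothetical helper names `configure_tika_heap` and `parse_large_pdf`:

```python
import os


def configure_tika_heap(java_args: str = "-Xmx4g") -> None:
    """Export TIKA_JAVA_ARGS for this process and any child processes."""
    os.environ["TIKA_JAVA_ARGS"] = java_args


def parse_large_pdf(path: str, java_args: str = "-Xmx4g") -> dict:
    """Set the variable first, then import tika and parse `path`."""
    configure_tika_heap(java_args)

    # Assumption: tika-python is installed and has not been imported
    # earlier in this process, so it sees the variable set above.
    from tika import parser
    return parser.from_file(path)
```

Setting the variable in the shell before launching Python should have the same effect, as long as no old Tika server process is still running.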