chrismattmann / tika-python

Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.
Apache License 2.0

How much memory is assigned to the JVM when Tika parses a file using the from_buffer method? #270

Closed: sunweiconfidence closed this issue 4 years ago

sunweiconfidence commented 4 years ago

@chrismattmann I want to know how much memory to assign to the JVM when tika-python calls the from_buffer method. When I call parser.from_buffer on several large files at the same time, I get a 500 status error, but when I call parser.from_buffer on one file at a time, the file is parsed successfully. I am using tika-server.jar version 1.22. Can I adjust the memory assigned to the JVM for high-concurrency situations? Thanks.

chrismattmann commented 4 years ago

This will vary with your use case, @sunweiconfidence, but I typically use ~4 GB of RAM...

sunweiconfidence commented 4 years ago

@chrismattmann If I want to increase the JVM memory to more than 4 GB of RAM, how do I change the code to adjust it? Thanks.
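
One possible approach, offered as a minimal sketch and not as the maintainer's answer: start tika-server yourself with the standard Java `-Xmx` flag, then point tika-python at that server in client-only mode. The helper names `build_tika_server_command` and `client_only_env`, the 8g heap, and the port are hypothetical choices for illustration. `TIKA_CLIENT_ONLY` and `TIKA_SERVER_ENDPOINT` are the environment variables described in the tika-python README.

```python
def build_tika_server_command(jar_path, heap="8g", port=9998):
    """Build the argv list that launches tika-server with a larger max heap.

    -Xmx is the standard JVM option for the maximum heap size.
    The 8g default here is an illustrative assumption, not a recommendation.
    """
    return ["java", f"-Xmx{heap}", "-jar", jar_path, "--port", str(port)]


def client_only_env(port=9998):
    """Environment variables that tell tika-python to use an already
    running server instead of spawning its own JVM."""
    return {
        "TIKA_CLIENT_ONLY": "True",
        "TIKA_SERVER_ENDPOINT": f"http://localhost:{port}",
    }
```

To use it, launch the server with `subprocess.Popen(build_tika_server_command("tika-server-1.22.jar"))`, apply `os.environ.update(client_only_env())` before importing `tika`, and then call `parser.from_buffer(...)` as usual. The tika-python README also lists a `TIKA_JAVA_ARGS` environment variable (for example `-Xmx4g`) for passing JVM arguments to the server it starts itself. Whether that variable is available depends on the installed tika-python version, so treat it as an assumption to verify.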