Closed q0j0p closed 4 years ago
I have 6 workers in my multiprocessing pool:
(env1) ubuntu@ip-172-31-31-94:~$ sudo lsof -n -i :9998
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
java 25545 ubuntu 19u IPv6 2011453 0t0 TCP 127.0.0.1:9998 (LISTEN)
java 25545 ubuntu 40u IPv6 2043175 0t0 TCP 127.0.0.1:9998->127.0.0.1:53224 (ESTABLISHED)
java 25545 ubuntu 63u IPv6 2043156 0t0 TCP 127.0.0.1:9998->127.0.0.1:53204 (ESTABLISHED)
java 25545 ubuntu 137u IPv6 2043144 0t0 TCP 127.0.0.1:9998->127.0.0.1:53192 (ESTABLISHED)
java 25545 ubuntu 152u IPv6 2043168 0t0 TCP 127.0.0.1:9998->127.0.0.1:53216 (ESTABLISHED)
java 25545 ubuntu 155u IPv6 2043202 0t0 TCP 127.0.0.1:9998->127.0.0.1:53232 (ESTABLISHED)
java 25545 ubuntu 158u IPv6 2043159 0t0 TCP 127.0.0.1:9998->127.0.0.1:53208 (ESTABLISHED)
python 27693 ubuntu 20u IPv4 2043155 0t0 TCP 127.0.0.1:53204->127.0.0.1:9998 (ESTABLISHED)
python 27694 ubuntu 20u IPv4 2043189 0t0 TCP 127.0.0.1:53232->127.0.0.1:9998 (ESTABLISHED)
python 27695 ubuntu 20u IPv4 2043176 0t0 TCP 127.0.0.1:53224->127.0.0.1:9998 (ESTABLISHED)
python 27696 ubuntu 20u IPv4 2043160 0t0 TCP 127.0.0.1:53208->127.0.0.1:9998 (ESTABLISHED)
python 27697 ubuntu 20u IPv4 2043143 0t0 TCP 127.0.0.1:53192->127.0.0.1:9998 (ESTABLISHED)
python 27698 ubuntu 20u IPv4 2043167 0t0 TCP 127.0.0.1:53216->127.0.0.1:9998 (ESTABLISHED)
It looks like each worker boots its own JVM (Tika server), but they need to have unique endpoints. I'll see if I can run an initialization routine for each worker.
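For anyone following along, here is a rough sketch of what such a per-worker initialization routine could look like. It is not code from tika-python. The names `init_worker`, `worker_endpoint`, `extract_text`, and `make_pool` are made up for illustration, and it assumes a Tika server has already been started separately on each consecutive port beginning at 9998 (the port in the lsof output above). It also assumes tika-python's `parser.from_file` accepts a `serverEndpoint` argument, which is my understanding of the client but worth checking against the installed version.

```python
import multiprocessing as mp

BASE_PORT = 9998  # port seen in the lsof output; other ports are assumed

_endpoint = None  # set once per worker process by the initializer


def init_worker(counter, base_port):
    """Pool initializer: claim the next index and derive a unique endpoint."""
    global _endpoint
    with counter.get_lock():  # the lock keeps two workers from taking the same index
        index = counter.value
        counter.value += 1
    _endpoint = f"http://localhost:{base_port + index}"


def worker_endpoint(_):
    """Return the endpoint assigned to the worker that runs this task."""
    return _endpoint


def extract_text(path):
    """Parse one file against this worker's endpoint (assumes tika-python is installed)."""
    from tika import parser  # imported lazily so the rest of the sketch runs without it
    parsed = parser.from_file(path, serverEndpoint=_endpoint)
    return path, parsed.get("content")


def make_pool(n_workers, base_port=BASE_PORT):
    """Create a pool whose workers each receive a distinct port offset."""
    counter = mp.Value("i", 0)  # shared integer, handed to workers at start-up
    return mp.Pool(n_workers, initializer=init_worker, initargs=(counter, base_port))


if __name__ == "__main__":
    with make_pool(6) as pool:
        # Show which endpoints the workers ended up with.
        print(sorted(set(pool.map(worker_endpoint, range(60)))))
```

The shared counter is only one way to hand out ports. Note that the sketch assigns endpoints but does not start the servers themselves, so something else still has to launch one Tika server per port.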
This is great! If you get this working, please contribute it back. Each worker booting its own JVM is fine, up to a point. A common practice...
If you get time for a PR, please contribute it back.
@chrismattmann - I'm currently looking for multiprocessing support. Is this implemented in the current release? If so, how do I invoke it?
I was wondering if Tika can be used with multiprocessing (in my case, to scale up PDF text extraction). Would this involve starting multiple JVMs explicitly? I'd be interested in adding this functionality given some guidance. Thanks.