louismullie / treat

Natural language processing framework for Ruby.

Threading #104

Closed peddinti closed 9 years ago

peddinti commented 9 years ago

Hi,

I am processing a lot of data, so to speed things up I am using a thread pool. However, when I run it with more than one thread, I get the following error: Error parsing: can't create Java VM

I suspect this is because we cannot create two instances of the Java VM. Is there a way I can get this to work?

louismullie commented 9 years ago

You can't create two instances of the Java VM inside the same process, at least not when using Rjb. You might get some mileage from using JRuby instead. I've used Treat + RedStorm in the past with good results.

Otherwise, you can use multi-processing (create a pool of independent processes each running Treat) to achieve your desired results using MRI.
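
For illustration, here is a minimal sketch of such a process pool on MRI, using fork and one pipe per worker. The process_document method is a hypothetical stand-in for whatever Treat call you make per document; it is not part of Treat. The sketch assumes the JVM is started inside each child (for example by requiring treat there), not in the parent before forking, and that fork is available (so not Windows or JRuby).

```ruby
# Hypothetical stand-in for the per-document work. In real use this is where
# each child process would call into Treat and so start its own JVM.
def process_document(text)
  text.split.size
end

# Run process_document over `documents` in `workers` forked processes.
# Each child sends its results back over a pipe, serialized with Marshal.
def process_in_pool(documents, workers: 2)
  # Deal documents out round-robin, remembering their original positions.
  slices = Array.new(workers) { [] }
  documents.each_with_index { |doc, i| slices[i % workers] << [i, doc] }

  children = slices.reject(&:empty?).map do |slice|
    reader, writer = IO.pipe
    reader.binmode
    writer.binmode
    pid = fork do
      reader.close
      results = slice.map { |i, doc| [i, process_document(doc)] }
      writer.write(Marshal.dump(results))
      writer.close
      exit!(0) # leave without running at_exit handlers inherited from the parent
    end
    writer.close # the parent only reads from this pipe
    [pid, reader]
  end

  # Collect results from each child and put them back in input order.
  output = Array.new(documents.size)
  children.each do |pid, reader|
    Marshal.load(reader.read).each { |i, value| output[i] = value }
    reader.close
    Process.wait(pid)
  end
  output
end
```

Calling process_in_pool(texts, workers: 4) returns one result per input, in the same order as texts. Results must be serializable with Marshal, so return plain Ruby data (strings, arrays, hashes) from the worker, not objects that wrap Java references.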

peddinti commented 9 years ago

Hi, can you provide more information on how to create two instances of the Java VM?

louismullie commented 9 years ago

As stated above, this is not possible. You cannot run multiple instances of the Java VM in one process with the regular Ruby Java Bridge (Rjb). What you can do is use JRuby, in which case you will be able to leverage the Java VM threads directly (more or less). In that case, there will be no need to create multiple Java VMs to get threading. Again, https://github.com/colinsurprenant/redstorm could also suit your purposes.
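
As a rough sketch of what the threaded version could look like under JRuby, here is a small worker pool built on Ruby's standard Queue and Thread classes. The analyze method is a hypothetical placeholder for the Treat call, and whether a given Treat worker is safe to call from several threads at once is an assumption you would need to check. The same code runs on MRI, but there the threads will not execute Ruby code in parallel.

```ruby
# Hypothetical stand-in for a Treat call that goes through the single shared JVM.
def analyze(text)
  text.split.size
end

# Process `documents` with a fixed number of threads pulling from a shared queue.
def analyze_with_threads(documents, threads: 4)
  jobs = Queue.new
  documents.each_with_index { |doc, i| jobs << [i, doc] }
  jobs.close # once the queue is drained, pop returns nil and workers stop

  workers = Array.new(threads) do
    Thread.new do
      local = [] # each thread keeps its own results, so nothing is shared
      while (job = jobs.pop)
        index, doc = job
        local << [index, analyze(doc)]
      end
      local
    end
  end

  # Thread#value joins the thread and returns the block's result.
  results = Array.new(documents.size)
  workers.each { |t| t.value.each { |index, value| results[index] = value } }
  results
end
```

Each thread collects its own results and the main thread merges them afterwards, which avoids concurrent writes to a shared array.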

With regular Ruby (MRI), multi-processing is the only way to achieve what you are looking for. Please see http://stackoverflow.com/questions/56087/does-ruby-have-real-multithreading for a discussion of the different threading models in different flavors of Ruby.

I am closing this for now, as there is no pending issue to resolve with respect to the maintenance of the library. Please let me know if you have any further questions.

peddinti commented 9 years ago

Ah, I misread "can't" as "can" :) Thanks!