inpho / vsm

Vector Space Model Framework developed for InPhO
http://inpho.github.io/vsm

Update the multiprocessing versions of Tf and Beagle* to follow pattern of LdaCgsMulti #91

rrose1 closed this issue 9 years ago

JaimieMurdock commented 9 years ago

From Robert:

What I had in mind was a standard pattern for writing vsm model trainers using multiprocessing, particularly as regards management of the data environment.

But I don't think that any more development time should go into this. So yes to your question.

I'm pretty well convinced these days that parallel processing libraries should package robust data management with their processing tools. Multiprocessing (and IPython.parallel) doesn't do that beyond a barebones interface to two shared C types, Array and Value.
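For readers unfamiliar with the interface mentioned above, here is a minimal sketch of what multiprocessing's Array and Value provide. It is not code from vsm or LdaCgsMulti. The names add_chunk and run are hypothetical, and the sketch assumes the POSIX "fork" start method.

```python
import multiprocessing as mp

# Assumption: POSIX "fork" start method (not available on Windows).
ctx = mp.get_context("fork")


def add_chunk(counts, total, start, stop):
    """Hypothetical worker: bump a slice of the shared array and a shared total."""
    for i in range(start, stop):
        with counts.get_lock():      # guard the read-modify-write on the array
            counts[i] += 1
    with total.get_lock():           # guard the shared scalar
        total.value += stop - start


def run(n_items=8, n_procs=2):
    """Split n_items across n_procs workers that write into shared memory."""
    counts = ctx.Array("i", n_items)  # shared C int array, zero-initialised
    total = ctx.Value("l", 0)         # shared C long
    step = n_items // n_procs
    procs = []
    for k in range(n_procs):
        start = k * step
        # The last worker takes any remainder.
        stop = n_items if k == n_procs - 1 else start + step
        p = ctx.Process(target=add_chunk, args=(counts, total, start, stop))
        p.start()
        procs.append(p)
    for p in procs:
        p.join()
    return counts[:], total.value     # copy results out of shared memory


if __name__ == "__main__":
    print(run())
```

Everything beyond these flat, fixed-type buffers (shapes, dtypes, partitioning, locking policy) is left to the caller, which is the data management burden described above.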

h5py seems much more promising to me in this respect, although it's still dancing around the GIL and passing around pickleable data... which grievously saps performance gains. So we'll see. When I was last working on this, I was learning OpenMP and had intended just to write the whole trainer in C and Cython (no more GIL).

Googling, I see I wasn't the first to think of that: http://archive.euroscipy.org/talk/6857