bacpop / PopPUNK

PopPUNK 👨‍🎤 (POPulation Partitioning Using Nucleotide Kmers)
https://www.bacpop.org/poppunk
Apache License 2.0
89 stars 18 forks source link

Python version 3.8 and above - changes to multiprocessing #75

Closed nickjcroucher closed 4 years ago

nickjcroucher commented 4 years ago

Default for multiprocessing on Mac OSX changes to spawn from version 3.8: https://docs.python.org/3/library/multiprocessing.html. Spawn also works on Unix and Windows. We should update the code to using spawn and shared_memory wherever multiprocessing is employed. Seems to be mash.py, sketchlib.py, refine.py and utils.py. The lineage_clustering.py file on the relevant branch (e.g. f9645039f303100a3987fa845ec47f2e605914fb) already uses spawn.

nickjcroucher commented 4 years ago

Replaced sharedmem with multiprocessing in refine.py in 1fef55f805348427e03b3c5c2269505dfe38dea3 in branch https://github.com/johnlees/PopPUNK/tree/py38.

nickjcroucher commented 4 years ago

Still a problem in mash.py at least unfortunately. Here, distMat is read/write within a parallelised process, so I am not sure if multiprocessing.shared_memory will work?

johnlees commented 4 years ago

76 includes my attempt at mash.py on top of your changes, will see if I can get it to pass the tests

nickjcroucher commented 4 years ago

Looks like the distMat is returned as all zeroes following the fitKmerBlock pool run.

johnlees commented 4 years ago

Right, forgot to copy it back at the end. Should be fixed in 09a5c4ef7354c878bba69342abc0bfa647bad9bc

nickjcroucher commented 4 years ago

Afriad not, even distances_shared is/was returning zeros when printed - but printing within fitKmerBlock suggests the values in there are being edited. Just not being returned to the parent function.

johnlees commented 4 years ago

Right, it's at least getting to that test which is failing (so they're not totally useless!). Not sure why this is – by my reading of the sharedmem docs was that writing should be fine too. Was your understanding that this would be a problem?

nickjcroucher commented 4 years ago

I had a vague memory that it was read-only, but I am pretty sure that is wrong. Though it is strange that trying to access the buffer (shm_distMat.buf) directly after the pool.map call results in a segfault 11.

nickjcroucher commented 4 years ago

Abandoned memory manager for sharing of distMat and it now seems to be writeable in ddfd0e2eef2cadf1e925e938babcd0ccccdef976. I am not sure I trust the shared memory managers.

johnlees commented 4 years ago

I did a bit of investigation. memory manager is fine, it's basically just a wrapper around a SharedMemory that doesn't let you name the object (but you can still get the name out of the object). I still like it as it does the deleting for you which feels more pythonic.

The problem was what was being passed to and from the functions in my changes. If you just pass an array it will still be copied when the function is called. This works fine for reading, though doesn't at all achieve any of the purpose of shared memory, and explains why writes don't work as it's not passed by reference. So your original method was correct, and I had misunderstood the docs (which do only show passing of a sharedmemory object, not an array).

I've slightly rewritten it to pass a tuple instead of each sharedmemory parameter, and also correct my error in refine.py

nickjcroucher commented 4 years ago

Great, thanks - my attitude to the memory managers was purely emotional at this point. Latest commit to master works for my local tests now.