Closed mattwthompson closed 3 years ago
I agree with you that the use of multiprocessing should be default, probably not a need for having an additional argument for this.
I'm getting this error currently trying to run the parallel implementation
p = pool.Process(target=worker, args=(partial_dict, d,))
TypeError: Process() missing 1 required positional argument: 'ctx'
Not sure why but the parallel implementation is working for me now and is faster compared to the serial implementation. Are you okay with merging this?
Feel free to merge if you think it's fine. The issues you ran into earlier are curious, but if this works and would help your work, go for it
I get the above error on Python 3.8 but not on python 3.7
Seems to stem from a difference in how 3.8 handless processes: https://chrissardegna.com/blog/multiprocessing-changes-python-3-8/.
My approach to fix this is to check for the Python version with sys.version_info
and call pool.Process()
accordingly.
Another thing to fix is the progress bar. The specific package doesn't work well with multiprocessing so I need to look into alternatives.
The more I think about this, the more I like the option of having a parallel
argument, at least for testing purposes.
Implements #19
Actually some of the first decent-looking parallel Python code I've written
I considered a
parallel
boolean option but .... I'm not sure I see a downside to this being default? This uses what I believe are very stable functions, i.e.multiprocessing.cpu_count()
won't brick your machine