Closed celliern closed 7 years ago
Hello!
The join / close is contained in the context manager usage of the Pool (or other executor). Using
with mp.Pool() as p:
p.map(function, iterable)
is roughly equivalent to
try:
p = mp.Pool()
p.map(function, iterable)
finally:
p.join()
p.close()
It is even better than your solution with pmap, because it ensure that the pool is closed even if an exception occur between the creation and the closing of the pool. It is the same as using with open(filename) as f
instead of opening and explicitely closing the file.
See that blog post for better understanding of context managers.
For the notion of executor, it's because a lot of lib try to have the same API as the concurrent.futures
builtin library. In that case, an executor is defined as an object with a start and a close method, which can be used as a context manager and give access to the map method (among others). It will have been more explicite if I had set
executor=concurrent.futures.ProcessPoolExecutor
as default maybe?
Oh, I think about it : the Pool as context manager is a py3 stuff. But I have an implementation working with py2 (it's a very simple context manager...) if you want to keep the py2 compatibility.
Regards,
Nicolas Cellier
Okey, I have ensured the compatibility py2 / py3, and I have add a tox config with some basic test in order to be sure.
I have also made the resfile non-mendatory, as it can be useful to just return the parameters.
(Do not worry about the closing/reopening, little manipulation issue)
As I was saying earlier, I like your idea about specifying external executor. Also, I like your proposition about saving to a text file being an optional thing. Those are great points and thanks for pointing them out.
However:
1) Added lines 6-20 in blackbox.py look kind of ugly. I guess they provide py2 compatibility for the line "with executor() as e:" (btw, are you sure this issue exists at all?). Can we find more elegant solution, that involves less code and less additional libraries? I would say that what I proposed (to keep pmap() and to use it as default executor) looks better/simpler. But I think you have more knowledge on this so maybe we can find something even better. I feel like there should be a "one line of code" solution.
2) This is very simple code and I don't think there is a need to add all those extra files (which I'm sure are critical for a large project). The test file looks just like an example that I have in README, so there is no value in duplicating it. Requirements (numpy, scipy) are very standard computational libraries, I think no need to even mention them. gitignore and tox files don't seem important as well.
I will think about make that more simple. But yes, mp.Pool can not be used as a context manager. But I may try something more elegant. (if we drop py2 support, we can avoid that... But I feel we need to support python 2 as this is a small utility script).
We can remove the extra files, I used them in order to ensure the compatibility across the version. But I think that this tool is really useful and should be available in pypi. (you can see path.py for an other simple one file lib which is tested and pipable). But this is clearly up to you :) if pipable, it need more maintenance effort but the user can using it with a simple pip install blackbox, which is nice.
Thanks for the different comments. Tell me if you want that I delete extra file. I will try shorter solution for the executor. (BTW, all extra imports are built in! It do not add dependencies to blackbox :))
Envoyé par TypeApp
Le 28 août 2017 14:26, à 14:26, Paul Knysh notifications@github.com a écrit:
As I was saying earlier, I like your idea about specifying external executor. Also, I like your proposition about saving to a text file being an optional thing. Those are great points and thanks for pointing them out.
However:
1) Added lines 6-20 in blackbox.py look kind of ugly. I guess they provide py2 compatibility for the line "with executor() as e:" (btw, are you sure this issue exists at all?). Can we find more elegant solution, that involves less code and less additional libraries? I would say that what I proposed (to keep pmap() and to use it as default executor) looks better/simpler. But I think you have more knowledge on this so maybe we can find something even better. I feel like there should be a "one line of code" solution.
2) This is very simple code and I don't think there is a need to add all those extra files (which I'm sure are critical for a large project). The test file looks just like an example that I have in README, so there is no value in duplicating it. Requirements (numpy, scipy) are very standard computational libraries, I think no need to even mention them. gitignore and tox files don't seem important as well.
-- You are receiving this because you modified the open/close state. Reply to this email directly or view it on GitHub: https://github.com/paulknysh/blackbox/pull/5#issuecomment-325339465
Hey!
I have removed the extra files (after checking compatibility) and I have delegated the "ugly" lines inside a function, making these lines more explicit in a way?
Tell me if it suits you.
Nicolas
Yes, putting it all in a separate function sounds like a better solution for now, thanks! In the future I hope we can find a better one.
Did you test the code on both py2 and py3?
yeah! I kept the auxiliary file in local, so I manage to run the same toy test optimization with only scipy and numpy for py2 and py3
2017-08-28 19:29 GMT+02:00 Paul Knysh notifications@github.com:
Yes, putting it all in a separate function sounds like a better solution for now, thanks! In the future I hope we can find a better one.
Did you test the code on both py2 and py3?
— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHub https://github.com/paulknysh/blackbox/pull/5#issuecomment-325420185, or mute the thread https://github.com/notifications/unsubscribe-auth/ACjzFJNPUl7OChqqnMesnxTC5ZCwVZ2mks5scvjxgaJpZM4PDqEZ .
Cool! I'm going to double check everything and merge it soon.
Also, please add a description for a get_default_executor() function (overview + what it returns)
Lines 110-113 have wrong indentation, they should be aligned with the following if and return statements. Please fix.
Also, do we need lines 43, 44? Can we just set default value of executor "executor=get_default_executor" (line 9)?
Done and done :)
OK, I'm going to polish it further, but I'm merging it for now. Thanks again for your contribution! Feel free to close an issue that you opened some time ago (or maybe only I can do that?).
People say there is an issue with running current code on Windows. There is an error like that:
File "C:\ProgramData\Anaconda2\lib\multiprocessing\forking.py", line 258, in init RuntimeError: Attempt to start a new process before the current process has finished its bootstrapping phase. This probably means that you are on Windows and you have forgotten to use the proper idiom in the main module: if name == 'main': freeze_support() ... The "freeze_support()" line can be omitted if the program is not going to be frozen to produce a Windows executable. self._repopulate_pool()
Can you please take a look? The previous code (without executor) worked OK, so probably there is an issue in get_default_executor() function etc. I'll need to go back to previous version in case we won't find a problem.
Btw, have you tested your code on Win at all? I know it works good in Linux.
I will see it tomorrow. Proposition : rolling back to ensure full compatibility and making a new branch, merging only when tested on windows.
Envoyé par TypeApp
Le 3 sept. 2017 20:26, à 20:26, Paul Knysh notifications@github.com a écrit:
People say there is an issue with running current code on Windows. There is an error like that:
File "C:\ProgramData\Anaconda2\lib\multiprocessing\forking.py", line 258, in init RuntimeError: Attempt to start a new process before the current process has finished its bootstrapping phase.
This probably means that you are on Windows and you have forgotten to use the proper idiom in the main module: if __name__ == '__main__': freeze_support() ... The "freeze_support()" line can be omitted if the program is not going to be frozen to produce a Windows executable.
self._repopulate_pool()
Can you please take a look? The previous code (without executor) worked OK, so probably there is an issue in get_default_executor() function etc. I'll need to go back to previous version in case we won't find a problem.
Btw, have you tested your code on Win at all? I know it works good in Linux.
-- You are receiving this because you modified the open/close state. Reply to this email directly or view it on GitHub: https://github.com/paulknysh/blackbox/pull/5#issuecomment-326822236
Sounds good. I'm not a good git user yet, not sure how to roll back and create a new branch.
Oh, ok, easy fix. I'm calling mp.Pool() in the function, and in windows the Pool have to be called in the
if __name__="__main"
I will explicitely test the py version. Safer.
I didn't get what is the problem exactly. I think mp.Pool() was always called inside of blackbox.py, and never after "name == 'main':" (in the main file).
Anyways, feel free to update the code once you're sure about it. I think you have more expertise on parallelism.
Yes, but the blackbox search is called in the ifmain. But get_default executor is not. I open a new Pull Request for the fix. I have tested for py2 and py3 on windows.
One more thing. Is this necessary to have this warning?:
warnings.warn("running on python2 ...
What is the purpose of it? I think it's just confusing.
There is two way. The good practices tell us to log way more info and mute that, the user choose if it wants to display them. Or we can remove them. It's up to you.
2017-09-05 5:45 GMT+02:00 Paul Knysh notifications@github.com:
One more thing. Is this necessary to have this warning?:
warnings.warn("running on python2 ...
What is the purpose of it? I think it's just confusing.
— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHub https://github.com/paulknysh/blackbox/pull/5#issuecomment-327064032, or mute the thread https://github.com/notifications/unsubscribe-auth/ACjzFPCf_HxiTWscssaHffhr3_euTvLNks5sfMPmgaJpZM4PDqEZ .
OK. I decided to remove it.
New feature : hability to provide a custom executor (an object that behave like the mp.Pool, with a context manager behaviour and a map method.)
example using dask.distributed tool :
That snippet launch the computation on the different dask workers whose can be on different computers on the network.