rdicosmo / parmap

Parmap is a minimalistic library allowing to exploit multicore architecture for OCaml programs with minimal modifications.
http://rdicosmo.github.io/parmap/

process initialization and finalization #20

Closed UnixJunkie closed 10 years ago

UnixJunkie commented 10 years ago

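For complex things, it would be very handy to be able to register an init function and a finalize function that would be run by each worker process:

• the init function will be called only once by each child process, just after the process is created
• the finalize function will be called only once by a child process, just before it exits

This makes it possible, for example, to set up and clean up per-process output files for the workers of Array.iteri or List.iteri. Maybe those functions should be called process_setup and process_cleanup, or some better name.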
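The request can be pictured with a minimal single-process sketch. This is not Parmap code: `run_worker`, `write_item` and `read_lines` are hypothetical helpers, and no process is forked; the sketch only shows a setup hook that opens a per-worker output file once, and a cleanup hook that closes it once.

```ocaml
(* Single-process model of one worker; a real worker would be a forked child. *)
let run_worker ~setup ~cleanup f items =
  setup ();            (* once, before any item is processed *)
  List.iteri f items;
  cleanup ()           (* once, after the last item *)

(* Per-worker output channel, owned by the setup/cleanup pair. *)
let out : out_channel option ref = ref None

let process_setup path () = out := Some (open_out path)

let process_cleanup () =
  match !out with
  | Some oc -> close_out oc; out := None
  | None -> ()

(* The function applied to each item writes to the channel opened by setup. *)
let write_item i x =
  match !out with
  | Some oc -> Printf.fprintf oc "%d: %s\n" i x
  | None -> failwith "process_setup was not called"

(* Helper to inspect the file afterwards. *)
let read_lines path =
  let ic = open_in path in
  let rec loop acc =
    match input_line ic with
    | line -> loop (line :: acc)
    | exception End_of_file -> close_in ic; List.rev acc
  in
  loop []
```

With several workers, each one would call `process_setup` with its own path, so no two processes share an output channel.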

UnixJunkie commented 10 years ago

If I send a pull request for this feature, is there a chance it will be accepted?

rdicosmo commented 10 years ago

Hi Francois, sure... sorry for not being very responsive lately (paper deadlines :-))

Also, it would be nice to set up CI with Travis, so we can more easily check that a new feature does not break existing functionality.


Roberto Di Cosmo


Professeur En delegation a l'INRIA PPS E-mail: roberto@dicosmo.org Universite Paris Diderot WWW : http://www.dicosmo.org Case 7014 Tel : ++33-(0)1-57 27 92 20 5, Rue Thomas Mann
F-75205 Paris Cedex 13 Identica: http://identi.ca/rdicosmo

FRANCE. Twitter: http://twitter.com/rdicosmo

Attachments: MIME accepted, Word deprecated

http://www.gnu.org/philosophy/no-word-attachments.html

Office location:

Bureau 3020 (3rd floor) Batiment Sophie Germain Avenue de France

Metro Bibliotheque Francois Mitterrand, ligne 14/RER C

GPG fingerprint 2931 20CE 3A5A 5390 98EC 8BFC FCCA C3BE 39CB 12D3

rdicosmo commented 10 years ago

That's a good idea!

You can implement it as a pair of optional parameters, for example with the names you suggest... which then need to be added to all the different combinators...

On Sun, Apr 20, 2014 at 07:14:13PM -0700, Francois Berenger wrote:

For complex things, it would be very handy to be able to register an init function and a finalize function that would be run by each worker process:

• the init function will be called only once by each child process, just after the process is created
• the finalize function will be called only once by a child process, just before it exits

This makes it possible, for example, to set up and clean up per-process output files for the workers of Array.iteri or List.iteri. Maybe those functions should be called process_setup and process_cleanup, or some better name.


rdicosmo commented 10 years ago

@UnixJunkie: could you propose a specification for this feature? The ideal way would be a modified parmap.mli with the types you expect. Is it enough for the initialisation and finalisation functions to be of type unit -> unit, for example?

UnixJunkie commented 10 years ago

I guess unit -> unit should be OK.
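As a rough illustration of what such a specification could look like, here is a hypothetical signature fragment with both hooks as optional `unit -> unit` parameters. The names `HOOKED_ITER` and `Seq_iter` are invented for this sketch, the combinator works on plain lists, and the implementation is a sequential stand-in that only exists so the signature can be type-checked and exercised.

```ocaml
(* Hypothetical signature fragment: both hooks optional, both unit -> unit. *)
module type HOOKED_ITER = sig
  val pariter :
    ?init:(unit -> unit) ->
    ?finalize:(unit -> unit) ->
    ('a -> unit) -> 'a list -> unit
end

(* Sequential stand-in: one "worker", hooks default to doing nothing. *)
module Seq_iter : HOOKED_ITER = struct
  let pariter ?(init = fun () -> ()) ?(finalize = fun () -> ()) f xs =
    init ();
    List.iter f xs;
    finalize ()
end
```

Because both parameters are optional, existing call sites such as `Seq_iter.pariter f xs` keep compiling unchanged.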

rdicosmo commented 10 years ago

I just committed a first version of Parmap that adds init and finalize parameters to the parallel combinators.

Note that init now has type int -> unit: it receives as its argument the number of the core on which the worker is running. The init function defaults to the 'redirect' function that is part of the original API.
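A small sketch of why the core number is useful to init: each worker can derive a distinct resource name from it. Everything here is a model under stated assumptions, not Parmap's implementation: `run_workers` runs the "workers" one after another instead of forking, the round-robin split of items is arbitrary, and core numbers are assumed to start at 0.

```ocaml
(* Sequential model: worker [core] handles every item whose position is
   congruent to [core] modulo [ncores]. A real implementation forks here. *)
let run_workers ~ncores ~init ~finalize f items =
  for core = 0 to ncores - 1 do
    init core;
    List.iteri (fun i x -> if i mod ncores = core then f x) items;
    finalize ()
  done

(* State that would be private to each forked worker. *)
let current_core = ref (-1)
let init core = current_core := core        (* int -> unit *)
let finalize () = current_core := (-1)      (* unit -> unit *)

(* The index lets each worker derive a distinct output name. *)
let output_name () = Printf.sprintf "worker_%d.out" !current_core

(* Record which output each item would be written to. *)
let assignments : (string * int) list ref = ref []
let work x = assignments := (output_name (), x) :: !assignments
```

Calling `run_workers ~ncores:2 ~init ~finalize work [10; 20; 30]` assigns 10 and 30 to `worker_0.out` and 20 to `worker_1.out` in this model.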

The documentation is not updated yet to reflect the change.

I would appreciate feedback and testing of this new feature


UnixJunkie commented 10 years ago

I think this int is not even needed if the parent process maintains a count of its children and increments it only after each successful fork.

I feel the init function should default to doing nothing because this would mimic the current behavior and also reflect the default finalize function, also doing nothing by default.

rdicosmo commented 10 years ago

Actually, in the current version of Parmap, the initialisation phase calls redirect (in the code, init i just replaced the redirect i call), and redirect is controlled by a boolean value: if set it redirects stdout/stderr, otherwise does nothing.


rdicosmo commented 10 years ago

Since init is now available, we can of course change this behaviour, and have no default initialisation at all. In this case, the user that wants to get a redirection needs to explicitly call the redirect function as part of the initialisation. Seems cleaner indeed, I'll give it a try.

We need to have init of type int -> unit anyway, as there is no way of knowing the index of the core on which the process is running otherwise.
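The design described here (no default initialisation, redirection requested explicitly from init) can be sketched as follows. The `redirect` below is a stand-in that only records what it would do, not the library's helper, and `start_worker`, `record` and `my_init` are hypothetical names for this illustration.

```ocaml
(* Log of the actions the hooks would perform. *)
let actions : string list ref = ref []
let record fmt = Printf.ksprintf (fun s -> actions := s :: !actions) fmt

(* Stand-in for a redirect helper: it only records what it would do. *)
let redirect core = record "redirect stdout/stderr of worker %d" core

(* Worker start-up with a do-nothing default for init. *)
let start_worker ?(init = fun (_ : int) -> ()) core = init core

(* A caller who wants redirection composes it with their own setup;
   the core index is what makes both steps worker-specific. *)
let my_init core =
  redirect core;
  record "open scratch file for worker %d" core
```

With this shape, `start_worker 0` performs no initialisation at all, while `start_worker ~init:my_init 1` redirects first and then runs the user's own setup.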


UnixJunkie commented 10 years ago

I looked at the code quickly. Maybe Pervasives.at_exit could have been used instead of explicitly calling finalize before each exit call. That would call the finalize function even in case of an uncaught exception. But I am not sure that's super useful. I will test those new functions very soon and report on my tests; thanks a lot for the implementation.
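The difference between the two styles can be sketched in a few lines. This is an illustration, not the code that was committed: `run_explicit` and `run_with_at_exit` are hypothetical names, and `finalize` just counts its calls. In current OCaml releases the function is reachable as `Stdlib.at_exit` (or simply `at_exit`), since `Pervasives` has been replaced by `Stdlib`.

```ocaml
(* Counts how many times the hook has run. *)
let finalized = ref 0
let finalize () = incr finalized

(* Explicit style: finalize is called by hand on the normal path only, so an
   exception escaping [work] skips it. *)
let run_explicit work =
  work ();
  finalize ()

(* at_exit style: register once when the worker starts; the runtime then
   calls finalize at program termination, whether through [exit] or an
   uncaught exception. Nothing runs at registration time. *)
let run_with_at_exit work =
  at_exit finalize;
  work ()
```

One point to keep in mind with this approach: a forked child also inherits any handlers the parent registered with `at_exit` before the fork, so those would run in the child as well when it exits through `exit`.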

rdicosmo commented 10 years ago

On Mon, May 12, 2014 at 12:42:09AM -0700, Francois Berenger wrote:

I looked at the code quickly. Maybe Pervasives.at_exit could have been used instead of explicitly calling finalize before each exit call. That would call the finalize function even in case of an uncaught exception.

Right, thanks for the suggestion!

But I am not sure that's super useful. I will test very soon those new functions and report about my tests, thanks a lot for the implementation.


rdicosmo commented 10 years ago

All these changes are now committed in cb5a4b81ca2578c53ecb592986de7abba33e3bb7. Looking forward to feedback from field testing.

UnixJunkie commented 10 years ago

It is OK for me. I tried on my computer and checked that the init and finalize functions were called as many times as ncores. I tried from 1 to 8 cores, which is the maximum for my machine. Thanks a lot for implementing this! I will try later to see if I can exploit those functions to reach higher parallelization in a real-world application I have, but it will take me a little more time to test that. I'll report on my trials.
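A check of this kind could look like the sketch below. It is not the test that was actually run: `run_workers` is a sequential model rather than a forking implementation, and all helper names are hypothetical. The hooks append to a shared log file instead of incrementing a counter because, with real forked workers, increments made in a child's memory would not be visible to the parent.

```ocaml
(* Sequential model of [ncores] workers; a real implementation forks. *)
let run_workers ~ncores ~init ~finalize f items =
  for core = 0 to ncores - 1 do
    init core;
    List.iteri (fun i x -> if i mod ncores = core then f x) items;
    finalize ()
  done

(* Append one line to [path], creating the file if needed. *)
let append_line path line =
  let oc = open_out_gen [Open_wronly; Open_append; Open_creat] 0o600 path in
  output_string oc (line ^ "\n");
  close_out oc

(* Count the lines of [path] that are exactly [wanted]. *)
let count_lines path wanted =
  let ic = open_in path in
  let rec loop n =
    match input_line ic with
    | line -> loop (if line = wanted then n + 1 else n)
    | exception End_of_file -> close_in ic; n
  in
  loop 0

(* True when each hook ran exactly [ncores] times. *)
let hooks_called_once_per_worker ncores =
  let log = Filename.temp_file "hooks" ".log" in
  run_workers ~ncores
    ~init:(fun _core -> append_line log "init")
    ~finalize:(fun () -> append_line log "finalize")
    ignore [1; 2; 3; 4; 5; 6; 7; 8];
  let ok =
    count_lines log "init" = ncores && count_lines log "finalize" = ncores
  in
  Sys.remove log;
  ok
```

Running `hooks_called_once_per_worker` for every value from 1 to 8 mirrors the range of core counts tried above.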

rdicosmo commented 10 years ago

Great, and thanks for contributing to the documentation


rdicosmo commented 10 years ago

Closed by aa2270a3c7d827e8b8f1c27238d60b4cdee497c5