FCP-INDI / C-PAC

Configurable Pipeline for the Analysis of Connectomes
https://fcp-indi.github.io/
GNU Lesser General Public License v3.0
Allow user to specify parallel environment for SGE jobs #435

Open jpellman opened 9 years ago

jpellman commented 9 years ago

Currently, C-PAC generates SGE job files that presuppose the existence of a parallel environment named 'cpac' on the cluster. Users might use a parallel environment with a different name, and we should either allow them to specify this name or have a configuration script create the 'cpac' pe for them.
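
A minimal sketch of the first option, where the parallel environment name is a parameter rather than a hard-coded 'cpac'. This is not C-PAC's actual job generator: the function name `render_sge_job` and its arguments are hypothetical and used only for illustration. The `#$ -pe <name> <slots>` line is the standard SGE directive for requesting a parallel environment.

```python
def render_sge_job(command, pe_name="cpac", slots=4, job_name="cpac_subject"):
    """Return the text of an SGE job script.

    pe_name  -- name of the parallel environment to request (hypothetical
                parameter; defaults to the currently assumed 'cpac')
    slots    -- number of slots (cores) requested for the job
    """
    if slots < 1:
        raise ValueError("slots must be at least 1")
    lines = [
        "#!/bin/bash",
        "#$ -N {}".format(job_name),             # job name
        "#$ -S /bin/bash",                       # shell used to run the job
        "#$ -V",                                 # export the submitting environment
        "#$ -pe {} {}".format(pe_name, slots),   # parallel environment and slot count
        command,
    ]
    return "\n".join(lines) + "\n"
```

For example, `render_sge_job("echo run", pe_name="smp", slots=8)` would produce a script containing `#$ -pe smp 8` instead of the fixed `#$ -pe cpac ...` line.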

ccraddock commented 9 years ago

I would prefer the latter. C-PAC requires that all of the processors made available to a job be on the same node, which requires a special setting in the parallel environment. Since this is not the default behavior, and some systems may not be configured this way, I would rather we use a "special" parallel environment that is configured specifically for C-PAC's use.

-cc

On Fri, Jan 30, 2015 at 1:40 PM, John Pellman notifications@github.com wrote:

Currently, C-PAC generates SGE job files that presuppose the existence of a parallel environment named 'cpac' on the cluster. Users might use a parallel environment with a different name, and we should either allow them to specify this name or have a configuration script create the 'cpac' pe for them.

Director of Imaging, Center for the Developing Brain, Child Mind Institute http://childmind.org Research Scientist VI, Nathan S. Kline Institute for Psychiatric Research http://www.rfmh.org/nki/ Co-Founder, The Neuro Bureau http://neurobureau.org Impact Story https://impactstory.org/CameronCraddock Google Scholar Citations http://scholar.google.com/citations?user=lZ9hwSsAAAAJ&hl=en GitHub https://github.com/ccraddock Share that Brain @ the International Neuroimaging Datasharing Initiative http://fcon_1000.projects.nitrc.org/! Join me 2014-10-18 @ Brainhack EDT http://brainhack.org/brainhack-edt/!

ccraddock commented 9 years ago

BTW, Danny should already have something like this that he uses for AWS.

On Fri, Jan 30, 2015 at 3:06 PM, Cameron Craddock < cameron.craddock@gmail.com> wrote:

I would prefer the latter. C-PAC requires that all of the processors made available to a job be on the same node, which requires a special setting in the parallel environment. Since this is not the default behavior, and some systems may not be configured this way, I would rather we use a "special" parallel environment that is configured specifically for C-PAC's use.

pintohutch commented 9 years ago

I have the scripts to do this. It's relatively simple, and a good idea. The typical parallel environment we recommend for C-PAC users is set up such that any one SGE job is guaranteed never to be spread across multiple nodes (in our case, each subject is guaranteed to run entirely on a single node). That's not to say that the job is the only job running on that node: if the node has 8 cores and we allocate 4 cores per subject, the parallel environment will actually launch two jobs (subjects) on that node.
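
A rough sketch of what such a setup script could generate (this is not the script referred to above, and the helper names `render_pe_config` and `subjects_per_node` are hypothetical). The key setting is `allocation_rule $pe_slots`, which tells Grid Engine to place all of a job's slots on a single host. The other field values are assumptions that would need adjusting per cluster. On a typical install the rendered text could be saved to a file, loaded with `qconf -Ap <file>`, and the PE then attached to a queue's `pe_list`.

```python
def render_pe_config(pe_name="cpac", max_slots=999):
    """Return SGE parallel environment definition text for a single-node PE."""
    fields = [
        ("pe_name", pe_name),
        ("slots", str(max_slots)),          # total slots the PE may hand out (assumed value)
        ("user_lists", "NONE"),
        ("xuser_lists", "NONE"),
        ("start_proc_args", "NONE"),
        ("stop_proc_args", "NONE"),
        ("allocation_rule", "$pe_slots"),   # all slots of a job on one host
        ("control_slaves", "FALSE"),
        ("job_is_first_task", "TRUE"),
        ("urgency_slots", "min"),
        ("accounting_summary", "FALSE"),
    ]
    # Left-align attribute names in a fixed-width column, one attribute per line.
    return "\n".join("{:<20}{}".format(key, value) for key, value in fields) + "\n"


def subjects_per_node(cores_per_node, slots_per_subject):
    """How many whole subjects fit on one node at the given slots per subject."""
    if slots_per_subject < 1:
        raise ValueError("slots_per_subject must be at least 1")
    return cores_per_node // slots_per_subject
```

With the numbers from the comment above, `subjects_per_node(8, 4)` gives 2: two 4-slot subjects can share one 8-core node, but no subject is ever split across nodes.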