Closed GoogleCodeExporter closed 9 years ago
Hi, the patch should apply to pdsh-2.22, and after ./bootstrap, it seems to
work.
[c3-slaba@beda-s1 pdsh-2.22]$ patch -p1 < ../with-torque.patch
patching file config/ac_torque.m4
patching file configure.ac
patching file doc/pdsh.1.in
patching file pdsh.spec
patching file src/modules/Makefile.am
patching file src/modules/torque.c
I've done some initial testing on el4+torque-2.1 and el5+torque-2.3. The pdsh RPMs
were created with "rpmbuild --with torque ...".
A few open questions come to mind:
* In Makefile.am, would it perhaps be more appropriate to explicitly add an
AM_CPPFLAGS at the beginning of the file, and then change the "AM_CPPFLAGS ="
into "AM_CPPFLAGS +="?
* mod-slurm and mod-torque share the -j option. Should there perhaps be some
mechanism to make mod-slurm and mod-torque mutually exclusive, or at least make
the behavior deterministic if you are using slurm and torque simultaneously?
Currently, both modules have the same priority.
* I dropped the "-j all" special option in mod-torque, mainly because I
didn't find it useful myself.
* With the current version, jobids picked up from the PBS_JOBID environment
variable are never sanitized, but rather just sent directly to libtorque.
* Queued jobs that are not yet running (i.e., jobs with no reserved compute
nodes) are silently ignored.
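As an illustration of the PBS_JOBID point above, here is a minimal sketch of what a sanity check could look like before the string is handed to libtorque. The function name jobid_looks_sane, the length limit, and the accepted character set are all assumptions made for this example; they are not taken from the patch or from the Torque documentation.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical upper bound on job ID length; not a Torque constant. */
#define JOBID_MAX_LEN 256

/*
 * Return 1 if `id` looks like a plausible job ID, 0 otherwise.
 * Assumed shape (an illustration only): a leading digit followed by
 * alphanumerics and the characters '.', '-', '_', '[' and ']'.
 */
int jobid_looks_sane(const char *id)
{
    size_t i, len;

    if (id == NULL)
        return 0;

    len = strlen(id);
    if (len == 0 || len > JOBID_MAX_LEN)
        return 0;

    /* Job IDs are assumed to begin with a numeric sequence number. */
    if (!isdigit((unsigned char) id[0]))
        return 0;

    for (i = 1; i < len; i++) {
        unsigned char c = (unsigned char) id[i];
        if (!isalnum(c) && strchr(".-_[]", c) == NULL)
            return 0;
    }
    return 1;
}
```

A caller could apply this to the result of getenv("PBS_JOBID") and fall back to ignoring the variable when the check fails, rather than passing an arbitrary string on.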
Original comment by don.fanucci
on 10 Sep 2010 at 7:18
Oh, and torque.c:482 should be converted from an fprintf(stderr, ...) call into
an errx() call, I guess.
Original comment by don.fanucci
on 10 Sep 2010 at 7:27
Thanks, I will take a look at your patch today.
In reference to your queries:
1. I'll check out AM_CPPFLAGS in Makefile.am. I haven't looked at that file in a long time.
2. Whenever any two modules supply the same option to pdsh,
the first one loaded "wins" and the second fails.
As of pdsh-2.22, modules are loaded in strcmp() order,
so the 'slurm' module will always be loaded if both the slurm
and torque modules are present. As of pdsh-2.21, there is now
a '-M module' option that will force load one module over the
other, so you would have to do 'pdsh -M torque -j JOBID ...'
Ideally, you would only install the slurm module on slurm clusters
and the torque/pbs module on Torque clusters... but in the future
pdsh will hopefully have support for a config file that could automatically
determine which module to load.
3. (-j all option dropped) That is fine. I added -j all to the slurm module
because there were many cases where it was useful, e.g. running a command
on all nodes running slurm jobs looking for a particular problem, etc.
If someone requests -j all, I'm sure it will be easy to add.
Your last two comments from Comment 1 seem fine to me...
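The load-order rule from point 2 can be sketched roughly as follows. This is only an illustration of "sort by strcmp(), first provider of an option wins"; struct mod, cmp_mods and option_owner are hypothetical names for this example and do not correspond to pdsh's actual module loader.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical module descriptor: a name and the one option it provides. */
struct mod {
    const char *name;
    char opt;
};

/* qsort() comparator: order modules by name, as strcmp() would. */
int cmp_mods(const void *a, const void *b)
{
    const struct mod *x = a;
    const struct mod *y = b;
    return strcmp(x->name, y->name);
}

/*
 * Sort the modules by name, then return the name of the first module
 * that provides option `opt`. Later modules offering the same option
 * lose the conflict. Returns NULL if no module provides the option.
 */
const char *option_owner(struct mod *mods, size_t n, char opt)
{
    size_t i;

    qsort(mods, n, sizeof mods[0], cmp_mods);
    for (i = 0; i < n; i++) {
        if (mods[i].opt == opt)
            return mods[i].name;
    }
    return NULL;
}
```

With modules named "torque" and "slurm" both offering 'j', the sketch picks "slurm", since it sorts first.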
Original comment by mark.gro...@gmail.com
on 10 Sep 2010 at 8:38
> In Makefile.am, would it perhaps be more appropriate
> to explicitly add a AM_CPPFLAGS at the beginning of the file, and then change
> the "AM_CPPFLAGS =", into "AM_CPPFLAGS +="?
Yes, after looking at this I'm not sure why AM_CPPFLAGS wasn't used in the way
you suggest. However, I think perhaps the better solution is to use per-target
CPPFLAGS, which I think are supported via target_CPPFLAGS, so e.g.
torque_la_CPPFLAGS = $(TORQUE_CPPFLAGS)
mark
Original comment by mark.gro...@gmail.com
on 10 Sep 2010 at 10:17
Ok, I've come up with the following 4 extra (minor) patches that apply on top
of your torque patch. Please give them a review and if they look ok to you, I'll
push your torque module into the trunk (with these patches).
Original comment by mark.gro...@gmail.com
on 10 Sep 2010 at 10:29
The four patches look good! (I also applied them, built a new set of RPMs, and
did a little bit of testing).
Have a nice weekend! :)
/Mattias
Original comment by don.fanucci
on 11 Sep 2010 at 8:11
This issue was closed by revision r1231.
Original comment by mark.gro...@gmail.com
on 13 Sep 2010 at 5:35
Original issue reported on code.google.com by
mark.gro...@gmail.com
on 10 Sep 2010 at 4:32