Open GoogleCodeExporter opened 9 years ago
Good catch. However, I do not have a Torque system on which to test.
The original work was done by someone else.
However, looking at the code it does seem it assumes all jobids will be
convertible
to numbers, so that step in the code is failing.
If you can point me to a reference for querying nodes in array jobs from the C
API,
I could take a stab at providing you a patch for testing. Or, if you are feeling
adventurous, you could propose a patch yourself.
Thanks
mark
Original comment by mark.gro...@gmail.com
on 21 May 2014 at 3:22
Hi,
I would be interested in this as well. I found the torque source code here: https://github.com/adaptivecomputing/torque
Usually jobids are in the for of 12345 but in arrays an index is added, so it looks like 12345[1], 12345[2], ...
Obviously the code exits when it tries to convert the string "jobid" to long int "jid" with "strtoul". So preserving the array index or use char instead of int and just regex for [0-9[]] should work, the rest is actually the same.
Best, Michael
What if you just take out the code that verifies the jobid, and let Torque do the verification?
As I said, I don't personally have access to a Torque/PBS system on which to test, and I'm not even the author of the original code, but perhaps you could test something like this?
Feel free to open a PR if you find something that works, or have other fixes. Thanks!
diff --git a/src/modules/torque.c b/src/modules/torque.c
index 4d7f6e3..2ceacd1 100644
--- a/src/modules/torque.c
+++ b/src/modules/torque.c
@@ -124,23 +124,6 @@ static int mod_torque_init (void)
}
-static int32_t str2jobid (const char *str)
-{
- char *p = NULL;
- long int jid;
-
- if (str == NULL)
- return (-1);
-
- jid = strtoul (str, &p, 10);
-
- if( *p != '\0' )
- errx ("%p: invalid jobid format \"%s\" for -j\n", str);
-
- return ((int32_t) jid);
-}
-
-
static int
torque_process_opt(opt_t *pdsh_opts, int opt, char *arg)
{
@@ -180,14 +163,8 @@ static int mod_torque_wcoll(opt_t *opt)
return 0;
}
-static void _create_fq_jobid(char *dst, const char *jobid, const char *serverna
- /*
- * Create fully qualified jobid, "<integer>.<servername>"
- */
- if ( str2jobid(jobid) < 0 ){
- *dst = '\0';
- return;
- }
+static void _create_fq_jobid(char *dst, const char *jobid, const char *serverna
+{
strncpy(dst, jobid, PBS_MAXSEQNUM);
strncat(dst, ".", 1);
strncat(dst, servername, PBS_MAXSERVERNAME);
Obviously the easiest way :)
pdsh -j 35985 hostname
n3551: n3551.tld
pdsh -j 35984[1] hostname
n3552: n3552.tld
Thanks a lot!
Complete diff (not truncated):
diff --git a/src/modules/torque.c b/src/modules/torque.c
index 4d7f6e3..8f56af0 100644
--- a/src/modules/torque.c
+++ b/src/modules/torque.c
@@ -124,23 +124,6 @@ static int mod_torque_init (void)
}
-static int32_t str2jobid (const char *str)
-{
- char *p = NULL;
- long int jid;
-
- if (str == NULL)
- return (-1);
-
- jid = strtoul (str, &p, 10);
-
- if( *p != '\0' )
- errx ("%p: invalid jobid format \"%s\" for -j\n", str);
-
- return ((int32_t) jid);
-}
-
-
static int
torque_process_opt(opt_t *pdsh_opts, int opt, char *arg)
{
@@ -180,14 +163,11 @@ static int mod_torque_wcoll(opt_t *opt)
return 0;
}
-static void _create_fq_jobid(char *dst, const char *jobid, const char *servername){
+static void _create_fq_jobid(char *dst, const char *jobid, const char *servername)
+{
/*
* Create fully qualified jobid, "<integer>.<servername>"
*/
- if ( str2jobid(jobid) < 0 ){
- *dst = '\0';
- return;
- }
strncpy(dst, jobid, PBS_MAXSEQNUM);
strncat(dst, ".", 1);
strncat(dst, servername, PBS_MAXSERVERNAME);
For completion here errors for malformed IDs:
pdsh@login1: Failed to retrieve information for job 123234.tld: (15001) Unknown Job Id Error
pdsh@login1: no remote hosts specified
pdsh@login1: Failed to retrieve information for job a123234.tld: (15020) Unknown queue
pdsh@login1: no remote hosts specified
pdsh@login1: Failed to retrieve information for job 123234a.tld: (15001) Unknown Job Id Error
pdsh@login1: no remote hosts specified
pdsh@login1: Failed to retrieve information for job 1a2b3b2b.tld: (15001) Unknown Job Id Error
Corresponding Torque jobs stats:
qstat: Unknown Job Id Error 123234.tld
qstat: illegally formed job identifier: 123234d
qstat: Error (1 - Operation not permitted)
qstat: Unknown queue destination a123234
This seems OK. Works with Moab numeric IDs and Moab job arrays as well.
Great, thanks for testing. Just to verify, should I throw this commit into master?
I would say yes. It's less broken than before AFAICT. And if the module gets a wrong ID Torque just complains. If it is OK it gets the Job stats correctly. I think there's no need to check if the ID is formatted correctly.
Thanks a lot again. Best regards, Michael
Thank you for testing!
On Wed, Aug 17, 2016 at 2:44 PM, Michael notifications@github.com wrote:
I would say yes. It's less broken than before AFAICT. And if the module gets a wrong ID Torque just complains. If it is OK it gets the Job stats correctly. I think there's no need to check if the ID is formatted correctly.
Thanks a lot again. Best regards, Michael
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/grondo/pdsh/issues/64#issuecomment-240558801, or mute the thread https://github.com/notifications/unsubscribe-auth/AAtSUijGxsxf8ROQ1pcevgww1LZ7Jr_Rks5qg4C-gaJpZM4Jm1pO .
Original issue reported on code.google.com by
msbr...@umich.edu
on 21 May 2014 at 3:16