Altiscale / pdsh

Automatically exported from code.google.com/p/pdsh
GNU General Public License v2.0

pdsh 2.24 - pdcp executes file instead of copying #12

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Perhaps I am doing something wrong, but my 2.23 version works fine. When I 
compile pdsh 2.24 and run pdcp, it executes the file instead of copying it. It 
looks as though bash is attempting to execute it. This is on RHEL 5.5 64-bit.

Compile options:

./configure --with-dshgroups --with-readline --with-machines --with-ssh

[root@galaxy bin]# pdcp -w galaxy /tmp/me /var/tmp/me
galaxy: bash: /tmp/me: Permission denied
pdcp@galaxy: galaxy: ssh exited with exit code 126
[root@galaxy bin]#
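
For readers unfamiliar with exit code 126: POSIX shells conventionally return it when a command is found but cannot be executed, which fits the "Permission denied" message above. The following is a minimal sketch, not part of pdsh, that reproduces the status locally; the helper names `run_via_shell` and `demo` are made up for illustration.

```python
import os
import shlex
import stat
import subprocess
import tempfile


def run_via_shell(path: str) -> int:
    """Ask /bin/sh to run `path` as a command and return the shell's exit status."""
    result = subprocess.run(
        ["/bin/sh", "-c", shlex.quote(path)],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


def demo() -> int:
    """Create a file without any execute bit and try to run it through the shell."""
    with tempfile.NamedTemporaryFile("w", delete=False) as handle:
        handle.write("echo hello\n")
        path = handle.name
    try:
        # Read/write for the owner only: the file exists but is not executable.
        os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
        return run_via_shell(path)
    finally:
        os.unlink(path)


if __name__ == "__main__":
    # A status of 126 means "found but not executable" (127 would mean "not found").
    print(demo())
```

So the 126 reported by ssh suggests the remote shell really was asked to run /tmp/me as a command.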

Thanks,

Sean

Original issue reported on code.google.com by iffla...@gmail.com on 4 Mar 2011 at 8:01

GoogleCodeExporter commented 9 years ago
Sorry, I should have put this in the first post: I tried both compiling from 
source and building an RPM. If I run

rpmbuild -ta pdsh-2.24.tar.bz2

here is what happens during the test phase:

ok 1 - working success
ok 2 - test runs if prerequisite is satisfied
ok 3 - tests clean up after themselves
ok 4 - tests clean up even after a failure
ok 5 - failure to clean up causes the test to fail
ok 6 - pdsh runs
ok 7 - pdsh -V works
ok 8 - pdsh -L works
ok 9 - pdsh -h works
ok 10 - rcmd/exec module is built
ok 11 - pdsh -N option works
ok 12 # skip -u option is functional (missing LONGTESTS)
ok 13 - -f sets fanout
ok 14 - -l sets remote username
ok 15 - -t sets connect timeout
ok 16 - -u sets command timeout
ok 17 - command timeout 0 by default
ok 18 - -b enables batch mode
ok 19 - pdsh -N option works
# passed all 19 test(s)
1..19
PASS: ./t0001-basic.sh
ok 1 - working xstrerrorcat
ok 2 - working pipecmd
# passed all 2 test(s)
1..2
PASS: ./t0002-internal.sh
ok 1 - hostname range expansion works
ok 2 - host range expansion does not strip leading zeros
ok 3 - host range expansion handles mixed size suffixes
ok 4 - host range expansion works with "," embedded in range
ok 5 - host range expansion works with 2 sets of brackets
ok 6 - pdsh -x option works
ok 7 - pdsh -x option works with ranges
ok 8 - pdsh -x option works with ranges (gnats:118)
ok 9 - pdsh -x option works with non-numeric suffix (gnats:120)
ok 10 - pdsh -w- reads from stdin
ok 11 - pdsh -w- can be used with other -w args
ok 12 - WCOLL environment variable works
ok 13 - ranges can be embedded in wcoll files
ok 14 - ^file works
ok 15 - -x ^file works
ok 16 - ^file works with other args
ok 17 - Multiple ^file args
ok 18 - Multiple -w^file
ok 19 - -^file excludes hosts in file
ok 20 - ^file errors out if file doesnt exist
ok 21 - host exclusion with "-" works
ok 22 - regex filtering works
ok 23 - regex exclusion works
ok 24 - regex exclusion works from -x
ok 25 - multiple -w options
# passed all 25 test(s)
1..25
PASS: ./t0003-wcoll.sh
ok 1 - PDSH_MODULE_DIR functionality
ok 2 - module A takes precedence over B
ok 3 - pdsh -M B ativates module B
ok 4 - PDSH_MISC_MODULES option works
ok 5 - -M option overrides PDSH_MISC_MODULES environment var
not ok - 6 pdsh help string correctly displays options of loaded modules
#
#               OUTPUT=$(pdsh -h 2>&1 | grep ^-a) &&
#               test_output_matches "$OUTPUT" "Module A" &&
#               OUTPUT=$(pdsh -M B -h 2>&1 | grep ^-a) &&
#               test_output_matches "$OUTPUT" "Module B"
#
not ok - 7 Loading conflicting module with -M causes error
#
#               OUTPUT=$(pdsh -MA,B 2>&1 | grep Warning)
#               test_output_matches "$OUTPUT" \
#                       "Failed to initialize requested module \"misc/B\""
#
ok 8 - Conflicting modules dont run init()
not ok - 9 Force loaded module runs init()
#
#           PDSH_MODULE_DIR=$TEST_DIRECTORY/test-modules
#           if ! pdsh -q -MB 2>&1 | grep "B: in init"; then
#                   say_color error "Error: init routine for module B not run with -M B"
#                       false
#           fi
#
ok 10 - New conflicting module does not run init() with -M
# failed 3 among 10 test(s)
1..10
FAIL: ./t0004-module-loading.sh
ok 1 - pdsh -l sets username for all hosts
ok 2 - Can set remote username via user@hosts
ok 3 - user@hosts works for a subset of hosts
ok 4 - Can set rcmd_type via rcmd_type:hosts
ok 5 - Can set rcmd_type and user via rcmd_type:user@hosts
# passed all 5 test(s)
1..5
PASS: ./t0005-rcmd_type-and-user.sh
ok 1 - Creating pdcp link to pdsh binary
ok 2 - Creating rpdcp link to pdsh binary
ok 3 - pdcp runs
ok 4 - rpdcp runs
ok 5 - pdcp -V works
ok 6 - pdcp -q works
ok 7 - -e sets remote program path
ok 8 - PDSH_REMOTE_PDCP_PATH sets remote program path
ok 9 - -f sets fanout
ok 10 - -l sets remote username
ok 11 - -t sets connect timeout
ok 12 - -u sets command timeout
ok 13 - command timeout 0 by default
not ok - 14 Have pcptest rcmd module
#
#               PDSH_MODULE_DIR=$T pdcp -L | grep -q pcptest
#
not ok - 15 pdcp basic functionality
#
#           HOSTS="host[0-10]"
#               setup_host_dirs "$HOSTS" &&
#               test_when_finished "rm -rf host* testfile" &&
#               create_random_file testfile 10 &&
#               PDSH_MODULE_DIR=$T pdcp -Rpcptest -w "$HOSTS" testfile testfile &&
#               pdsh -SRexec -w "$HOSTS" diff -q testfile %h/testfile
#
not ok - 16 rpdcp basic functionality
#
#               HOSTS="host[0-10]"
#               setup_host_dirs "$HOSTS"
#               test_when_finished "rm -rf host* t output" &&
#               pdsh -Rexec -w "$HOSTS" dd if=/dev/urandom of=%h/t bs=1024 count=10 >/dev/null 2>&1 &&
#               mkdir output &&
#               PDSH_MODULE_DIR=$T rpdcp -Rpcptest -w "$HOSTS" t output/ &&
#               pdsh -SRexec -w "$HOSTS" diff -q output/t.%h %h/t
#
ok 17 - initialize directory tree
not ok - 18 pdcp -r works
#
#               HOSTS="host[0-10]"
#               setup_host_dirs "$HOSTS" &&
#               test_when_finished "rm -rf host*" &&
#               PDSH_MODULE_DIR=$T pdcp -Rpcptest -w "$HOSTS" -r tree . &&
#               pdsh -SRexec -w "$HOSTS" diff -Nqr tree %h/tree &&
#               pdsh -SRexec -w "$HOSTS" test -x tree/baz/exec.sh &&
#               pdsh -SRexec -w "$HOSTS" test -h tree/foo.link &&
#               pdsh -SRexec -w "$HOSTS" test ! -w dir/a/b/c/xw
#
not ok - 19 rpdcp -r works
#
#               HOSTS="host[0-10]"
#               setup_host_dirs "$HOSTS" &&
#               test_when_finished "rm -rf host* output" &&
#               pdsh -SRexec -w "$HOSTS" cp -a tree %h/ &&
#               mkdir output &&
#               PDSH_MODULE_DIR=$T rpdcp -Rpcptest -w "$HOSTS" -r tree output/ &&
#               pdsh -SRexec -w "$HOSTS" diff -Nqr tree output/tree.%h
#
# failed 5 among 19 test(s)
1..19
FAIL: ./t0006-pdcp.sh
ok 1 - dshbak functionality
ok 2 - dshbak -c does not coalesce different length output
ok 3 - dshbak -c properly compresses multi-digit suffixes
ok 4 - dshbak -c properly compresses prefix with embedded numerals
ok 5 - dshbak -c does not strip leading zeros
ok 6 - dshbak -c does not coalesce different zero padding
ok 7 - dshbak -c properly coalesces zero padding of "00"
ok 8 - dshbak -c can detect suffixes
not ok 9 - dshbak -c can detect suffix with numeral # TODO known breakage
ok 10 - dshbak -d functionality
ok 11 - dshbak -f functionality
ok 12 - dshbak -f without -d fails
ok 13 - dshbak -d fails when output dir does not exist
not ok - 14 dshbak -d fails gracefully for non-writable dir
#
#         mkdir test_output &&
#         chmod 500 test_output &&
#         echo -e "foo0: bar" | dshbak -d test_output 2>&1 | tee logfile | \
#            grep "Failed to open output file"  &&
#         rm -rf test_output logfile
#
# still have 1 known breakage(s)
# failed 1 among remaining 13 test(s)
1..14
FAIL: ./t1000-dshbak.sh
# passed all 0 test(s)
1..0 # SKIP skipping genders tests, genders module not available
PASS: ./t1001-genders.sh
ok 1 - dshgroup options are active
ok 2 - dshgroup -g option works
ok 3 - dshgroup -g option works with more than one group
ok 4 - dshgroup -X option works
ok 5 - dshgroup -X option works with -w
# passed all 5 test(s)
1..5
PASS: ./t1002-dshgroup.sh
# passed all 0 test(s)
1..0 # SKIP skipping slurm tests, slurm module not available
PASS: ./t1003-slurm.sh
====================
3 of 10 tests failed
====================
make[3]: *** [check-TESTS] Error 1
make[3]: Leaving directory `/usr/src/redhat/BUILD/pdsh-2.24/tests'
make[2]: *** [check-am] Error 2
make[2]: Leaving directory `/usr/src/redhat/BUILD/pdsh-2.24/tests'
make[1]: *** [check-recursive] Error 1
make[1]: Leaving directory `/usr/src/redhat/BUILD/pdsh-2.24/tests'
make: *** [check-recursive] Error 1
error: Bad exit status from /var/tmp/rpm-tmp.29252 (%build)

RPM build errors:
    Bad exit status from /var/tmp/rpm-tmp.29252 (%build)

Original comment by iffla...@gmail.com on 4 Mar 2011 at 8:20

GoogleCodeExporter commented 9 years ago
What is the output of 

 ./pdsh -V

after you've done the ./configure ... && make ?

I am also on RHEL5.5 x86_64. I will try to download directly from the Google 
Code site and see if I can reproduce these issues.

Thanks for the report

Original comment by mark.gro...@gmail.com on 4 Mar 2011 at 10:41

GoogleCodeExporter commented 9 years ago
Here is the output of the pdsh -V

[root@galaxy tmp]# pdsh -V
pdsh-2.24 (+readline)
rcmd modules: ssh
misc modules: machines,dshgroup

Let me know what else I can do. I really like the work you do for pdsh; it 
is a great tool.

Thanks,

Sean

Original comment by iffla...@gmail.com on 4 Mar 2011 at 10:55

GoogleCodeExporter commented 9 years ago
I am not able to reproduce this issue, unfortunately.

One problem is that pdcp appears to be running as if it were pdsh
(they are the same binary; pdsh just behaves in pdcp mode when argv[0] == pdcp).
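
To illustrate the "same binary, different argv[0]" idea, here is a small sketch of name-based dispatch. It is not pdsh's actual code (pdsh is written in C); the function name `personality` is hypothetical, and the set of names is taken from the pdcp and rpdcp links the testsuite creates.

```python
import os
import sys


def personality(argv0: str) -> str:
    """Pick a mode from the name the program was invoked as.

    Only the final path component matters, so "/usr/bin/pdcp" and "pdcp"
    select the same mode.
    """
    name = os.path.basename(argv0)
    if name in ("pdcp", "rpdcp"):
        return name
    # Anything else falls back to the default remote-shell behaviour.
    return "pdsh"


if __name__ == "__main__":
    print(personality(sys.argv[0]))
```

With this kind of dispatch, a symlink or hard link named pdcp that points at the pdsh binary is enough to select copy mode, which is why `pdcp -h` printing the pdcp usage text is a useful check.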

The other problem is all the unexpectedly failing tests. I tried building an RPM
on RHEL5 and didn't see any failures in the testsuite, so my only guess is that
there is some prerequisite needed by the testsuite that isn't properly checked,
or some other assumption made by the testsuite that isn't true on your system.

Can you try doing

 (cd tests && sh -x ./t0004-module-loading.sh)

and attach the output to this issue?

Thanks!
mark

Original comment by mark.gro...@gmail.com on 4 Mar 2011 at 11:01

GoogleCodeExporter commented 9 years ago
Ok, let's tackle the pdcp issue first. Can you paste the output of:

 1. id
 2. which pdcp
 3. pdcp -h
 4. pdcp -w galaxy -q /tmp/me /var/tmp/me
 5. strace -e file pdcp -w galaxy /tmp/me /var/tmp/me

Hopefully at least one of those bits of information will give us a clue about 
why pdcp is failing to act like pdcp.

Original comment by mark.gro...@gmail.com on 4 Mar 2011 at 11:45

GoogleCodeExporter commented 9 years ago
Are you running the testsuite/rpmbuild as root? That might explain the tests
that are failing. The testsuite makes use of PDSH_MODULE_DIR in several places,
and this will not work when pdsh is run setuid or as root.

I'll add some checks to the test framework to ensure tests that need to run
non-root are skipped when the testsuite is run as root.
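
A rough sketch of the two rules described in this comment, under stated assumptions: the function names are hypothetical, the default directory is simply the one visible in the strace output in this thread, and treating "uid differs from euid" as the setuid case is an assumption rather than pdsh's confirmed logic.

```python
def effective_module_dir(env, uid, euid, default_dir="/usr/lib64/pdsh"):
    """Return the directory modules would be loaded from.

    Assumption: the PDSH_MODULE_DIR override is honoured only for an
    unprivileged, non-setuid process.
    """
    privileged = (euid == 0) or (uid != euid)
    override = env.get("PDSH_MODULE_DIR")
    if override and not privileged:
        return override
    return default_dir


def should_skip(test_needs_nonroot, euid):
    """Skip tests that rely on PDSH_MODULE_DIR when the suite runs as root."""
    return test_needs_nonroot and euid == 0


if __name__ == "__main__":
    env = {"PDSH_MODULE_DIR": "/tmp/test-modules"}
    print(effective_module_dir(env, uid=500, euid=500))  # override honoured
    print(effective_module_dir(env, uid=0, euid=0))      # override ignored
```

Under this model, every test that points PDSH_MODULE_DIR at a test-module directory would silently load the installed modules instead when run as root, which matches the pattern of failures in t0004 and t0006.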

Original comment by mark.gro...@gmail.com on 5 Mar 2011 at 12:04

GoogleCodeExporter commented 9 years ago
Sorry, I was gone all weekend. First, you are correct: I built it as my user and 
it built fine. I still get the pdcp problem. Here is the output you asked for 
above:

[root@galaxy tmp]# id
uid=0(root) gid=0(root) groups=0(root),1(bin),2(daemon),3(sys),4(adm),6(disk),10(wheel)
[root@galaxy tmp]# which pdcp
/usr/bin/pdcp
[root@galaxy tmp]# pdcp -h
Usage: pdcp [-options] src [src2...] dest
-r                recursively copy files
-p                preserve modification time and modes
-e PATH           specify the path to pdcp on the remote machine
-h                output usage menu and quit
-V                output version information and quit
-q                list the option settings and quit
-b                disable ^C status feature (batch mode)
-d                enable extra debug information from ^C status
-l user           execute remote commands as user
-t seconds        set connect timeout (default is 10 sec)
-u seconds        set command timeout (no default)
-f n              use fanout of n nodes
-w host,host,...  set target node list on command line
-x host,host,...  set node exclusion list on command line
-R name           set rcmd module to name
-M name,...       select one or more misc modules to initialize first
-N                disable hostname: labels on output lines
-L                list info on all loaded modules and exit
-g groupname      target hosts in dsh group "groupname"
-X groupname      exclude hosts in dsh group "groupname"
-a                target all nodes
available rcmd modules: ssh
[root@galaxy tmp]# pdcp -w galaxy -q /tmp/me /var/tmp/me
-- PCP-specific options --
Infile(s)               /tmp/me
Outfile                 /var/tmp/me
Recursive               No
Preserve mod time/mode  No
Full program pathname   /usr/bin/pdcp
Remote program path     /usr/bin/pdcp

-- Generic options --
Local username          root
Local uid               0
Remote username         root
Rcmd type               ssh
one ^C will kill pdsh   No
Connect timeout (secs)  10
Command timeout (secs)  0
Fanout                  32
Display hostname labels Yes
Debugging               No

-- Target nodes --
galaxy
[root@galaxy tmp]# strace -e file pdcp -w galaxy /tmp/me /var/tmp/me
execve("/usr/bin/pdcp", ["pdcp", "-w", "galaxy", "/tmp/me", "/var/tmp/me"], [/* 36 vars */]) = 0
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
open("/usr/lib64/libreadline.so.5", O_RDONLY) = 3
open("/usr/lib64/libhistory.so.5", O_RDONLY) = 3
open("/usr/lib64/libncurses.so.5", O_RDONLY) = 3
open("/lib64/libdl.so.2", O_RDONLY)     = 3
open("/lib64/libpthread.so.0", O_RDONLY) = 3
open("/lib64/libc.so.6", O_RDONLY)      = 3
open("/etc/nsswitch.conf", O_RDONLY)    = 3
open("/etc/ld.so.cache", O_RDONLY)      = 3
open("/lib64/libnss_files.so.2", O_RDONLY) = 3
open("/etc/passwd", O_RDONLY)           = 3
getcwd("/tmp"..., 4096)                 = 5
access("/usr/lib/oracle/10.2.0.4/client64/bin/pdcp", R_OK) = -1 ENOENT (No such file or directory)
access("/usr/kerberos/sbin/pdcp", R_OK) = -1 ENOENT (No such file or directory)
access("/usr/kerberos/bin/pdcp", R_OK)  = -1 ENOENT (No such file or directory)
access("/usr/local/sbin/pdcp", R_OK)    = -1 ENOENT (No such file or directory)
access("/usr/local/bin/pdcp", R_OK)     = -1 ENOENT (No such file or directory)
access("/sbin/pdcp", R_OK)              = -1 ENOENT (No such file or directory)
access("/bin/pdcp", R_OK)               = -1 ENOENT (No such file or directory)
access("/usr/sbin/pdcp", R_OK)          = -1 ENOENT (No such file or directory)
access("/usr/bin/pdcp", R_OK)           = 0
stat("/usr/bin/pdcp", {st_mode=S_IFREG|0755, st_size=161709, ...}) = 0
stat("/", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat("/usr/lib64/pdsh", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat("/usr/lib64/pdsh/..", {st_mode=S_IFDIR|0755, st_size=36864, ...}) = 0
stat("/usr/lib64/pdsh/../..", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat("/usr/lib64/pdsh/../../..", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
open("/usr/lib64/pdsh", O_RDONLY|O_NONBLOCK|O_DIRECTORY) = 3
stat("/usr/lib64/pdsh/dshgroup.so", {st_mode=S_IFREG|0755, st_size=22854, ...}) = 0
open("/usr/lib64/pdsh/dshgroup.so", O_RDONLY) = 4
stat("/usr/lib64/pdsh/..", {st_mode=S_IFDIR|0755, st_size=36864, ...}) = 0
stat("/usr/lib64/pdsh/machines.so", {st_mode=S_IFREG|0755, st_size=12735, ...}) = 0
open("/usr/lib64/pdsh/machines.so", O_RDONLY) = 4
stat("/usr/lib64/pdsh/.", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat("/usr/lib64/pdsh/sshcmd.so", {st_mode=S_IFREG|0755, st_size=21351, ...}) = 0
open("/usr/lib64/pdsh/sshcmd.so", O_RDONLY) = 4
stat("/tmp/me", {st_mode=S_IFREG|0740, st_size=27, ...}) = 0
access("/tmp/me", R_OK)                 = 0
stat("/tmp/me", {st_mode=S_IFREG|0740, st_size=27, ...}) = 0
pdcp@galaxy: galaxy: error: sparky
open("/etc/ld.so.cache", O_RDONLY)      = 3
open("/lib64/libgcc_s.so.1", O_RDONLY)  = 3
--- SIGCHLD (Child exited) @ 0 (0) ---
[root@galaxy tmp]# 

Thanks,

Sean

Original comment by iffla...@gmail.com on 7 Mar 2011 at 8:57

GoogleCodeExporter commented 9 years ago
Everything above looks good to me. I notice that you seem to be getting a 
different error when running under strace.

Can you also try

 strace -s1024 -fe execve pdcp ....

That should show us what ssh command line pdcp is running.

Thanks,
mark

Original comment by mark.gro...@gmail.com on 7 Mar 2011 at 9:58

GoogleCodeExporter commented 9 years ago
Here is the output of the strace command. I am using the default RHEL ssh 
located at /usr/bin/ssh.

[root@galaxy tmp]# strace -s1024 -fe execve pdcp -w galaxy /tmp/me /var/tmp/me
execve("/usr/bin/pdcp", ["pdcp", "-w", "galaxy", "/tmp/me", "/var/tmp/me"], [/* 36 vars */]) = 0
Process 3224 attached
Process 3225 attached (waiting for parent)
Process 3225 resumed (parent 3223 ready)
Process 3226 attached (waiting for parent)
Process 3226 resumed (parent 3223 ready)
Process 3227 attached
[pid  3227] execve("/usr/lib/oracle/10.2.0.4/client64/bin/ssh", ["ssh", "-oConnectTimeout=10", "-2", "-a", "-x", "-lroot", "galaxy", "/tmp/me", "/var/tmp/me"], [/* 37 vars */]) = -1 ENOENT (No such file or directory)
[pid  3227] execve("/usr/kerberos/sbin/ssh", ["ssh", "-oConnectTimeout=10", "-2", "-a", "-x", "-lroot", "galaxy", "/tmp/me", "/var/tmp/me"], [/* 37 vars */]) = -1 ENOENT (No such file or directory)
[pid  3227] execve("/usr/kerberos/bin/ssh", ["ssh", "-oConnectTimeout=10", "-2", "-a", "-x", "-lroot", "galaxy", "/tmp/me", "/var/tmp/me"], [/* 37 vars */]) = -1 ENOENT (No such file or directory)
[pid  3227] execve("/usr/local/sbin/ssh", ["ssh", "-oConnectTimeout=10", "-2", "-a", "-x", "-lroot", "galaxy", "/tmp/me", "/var/tmp/me"], [/* 37 vars */]) = -1 ENOENT (No such file or directory)
[pid  3227] execve("/usr/local/bin/ssh", ["ssh", "-oConnectTimeout=10", "-2", "-a", "-x", "-lroot", "galaxy", "/tmp/me", "/var/tmp/me"], [/* 37 vars */]) = -1 ENOENT (No such file or directory)
[pid  3227] execve("/sbin/ssh", ["ssh", "-oConnectTimeout=10", "-2", "-a", "-x", "-lroot", "galaxy", "/tmp/me", "/var/tmp/me"], [/* 37 vars */]) = -1 ENOENT (No such file or directory)
[pid  3227] execve("/bin/ssh", ["ssh", "-oConnectTimeout=10", "-2", "-a", "-x", "-lroot", "galaxy", "/tmp/me", "/var/tmp/me"], [/* 37 vars */]) = -1 ENOENT (No such file or directory)
[pid  3227] execve("/usr/sbin/ssh", ["ssh", "-oConnectTimeout=10", "-2", "-a", "-x", "-lroot", "galaxy", "/tmp/me", "/var/tmp/me"], [/* 37 vars */]) = -1 ENOENT (No such file or directory)
[pid  3227] execve("/usr/bin/ssh", ["ssh", "-oConnectTimeout=10", "-2", "-a", "-x", "-lroot", "galaxy", "/tmp/me", "/var/tmp/me"], [/* 37 vars */]) = 0
pdcp@galaxy: galaxy: error: sparky
Process 3227 detached
Process 3226 detached
[pid  3225] --- SIGRTMIN (Unknown signal 32) @ 0 (0) ---
[pid  3223] --- SIGCHLD (Child exited) @ 0 (0) ---
Process 3225 detached
[root@galaxy tmp]#
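
A note on reading the trace: the run of ENOENT results is just the usual PATH search, where each directory in PATH is tried in order until one execve succeeds. The sketch below models that search; `exec_candidates` is a hypothetical helper, and the PATH string in the usage example is a shortened stand-in rather than the actual value on this machine.

```python
import os


def exec_candidates(program: str, path_env: str) -> list:
    """List the paths a PATH-style search would try, in order.

    A program name containing a slash is used as-is; otherwise each PATH
    entry is tried in turn (an empty entry means the current directory).
    """
    if "/" in program:
        return [program]
    return [os.path.join(entry or ".", program) for entry in path_env.split(":")]


if __name__ == "__main__":
    for candidate in exec_candidates("ssh", "/usr/kerberos/bin:/usr/local/bin:/usr/bin"):
        print(candidate)
```

Only the last attempt matters here: /usr/bin/ssh is the binary that actually ran, and its argument list is the interesting part.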

Thanks,

Sean

Original comment by iffla...@gmail.com on 7 Mar 2011 at 10:12

GoogleCodeExporter commented 9 years ago
Ok, it appears that pdcp is running

 ssh -oConnectTimeout=10 -2 -a -x -lroot galaxy /tmp/me /var/tmp/me

which is not correct. The command line should be

 ssh -oConnectTimeout=10 -2 -a -x -lroot galaxy /usr/bin/pdcp -z /var/tmp/me

i.e. pdcp typically uses ssh to run the pdcp server (pdcp -z) on the remote
system. Do you have any PDSH environment variables set?

 env | grep PDSH
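
The difference between the two command lines can be sketched as follows. This only mirrors the ssh options and the `pdcp -z` form quoted in this comment; the helper names `ssh_argv` and `pdcp_remote_cmd` are hypothetical and this is not how pdsh builds its arguments internally.

```python
def ssh_argv(user, host, remote_cmd, connect_timeout=10):
    """Build an ssh argument vector with the options seen in the strace output."""
    return [
        "ssh",
        f"-oConnectTimeout={connect_timeout}",
        "-2", "-a", "-x",
        f"-l{user}",
        host,
        *remote_cmd,
    ]


def pdcp_remote_cmd(remote_pdcp_path, outfile):
    """Remote side of a copy: run the pdcp server (-z) with the destination path."""
    return [remote_pdcp_path, "-z", outfile]


# What the trace shows: the local source and destination passed as the remote command.
observed = ssh_argv("root", "galaxy", ["/tmp/me", "/var/tmp/me"])

# What is expected: the remote pdcp server started with the destination path.
expected = ssh_argv("root", "galaxy", pdcp_remote_cmd("/usr/bin/pdcp", "/var/tmp/me"))

if __name__ == "__main__":
    print(" ".join(observed))
    print(" ".join(expected))
```

In the observed form the remote shell is asked to run /tmp/me with /var/tmp/me as its argument, which explains why the file is executed rather than copied.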

Original comment by mark.gro...@gmail.com on 7 Mar 2011 at 10:18

GoogleCodeExporter commented 9 years ago
Here are our environment variables:

[root@galaxy tmp]# env | grep PDSH
PDSH_WITHOUT_OPTIONS=netgroups
PDSH_RCMD_TYPE=ssh

Thanks,

Sean

Original comment by iffla...@gmail.com on 7 Mar 2011 at 10:37

GoogleCodeExporter commented 9 years ago
Ok, I think I found the problem, which was due to the following commit (r1262):

Author: mark.grondona <mark.grondona@bda7ed58-bd3f-82c7-2090-e0c2208e3b03>
Date:   Sat Feb 12 16:42:26 2011 +0000

    Preserve argument list in ssh rcmd module

    Pass remote command argument list to ssh in the same form
    it was passed to pdsh, instead of collapsing all args into
    a single command string.

    For example, when pdsh was called like so:

      pdsh -Rssh -w host0 cmd arg1 arg2 ...

    it would invoke ssh like this:

      ssh -2 -a -x -l user host0 "cmd arg1 arg2 .."

    i.e. ssh will see cmd and its args as a single argument. After
    this fix, ssh would be invoked as:

     ssh -2 -a -x -l user host0 cmd arg1 arg2 ..

    i.e ssh will see cmd and it args as separate arguments. (Since
    pdsh exec's ssh, these args are not subject to shell expansion
    again until they are invoked by the shell on the remote host)

    This change should only make a difference in cases where there
    is a difference between

     ssh host "cmd args..."

    and

     ssh host "cmd" "arg1" "arg2" ...

I will attach a patch that essentially reverts this change. If you don't mind, 
could you test it? Before I commit the official fix I need to write several 
tests for all the things this commit broke :-O
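
For anyone wondering when the two forms in the commit message differ: ssh joins its command arguments with single spaces and hands the resulting string to the remote user's shell, which splits it into words again. The sketch below approximates that with Python's `shlex`, which models POSIX-style quoting but not globbing or variable expansion; `remote_words` is a hypothetical helper, not anything from pdsh or ssh.

```python
import shlex


def remote_words(ssh_command_args):
    """Approximate the words the remote shell ends up with.

    The arguments are joined with single spaces (as ssh does) and the
    resulting string is split again using shell quoting rules.
    """
    command_line = " ".join(ssh_command_args)
    return shlex.split(command_line)


# One pre-quoted string: the inner quotes survive until the remote shell parses them.
single = remote_words(["echo 'a  b'"])

# Separate arguments: the argument boundary is lost in the join, so the
# embedded spaces are treated as word separators on the remote side.
separate = remote_words(["echo", "a  b"])

if __name__ == "__main__":
    print(single)
    print(separate)
```

In this approximation the single-string form keeps `a  b` as one word, while the separate-argument form splits it in two, so passing arguments separately does not by itself preserve argument boundaries on the remote host.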

Original comment by mark.gro...@gmail.com on 7 Mar 2011 at 11:24

GoogleCodeExporter commented 9 years ago

Original comment by mark.gro...@gmail.com on 7 Mar 2011 at 11:25

Attachments:

GoogleCodeExporter commented 9 years ago
I patched pdsh-2.24.tar.bz2, installed it, and it worked. I rebuilt the 
package with rpmbuild -ta pdsh-2.24.tar.bz2 as my own user and it built without 
any issues. Below is the output of pdcp, just as an FYI. Thanks again for 
your great support.

[root@galaxy tmp]# pdcp -w galaxy -q /tmp/me /var/tmp/me
-- PCP-specific options --
Infile(s)               /tmp/me
Outfile                 /var/tmp/me
Recursive               No
Preserve mod time/mode  No
Full program pathname   /usr/bin/pdcp
Remote program path     /usr/bin/pdcp

-- Generic options --
Local username          root
Local uid               0
Remote username         root
Rcmd type               ssh
one ^C will kill pdsh   No
Connect timeout (secs)  10
Command timeout (secs)  0
Fanout                  32
Display hostname labels Yes
Debugging               No

-- Target nodes --
galaxy
[root@galaxy tmp]# pdcp -w galaxy /tmp/me /var/tmp/me   
[root@galaxy tmp]# ll /var/tmp/me
-rwx------ 1 root root 27 Mar  8 08:17 /var/tmp/me
[root@galaxy tmp]#

Thanks,

Sean

Original comment by iffla...@gmail.com on 8 Mar 2011 at 2:20

GoogleCodeExporter commented 9 years ago
If you need me to test anything else for this just let me know.

Thanks,

Sean

Original comment by iffla...@gmail.com on 8 Mar 2011 at 2:21

GoogleCodeExporter commented 9 years ago
Great! Thanks for testing.
Your result verifies that the argument processing changes for ssh broke pdcp 
usage. I'm trying to decide whether to simply revert the change, as in the patch 
posted here, or to fix the issues that the change generated. Therefore, I may 
post another, different patch here for testing.

Also, I'm going to create a separate issue for the problems running the 
testsuite as root.

Thanks again,
mark

Original comment by mark.gro...@gmail.com on 8 Mar 2011 at 2:42

GoogleCodeExporter commented 9 years ago
This is likely the patch I will commit to fix this issue.
If you have time, could you give it a test? I will likely push some
changes to pdsh/trunk tomorrow, then release a pdsh-2.25 shortly
thereafter.

Original comment by mark.gro...@gmail.com on 9 Mar 2011 at 12:05

Attachments:

GoogleCodeExporter commented 9 years ago
I applied the new patch, recompiled and reinstalled.  pdsh and pdcp appear to 
work as they should.  I ran some copies and remote pdsh commands and didn't 
have any problems.

The only test that failed during the rebuild is a known breakage:
not ok 9 - dshbak -c can detect suffix with numeral # TODO known breakage

If you need me to test anything else let me know.

Thanks,

Sean 

Original comment by iffla...@gmail.com on 9 Mar 2011 at 2:02

GoogleCodeExporter commented 9 years ago
This issue was updated by revision r1304.

Added test to pdsh testsuite for {r}pdcp functionality with rcmd/ssh.

Original comment by mark.gro...@gmail.com on 9 Mar 2011 at 3:25

GoogleCodeExporter commented 9 years ago
This issue was closed by revision r1305.

Original comment by mark.gro...@gmail.com on 9 Mar 2011 at 3:25

GoogleCodeExporter commented 9 years ago
Issue 41 has been merged into this issue.

Original comment by mark.gro...@gmail.com on 19 Oct 2011 at 4:53