hjmangalam / parsyncfp2

MultiHost parallel rsync wrapper
Other
43 stars 6 forks source link

Execution of remote parsyncfp2 on SEND host fails if user's shell does not support `export` #7

Open gt3M opened 1 year ago

gt3M commented 1 year ago

https://github.com/hjmangalam/parsyncfp2/blob/476e3b6562e4fff865f9606d7d612df7b6e1ac37/parsyncfp2#L277-L279

@hjmangalam I'm curious as to your thoughts on approaches to support the case in which the remote user does not have a default shell that supports export (so, neither sh not bash nor ash nor ksh nor zsh, etc, but a C-based shell). I'm willing to submit a patch. One approach would be to get the remote shell when performing a checkhost, then building RPATHPFX using the appropriate command for setting environment variables. Another approach, though more invasive, could be to use ssh's SendEnv option to pass PATH to the remote host; downside is that the remote sshd has to be configured to AcceptEnv PATH, which some might consider a security problem. Lastly, wrapping the whole remote command in bash -c might be possible, though it could also introduce more quoting issues. I suppose one last option could be to not include the setting of PATH in the remote command at all and just assume that the requisite commands are in the remote user's default path.

hjmangalam commented 1 year ago

Hmmmmmm. If you look at the help output, you can see that I have included an option '--rpath' for this, but was moving towards removing it. Maybe it IS still needed for this purpose. Look for $RPATH, $RPATHPFX to see how it's used - pretty simple. My blindered assumption was that the remote shell would support 'export' and as you say, tcsh/csh do not. I'm going to guess that you use these..? I'd like to stay away from messing with users' ssh settings if possible. pfp2 does probe the remote configs prior to setting up various things, so it would be easy to ask for the remote shell and configure the command to match it. I think that would be my preferred way of going about it now. I'm busy with other things today, but I'll try to work on it tomorrow. Thanks again! Harry

On Sun, Mar 19, 2023 at 10:43 AM Gabe T. @.***> wrote:

https://github.com/hjmangalam/parsyncfp2/blob/476e3b6562e4fff865f9606d7d612df7b6e1ac37/parsyncfp2#L277-L279

@hjmangalam https://github.com/hjmangalam I'm curious as to your thoughts on approaches to support the case in which the remote user does not have a default shell that supports export (so, neither sh not bash nor ash nor ksh nor zsh, etc, but a C-based shell). I'm willing to submit a patch. One approach would be to get the remote shell when performing a checkhost, then building RPATHPFX using the appropriate command for setting environment variables. Another approach, though more invasive, could be to use ssh's SendEnv option to pass PATH to the remote host; downside is that the remote sshd has to be configured to AcceptEnv PATH, which some might consider a security problem. Lastly, wrapping the whole remote command in bash -c might be possible, though it could also introduce more quoting issues. I suppose one last option could be to not include the setting of PATH in the remote command at all and just assume that the requisite commands are in the remote user's default path.

— Reply to this email directly, view it on GitHub https://github.com/hjmangalam/parsyncfp2/issues/7, or unsubscribe https://github.com/notifications/unsubscribe-auth/AASF3Y32UY6KN5HCUAPNXMLW45AVBANCNFSM6AAAAAAWAHVO4I . You are receiving this because you were mentioned.Message ID: @.***>

--

Harry Mangalam

gt3M commented 1 year ago

As far as I can tell, --rpath allows one to specify a file path that is prepended to PATH, but PATH is still being set using export. I should have noticed it earlier, having seen the parsyncfp2 print the command it would execute via ssh many times, but I only encountered the issue when attempting to use a remote user whose shell is tcsh. I did some experiments, including trying to use /usr/bin/env PATH=... instead, but have not yet gotten it to work (the ssh commands exit, either because parsyncfp2 on the send hosts is failing or because ssh is failing; it's not yet clear). I will continue to experiment.

gt3M commented 1 year ago

Here, the check for Host in the remote user's .ssh/config file is also sh-centric. Fortunately, it's a simple check and could use test instead, I suspect:

qq(ssh $fqrhs 'test -e ~/.ssh/config && grep Host ~/.ssh/config' 2> /dev/null);

There will still be a problem if the user that calls initial parsyncfp2 calls it in (t)csh as the 2> won't work. If the goal is to just check if Host is in ~/.ssh/config and the lines containing Host don't need to be printed (i.e. the return code of grep is what's important), then redirecting stderr to /dev/null could be omitted. Else, both bash and tcsh support >& to redirect to a file, e.g.

qq(ssh $fqrhs 'test -e ~/.ssh/config && grep Host ~/.ssh/config' >& /dev/null);
gt3M commented 1 year ago

Turns out I'm wrong on 2> not working if the initial parsyncfp2 is called from tcsh. Both system and backticks in perl will use sh -c, so 2> really does redirect stderr even if the perl program is called from tcsh. I do think that in the case in which the check is only for Host in the remote ~/.ssh/config can be done by just checking the return code of grep, i.e.

$checkcmd = qq(ssh $fqrhs 'test -e ~/.ssh/config && grep Host ~/.ssh/config' 2>/dev/null);
if ( system($checkcmd) == 0 ) {
  WARN("[$fqrhs] has a ~/.ssh/config file containing 'Host' definitions, ...")
} else {
  if ( $VERBOSE > 2 ) { INFO("SEND Host [$fqrhs] No conflicting ~/.ssh/config.\n") }
}
hjmangalam commented 1 year ago

Just pushed the fix to what you were asking about. Works on Ubuntu with a SEND host using tcsh, not sure about distros. harry

On Sun, Mar 19, 2023 at 3:08 PM Gabe T. @.***> wrote:

Turns out I'm wrong on 2> not working if the initial parsyncfp2 is called from tcsh. Both system and backticks in perl will use sh -c, so 2> really does redirect stderr even if the perl program is called from tcsh. I do think that in the case in which the check is only for Host in the remote ~/.ssh/config can be done by just checking the return code of grep, i.e.

$checkcmd = qq(ssh $fqrhs 'test -e ~/.ssh/config && grep Host ~/.ssh/config' 2>/dev/null); if ( system($checkcmd) != 0 ) { WARN("[$fqrhs] has a ~/.ssh/config file containing 'Host' definitions, ...") } else { if ( $VERBOSE > 2 ) { INFO("SEND Host [$fqrhs] No conflicting ~/.ssh/config.\n") } }

— Reply to this email directly, view it on GitHub https://github.com/hjmangalam/parsyncfp2/issues/7#issuecomment-1475413550, or unsubscribe https://github.com/notifications/unsubscribe-auth/AASF3Y3ZUYYDRGKGC5KTU5DW457W3ANCNFSM6AAAAAAWAHVO4I . You are receiving this because you were mentioned.Message ID: @.***>

--

Harry Mangalam

gt3M commented 1 year ago

@hjmangalam Thanks! The csh support looks promising. Unfortunately, it seems that the --ro changes may now have made it such that --ro is a required option. I get this when attempting to use 2.56, having not specified --ro at all:

** FATAL ERROR **: You didn't have a '-' in your '--ro' string.
  You need to supply the options exactly as you would with rsync, with
  leading '-' or '--'. Try again..

Command I'm using is like:

parsyncfp2 --verbose=3 --nowait --NP=4 --chunksize=100M --commondir=${PWD}/pfp2 --startdir=${PWD} --hosts=${HOSTNAME}=hpcc-dx01-pr Downloads POD::/homes/01/${USER}

I know that this is a multihost-style command-line but a singlehost-style transfer, but it was an easy way to reproduce the problem. If I use the singlehost-style command, like the following, I do not get the --ro error:

parsyncfp2 --verbose=3 --nowait --NP=4 --chunksize=100M --commondir=${PWD}/pfp2 --startdir=${PWD} Downloads hpcc-dx01-pr:/homes/01/${USER}
hjmangalam commented 1 year ago

Ugh. Pushed too fast. Corrected. Sorry about that. Harry

On Tue, Mar 21, 2023 at 8:13 AM Gabe T. @.***> wrote:

@hjmangalam https://github.com/hjmangalam Thanks! The csh support looks promising. Unfortunately, it seems that the --ro changes may now have made it such that --ro is a required option. I get this when attempting to use 2.56, having not specified --ro at all:

FATAL ERROR : You didn't have a '-' in your '--ro' string. You need to supply the options exactly as you would with rsync, with leading '-' or '--'. Try again..

— Reply to this email directly, view it on GitHub https://github.com/hjmangalam/parsyncfp2/issues/7#issuecomment-1478018375, or unsubscribe https://github.com/notifications/unsubscribe-auth/AASF3Y5B4QK2XQ4AOSIPUWTW5HATHANCNFSM6AAAAAAWAHVO4I . You are receiving this because you were mentioned.Message ID: @.***>

--

Harry Mangalam

gt3M commented 1 year ago

@hjmangalam No worries. It's working with tcsh!

I'm still testing the --ro fixes. I'm running it right now with --ro="'-asl --excludes-from=${excludes_file_path}'" and rsync does not seem to be complaining. As you can see, the one downside to using a singe-quoted value is one needs to then wrap that in double-quotes to do any variable interpolation for the arguments being passed to rsync. I think it's working as all of my rsyncs are being run via ssh. When I get a chance, I can also test with a singlehost copy to see if double-quoting the single quotes will be an issue. It's possible I may need to eval and echo of the string to strip the double-quotes.

hjmangalam commented 1 year ago

Agreed - the ongoing problem with interpolated vars. I think there's a line where I have to prefix a var with 4 or 5 '\'s to get it to interpolate correctly. Let me know when this is a problem and I'll /try/ to fix it but it may be an issue for you to wrap it in enough quote and backslashes to get it to work..

On Tue, Mar 21, 2023 at 11:11 AM Gabe T. @.***> wrote:

@hjmangalam https://github.com/hjmangalam No worries. It's working with tcsh!

I'm still testing the --ro fixes. I'm running it right now with --ro="'-asl --excludes-from=${excludes_file_path}'" and rsync does not seem to be complaining. As you can see, the one downside to using a singe-quoted value is one needs to then wrap that in double-quotes to do any variable interpolation for the arguments being passed to rsync. I think it's working as all of my rsyncs are being run via ssh. When I get a chance, I can also test with a singlehost copy to see if double-quoting the single quotes will be an issue. It's possible I may need to eval and echo of the string to strip the double-quotes.

— Reply to this email directly, view it on GitHub https://github.com/hjmangalam/parsyncfp2/issues/7#issuecomment-1478372618, or unsubscribe https://github.com/notifications/unsubscribe-auth/AASF3YZWBIHSVB2FQLPL2L3W5HVOPANCNFSM6AAAAAAWAHVO4I . You are receiving this because you were mentioned.Message ID: @.***>

--

Harry Mangalam

gt3M commented 1 year ago

At least for the rsync options, another approach could be to give up on trying to preserve the default whitespace delimiter and have the user pass them in as a comma-delimited list, e.g.

parsyncfp2 --ro=-asl,--exclude-from=/path/to/file ...

It's been a long time since I've used Perl's GetOpt::Long, so I'm not sure if that will trip it up, or if it would correctly know that because --ro is preceded by a word boundary everything after the = is the value.

hjmangalam commented 1 year ago

Good suggestion. I'll put it on the TOTRY list. Just pushed the FS stuff. All the time I can afford today. harry

On Tue, Mar 21, 2023 at 11:32 AM Gabe T. @.***> wrote:

At least for the rsync options, another approach could be to give up on trying to preserve the default whitespace delimiter and have the user pass them in as a comma-delimited list, e.g.

parsyncfp2 --ro=-asl,--exclude-from=/path/to/file ...

It's been a long time since I've used Perl's GetOpt::Long, so I'm not sure if that will trip it up, or if it would correctly know that because --ro is preceded by a word boundary everything after the = is the value.

— Reply to this email directly, view it on GitHub https://github.com/hjmangalam/parsyncfp2/issues/7#issuecomment-1478400155, or unsubscribe https://github.com/notifications/unsubscribe-auth/AASF3YY4SLS6PCSN62PL5VTW5HX35ANCNFSM6AAAAAAWAHVO4I . You are receiving this because you were mentioned.Message ID: @.***>

--

Harry Mangalam

gt3M commented 1 year ago

@hjmangalam Apologies for bothering you again. With your latest commit, I am again getting the following:

** FATAL ERROR **: You didn't have a '-' in your '--ro' string.
  You need to supply the options exactly as you would with rsync, with
  leading '-' or '--'. Try again..

So it seems that the fix for the --ro fix got unfixed :)

hjmangalam commented 1 year ago

I cannot believe how that thing keeps coming back like some kind of zombie. I think I've finally blown it's head off. Pushed. Harry

On Tue, Mar 21, 2023 at 5:01 PM Gabe T. @.***> wrote:

@hjmangalam https://github.com/hjmangalam Apologies for bothering you again. With your latest commit, I am again getting the following:

FATAL ERROR : You didn't have a '-' in your '--ro' string. You need to supply the options exactly as you would with rsync, with leading '-' or '--'. Try again..

So it seems that the fix for the --ro fix got unfixed :)

— Reply to this email directly, view it on GitHub https://github.com/hjmangalam/parsyncfp2/issues/7#issuecomment-1478742722, or unsubscribe https://github.com/notifications/unsubscribe-auth/AASF3Y7CMRCN4GBBNKV4I53W5I6LZANCNFSM6AAAAAAWAHVO4I . You are receiving this because you were mentioned.Message ID: @.***>

--

Harry Mangalam