pssh / parallel-ssh

PSSH provides parallel versions of OpenSSH and related tools. Included are pssh, pscp, prsync, pnuke, and pslurp. The project includes psshlib which can be used within custom applications.

the "-h" and "-H" options don't override the PSSH_HOSTS environment variable #18

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. As user lambda, do "pssh -i -H host1 date"
2. As root, do "pssh -i -H host1 date"

What is the expected output? What do you see instead?
What is expected as user and root is:
$ pssh -i -H host1 date
[1] 16:42:31 [SUCCESS] host1
Thu Mar  4 16:37:31 CET 2010

Instead, as root, you get:
# pssh -i -H host1 date
[1] 16:37:31 [SUCCESS] host1
Thu Mar  4 16:37:31 CET 2010
[2] 16:37:31 [SUCCESS] host2
Thu Mar  4 16:37:31 CET 2010
[3] 16:37:31 [SUCCESS] host3
Thu Mar  4 16:37:31 CET 2010
[4] 16:37:31 [SUCCESS] host4
Thu Mar  4 16:37:31 CET 2010
[5] 16:37:31 [SUCCESS] host5
Thu Mar  4 16:37:31 CET 2010
[6] 16:37:31 [SUCCESS] host6
Thu Mar  4 16:37:31 CET 2010
[7] 16:37:31 [SUCCESS] host7

All hosts listed in your host file will display the date although I specified 
the option -H.

What version of the product are you using? On what operating system?
pssh-2.1-1.fc12.noarch on fedora 12.

Original issue reported on code.google.com by raoul.be...@gmail.com on 4 Mar 2010 at 3:49

GoogleCodeExporter commented 9 years ago
The "-H" option adds hosts (allowing you to specify both "-h hosts.txt" and
"-H host42", or even to specify "-H" multiple times).  Is there any chance that
the root user has a value set for the PSSH_HOSTS environment variable?  If there
were a hosts file specified in PSSH_HOSTS, then pssh would use both that file
and the "-H" option.

Original comment by amcna...@gmail.com on 4 Mar 2010 at 6:56
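
As a rough illustration of the additive behaviour described in this comment,
here is a minimal sketch. It is not pssh's actual source: the function name
collect_hosts and its parameters are hypothetical, and hosts files are passed
in as already-parsed lists so the example stays self-contained.

```python
def collect_hosts(env_hosts, file_hosts, cli_hosts):
    """Hypothetical sketch of additive host collection.

    env_hosts:  hosts read from the file named by PSSH_HOSTS (may be empty)
    file_hosts: a list of host lists, one per "-h" option
    cli_hosts:  hosts given directly with "-H"
    """
    hosts = list(env_hosts)       # the variable behaves like one more "-h"
    for one_file in file_hosts:
        hosts.extend(one_file)    # every "-h" file adds its hosts
    hosts.extend(cli_hosts)       # every "-H" adds a host; nothing is replaced
    return hosts


# Assumed setup for root in the report: PSSH_HOSTS names a file with 7 hosts.
env_hosts = ["host%d" % i for i in range(1, 8)]
result = collect_hosts(env_hosts, [], ["host1"])
print(result)
```

Under these assumptions host1 ends up in the list twice, which matches the
duplicate run described later in the thread.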

GoogleCodeExporter commented 9 years ago
Yes, root has this PSSH_HOSTS environment variable set to some file but when
you use option -H, pssh should bypass that PSSH_HOSTS environment variable.
If host1 is in that PSSH_HOSTS file, then "pssh -i -H host1 date" will print
the date twice for host1! Only host1 should print the date.

Original comment by raoul.be...@gmail.com on 4 Mar 2010 at 7:38

GoogleCodeExporter commented 9 years ago
I respectfully disagree.  I frequently use the "-h" option multiple times or
combine it with the "-H" option, and the PSSH_HOSTS environment variable is the
same as passing a "-h" option.  In fact, most of the time I use pssh, I need to
have the command run 4 times per host (to effectively utilize 4 core
processors).

Would it be practical to run:

PSSH_HOSTS="" pssh -i -H host1 date

when you want to use the "-H" option instead of using PSSH_HOSTS?
Alternatively, would it be practical to not use the PSSH_HOSTS environment
variable and to just explicitly specify "-h" when you want to use the hosts
file and leave it off when you don't want to use it?

I'm worried that although overriding might be more helpful for your use case,
it might be less intuitive for other users.  I would love to hear any
additional thoughts that you have.  Thanks for your help.

Original comment by amcna...@gmail.com on 4 Mar 2010 at 7:51
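
The PSSH_HOSTS="" prefix above changes the variable for that single command
only. When pssh is launched from a script, a similar effect can be had by
building a per-call environment. The sketch below assumes nothing about pssh
itself; the helper name env_without_pssh_hosts is hypothetical, and it removes
the variable rather than setting it to an empty string.

```python
import os


def env_without_pssh_hosts(base_env=None):
    """Return a copy of the environment with PSSH_HOSTS removed.

    Hypothetical helper: the returned dict could be passed as the `env`
    argument of subprocess.run() when invoking pssh.
    """
    env = dict(os.environ if base_env is None else base_env)
    env.pop("PSSH_HOSTS", None)   # the caller's own environment is untouched
    return env


original = {"PATH": "/usr/bin", "PSSH_HOSTS": "/root/hosts.txt"}
clean = env_without_pssh_hosts(original)
print(clean)
```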

GoogleCodeExporter commented 9 years ago

Original comment by amcna...@gmail.com on 4 Mar 2010 at 7:51

GoogleCodeExporter commented 9 years ago
When I first added the -H option last year, it was to bypass the PSSH_HOSTS
environment variable because I wanted to run a command on several hosts
but not all.
So
pssh -i -H host1 -H host2 "some_command"
would run some_command on host1 and host2.
This seems more logical to me than if it runs some_command on host1, host2 and
all of the hosts listed in the PSSH_HOSTS file, where host1 and host2
will probably be listed as well.

I don't think users will think of adding PSSH_HOSTS="" before the pssh command 
and then you might run a command on all your hosts defined in the 
PSSH_HOSTS file, a command that you actually want to run only on some hosts and 
not on some others (am I clear here?). That can be dangerous.

If you want to run a command n times on the same host, then we should think of
an option like "-n 4" to actually run a command on a host using 4
core processors.
So
pssh -i -H host1 -H host2 -n 4 "some_command"
would run some_command on 4 cores on host1 and host2 only.

Original comment by raoul.be...@gmail.com on 4 Mar 2010 at 9:54

GoogleCodeExporter commented 9 years ago
I think it's critical that the "-h" and "-H" options allow hosts to be
specified multiple times.  An option like "-n 4" wouldn't work because it
doesn't allow the number of connections to vary on a host-by-host basis.  For
example, I can do:

pssh -h quad_core_hosts -h quad_core_hosts -h quad_core_hosts -h quad_core_hosts -h dual_core_hosts -h dual_core_hosts some_command

To be honest, I've never actually used the PSSH_HOSTS environment variable.  I
like setting things like the timeout in an environment variable, but I don't
like dealing with the hosts file this way (for some of the reasons you
describe).  Out of curiosity, what is the benefit of using PSSH_HOSTS instead
of just passing a "-h" option?

Since I don't really use it anyway, I might be willing to change my mind on the
issue of overriding the PSSH_HOSTS environment variable.  However, I think it
would be important to get feedback from other users to find out what really
seems to be the most intuitive.  It would be particularly helpful to get
feedback from other people that use the PSSH_HOSTS environment variable.  I'll
post an email to the mailing list to try to get others to share their opinions.

Original comment by amcna...@gmail.com on 4 Mar 2010 at 10:40
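
The effect of repeating a hosts file can be sketched by counting how often each
host appears in the combined list. This is only an illustration of the idea in
the comment above: the function name connections_per_host and the file contents
are hypothetical.

```python
from collections import Counter


def connections_per_host(host_lists):
    """Count how many times each host appears across all given host lists."""
    counts = Counter()
    for hosts in host_lists:
        counts.update(hosts)
    return counts


# Hypothetical contents of the two hosts files.
quad_core_hosts = ["quad1", "quad2"]
dual_core_hosts = ["dual1"]

# Mirrors passing "-h quad_core_hosts" four times and "-h dual_core_hosts" twice.
plan = connections_per_host([quad_core_hosts] * 4 + [dual_core_hosts] * 2)
print(plan)
```

Each quad-core host is listed four times and each dual-core host twice, which a
single global "-n 4" could not express.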

GoogleCodeExporter commented 9 years ago
Well, the good point of using PSSH_HOSTS is that you don't need to type the
hosts on the command line, especially if you have hundreds of hosts.

Any user of pssh would want it to work according to his needs.
So I think that we should think of all possible uses of pssh and add the
necessary options.

If you want to specify the number of cores on a host-by-host basis, we could
think of something like:
pssh -i -H [user@]host1[:[port]:n] -h quad_core_hosts[:n]
Where n would be the number of core processors.

Original comment by raoul.be...@gmail.com on 4 Mar 2010 at 11:02
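
To make the proposed "[user@]host[:[port]:n]" form concrete, here is a small
parsing sketch. The syntax was only ever a suggestion in this thread, so the
function parse_host_spec and its return shape are hypothetical. It follows the
proposal literally, so a bare "host:port" is rejected, and IPv6 addresses are
not handled.

```python
def parse_host_spec(spec):
    """Parse '[user@]host[:[port]:n]' into (user, host, port, n).

    Hypothetical syntax from the discussion; n defaults to 1 when omitted.
    """
    user = None
    if "@" in spec:
        user, spec = spec.split("@", 1)
    parts = spec.split(":")
    host = parts[0]
    port, n = None, 1
    if len(parts) == 3:
        port = int(parts[1]) if parts[1] else None   # port may be left empty
        n = int(parts[2])
    elif len(parts) != 1:
        raise ValueError("expected host or host:[port]:n, got %r" % spec)
    return user, host, port, n


print(parse_host_spec("root@host1:22:4"))
print(parse_host_spec("host1::4"))
```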

GoogleCodeExporter commented 9 years ago
When I have hundreds of hosts, I use the "-h" option to specify a hosts file
instead of putting the filename in PSSH_HOSTS.  Out of curiosity, what's the
use case for using the environment variable instead of the "-h" option to
specify the filename?

Are you trying to argue that the "-h" and "-H" options shouldn't be able to
specify hosts multiple times?  I thought that your main point was that
PSSH_HOSTS should be ignored if either "-h" or "-H" is specified.  I would
prefer to focus on one issue at a time.

Original comment by amcna...@gmail.com on 4 Mar 2010 at 11:08

GoogleCodeExporter commented 9 years ago
OK, let me be clear:

1- Should -H used alone override PSSH_HOSTS environment variable or not? I think yes.
2- Should -h used alone override PSSH_HOSTS environment variable or not? I think yes.
3- Can -H and -h be used together or not? I think yes.
4- Should -H used together with -h override PSSH_HOSTS environment variable or not? I think yes.
5- If none of these 2 options are used, then PSSH_HOSTS environment variable will be used.
6- If none of these 2 options are used and PSSH_HOSTS environment variable is undefined, then print an error message.

What do you think?

Original comment by raoul.be...@gmail.com on 5 Mar 2010 at 9:38
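
The six points above amount to a simple precedence rule, sketched below. This
shows the proposal, not what pssh did at the time; the function resolve_hosts
and its parameters are hypothetical, and env_hosts is None when PSSH_HOSTS is
unset.

```python
def resolve_hosts(cli_hosts, file_hosts, env_hosts):
    """Hypothetical resolver for the proposed override rules.

    cli_hosts:  hosts given with -H
    file_hosts: hosts read from -h files
    env_hosts:  hosts from the PSSH_HOSTS file, or None if the variable is unset
    """
    if cli_hosts or file_hosts:
        # Points 1-4: any -h or -H on the command line overrides PSSH_HOSTS,
        # and -h and -H may be combined with each other.
        return list(file_hosts) + list(cli_hosts)
    if env_hosts is not None:
        # Point 5: with no -h or -H, fall back to PSSH_HOSTS.
        return list(env_hosts)
    # Point 6: nothing specified anywhere.
    raise ValueError("no hosts given: use -h, -H, or set PSSH_HOSTS")


print(resolve_hosts(["host1"], [], ["host1", "host2", "host3"]))
```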

GoogleCodeExporter commented 9 years ago
Thank you for posting this list.  I think it helps clarify the discussion.  The
later items follow from the first one (items 2 and 4 should match 1).  So the
question comes down to: should 1 be yes or should 1 be no?

The current behavior of 1 is no.  Since I don't use the PSSH_HOSTS environment
variable, I don't have much intuition about the matter.  I could be persuaded
to change 1 to yes, but I would really feel more comfortable if we could get
feedback from a few other people who use PSSH_HOSTS.

Original comment by amcna...@gmail.com on 5 Mar 2010 at 5:33

GoogleCodeExporter commented 9 years ago
But if you don't use the PSSH_HOSTS environment variable, then 1 can be 'yes'
and it won't change anything for you.

Original comment by raoul.be...@gmail.com on 5 Mar 2010 at 5:54

GoogleCodeExporter commented 9 years ago
Thank you for posting this list.  I think it helps clarify the discussion.  The
later items follow from the first one (items 2 and 4 should match 1).  So the
question comes down to: should 1 be yes or should 1 be no?

The current behavior of 1 is no.  Since I don't use the PSSH_HOSTS environment
variable, I don't have much intuition about the matter.  I could be persuaded
to change 1 to yes, but I would really feel more comfortable if we could get
feedback from a few other people who use PSSH_HOSTS.

Original comment by amcna...@gmail.com on 5 Mar 2010 at 5:59

GoogleCodeExporter commented 9 years ago
Sorry for the double-post; I was on a broken wireless connection.

With respect to comment #11, it might not change things for me, but it might
change things for other people.  If it were just about me, I would remove
PSSH_HOSTS entirely, since I think it just adds complexity and confusion.  By
the way, is there any particular reason you can't just use the "-h" option,
perhaps with aliases as suggested by Jan on the mailing list?

Original comment by amcna...@gmail.com on 5 Mar 2010 at 6:14

GoogleCodeExporter commented 9 years ago
raul.beauduin, I haven't heard from you in a while, but I don't want the issue
to fall through the cracks.  I really would like to come up with a good
resolution to this.  Have you had a chance to see Jan's suggestion of creating
an alias using the "-h" option?  I would love to hear a little more about how
the PSSH_HOSTS environment variable has been helpful for you and to know
whether that need could be served as well with the "-h" option.

Original comment by amcna...@gmail.com on 15 Mar 2010 at 5:21

GoogleCodeExporter commented 9 years ago
The problem is that when I installed the last version of pssh, I had this
PSSH_HOSTS environment variable already set and I ran the pssh command with the
-h option thinking that it would run pssh only on the host specified with that
-h option. Instead it ran pssh on all the hosts specified in the PSSH_HOSTS
environment variable and twice on the host specified with the -h option because
it was also in the PSSH_HOSTS file. Fortunately, the command I ran through pssh
was not harmful.
Let me remind you that the previous version of pssh did not have this -h
option. So when I ran pssh with -h, I got no warning.
So I really think -h should bypass the PSSH_HOSTS variable.
The PSSH_HOSTS variable is helpful to me because I have many identical machines
on which I run the same commands. And sometimes I run pssh only on several
machines.

To solve this, we can keep -h doing what it does today and maybe add another
option (-x) that would bypass the PSSH_HOSTS variable.

Original comment by raoul.be...@gmail.com on 15 Mar 2010 at 5:50

GoogleCodeExporter commented 9 years ago
One of the things I don't like about PSSH_HOSTS is that it makes it easy to
have difficult problems like the one that you experienced.  The "-h" option
existed in earlier versions, but the behavior changed when "-h" grew the
ability to be specified multiple times (to specify more than one hosts file).

Jan made the suggestion on the mailing list of creating aliases for each of the
commonly-used hosts:

# Connecting to hosts A, B, C, D
alias pssh4='pssh -h /path/to/hosts1'
# Connecting to hosts B, C
alias pssh2='pssh -h /path/to/hosts2'

This seems like it would be more clear and less error-prone than using the
PSSH_HOSTS variable.  Would this work for you?  My intuition is that this would
be even better for you than PSSH_HOSTS.

I think you've convinced me that the current behavior can be confusing.  In
fact, I'm really starting to think that there's no way to do PSSH_HOSTS without
it being confusing.  If you like the aliases technique, then I would strongly
consider deprecating the PSSH_HOSTS environment variable.  If not, then I think
it would at least make sense to make it an error to specify "-h" or "-H" when
the PSSH_HOSTS environment variable is set.  However, I'm really inclined to
get rid of the environment variable completely if you don't have objections (my
intuition is that you would find Jan's aliases approach to be much more
appealing).

What are your thoughts?  Thanks again for your participation; you've been very
helpful.

Original comment by amcna...@gmail.com on 15 Mar 2010 at 7:38
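
The fallback idea in this comment, treating "-h" or "-H" together with a set
PSSH_HOSTS as an error, might look roughly like the sketch below. The function
name check_host_sources and the error wording are hypothetical; this
alternative was only floated in the discussion.

```python
def check_host_sources(cli_options_given, environ):
    """Reject ambiguous combinations of command-line hosts and PSSH_HOSTS.

    cli_options_given: True if "-h" or "-H" appeared on the command line
    environ:           a mapping such as os.environ
    """
    if cli_options_given and environ.get("PSSH_HOSTS"):
        raise ValueError(
            "PSSH_HOSTS is set; unset it or drop the -h/-H options"
        )


# Passes silently: only one source of hosts is in play.
check_host_sources(True, {})
check_host_sources(False, {"PSSH_HOSTS": "/root/hosts.txt"})
print("ok")
```

Failing loudly avoids both silent merging and silent overriding, at the cost of
making users who rely on the variable unset it for one-off runs.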

GoogleCodeExporter commented 9 years ago
Ok then, kick it out!

Original comment by raoul.be...@gmail.com on 15 Mar 2010 at 10:21

GoogleCodeExporter commented 9 years ago
If you're sure you're happy, I'll do it.

Original comment by amcna...@gmail.com on 15 Mar 2010 at 11:16

GoogleCodeExporter commented 9 years ago
I'm sure.

Original comment by raoul.be...@gmail.com on 15 Mar 2010 at 11:18

GoogleCodeExporter commented 9 years ago
I'm a bit late to the party, but FWIW, idiomatic behaviour of most *IX tools is
that the env vars establish default behaviour in the absence of command-line
options, and that the command-line always overrides the env vars.
There are plenty of exceptions, however: ssh(1) itself is the first one that
comes to mind, IIRC it doesn't handle env vars 100% consistently - some get
merged, some get overridden, some get appended to.
My expectation from 20 years' use of various UNIXen is that cmdline overrides
env, unless otherwise documented - but as long as it's documented, anything
goes.

Original comment by athomps...@gmail.com on 19 Apr 2010 at 6:24

GoogleCodeExporter commented 9 years ago
Sorry for not commenting on this issue for a while.  Back in March I added a 
commit to deprecate PSSH_HOSTS, but I haven't commented on it here.  Anyway, 
this should take effect with the next release, and although PSSH_HOSTS will 
still work, it will report a warning to the user.

athompso99, thanks for your comments.  It's true that documentation makes all 
the difference.  In this particular case, I'm not sure it's worth the 
confusion, even with documentation. :)

Anyway, I'll mark this issue as started, and after a few releases we can 
actually remove PSSH_HOSTS.

Original comment by amcna...@gmail.com on 10 Jan 2011 at 2:42
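
A deprecation step of the kind described here, where the variable still works
but triggers a warning, could be sketched as follows. This is not the actual
commit: the function name hosts_file_from_env, the warning text, and the choice
of FutureWarning (which Python shows to end users by default) are assumptions
for illustration.

```python
import warnings


def hosts_file_from_env(environ):
    """Return the hosts file named by PSSH_HOSTS, warning that it is deprecated.

    Hypothetical sketch: the variable is still honoured, but its use is flagged.
    """
    path = environ.get("PSSH_HOSTS")
    if path:
        warnings.warn(
            "PSSH_HOSTS is deprecated; pass the hosts file with -h instead",
            FutureWarning,
            stacklevel=2,
        )
    return path or None


print(hosts_file_from_env({"PSSH_HOSTS": "/root/hosts.txt"}))
```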