Closed stefangweichinger closed 3 years ago
don't you already use _play_context elsewhere to get the remote user ?
Looks good to me. Just beware that
self._play_context
may have it's own issues we are not aware of since Ansible recommends usingself.get_option()
.
I only wrote the PR to make the patch merge-able. You decide which function/method to use here.
@dmp1ce @stefangweichinger @boucman I'm sorting through all the new PRs right now, and this one is the first one. Does this solve your original problem with the variables?
It probably does solve @stefangweichinger issue but it might have unintended consequences because get_option
is recommended over _play_context
.
I can't tell right now, the original bug in debops is https://github.com/debops/debops/issues/1731 and @boucman reported it. Could you test, @boucman ?
The tests included in PR #40 ought to verify if this fix is working or not.
both the fixes work but like @dmp1ce said it might have unintended consequences.
Ok, I need to be honest with myself: I will not find the time to test anything in the near futur.
I am reasonably confident this will fix my bug since that is almost how I did my workaround. Don't wait for me at this point, or we will block on this forever :(
thanks @boucman
Could/should we ask someone from the ansible-project to have a look and decide if get_option
or _play_context
?
If @AnvithLobo says that both ways works, and no one is sure which one is the correct way to solve this, plus there might be side effects, I'm very much in favor of the suggestion from @stefangweichinger to include external experience.
I'm going to merge whatever bugfix solves this problem, if someone can confirm that it's solved the "proper" way.
My vote is for #40 #41 and #42 . #42 is built on #41 and #40. I only broke them up into 3 PRs for clarity.
Yeah i would vote for @dmp1ce's PRs as we don't modify any other code how we are getting args other than just removing LXC version checking. Since anything above ubuntu 16.04 will have lxc version 2 or above installed by default i dont think this check is necessary any longer. But we do need LXD version checking in Tests which we can improve on once these PRs are merged.
I still don't get why you say that both ways work, only one is potentially correct, but yet this should be merged as it is without any confirmation.
For the version checks: I agree that the code for v1 should be removed, it's no longer supported. I tend to disagree to remove all the checks, just to potentially add this later. But I don't have strong feelings about this. However the tests should have a version check, and this can be added in the PR before merging it. This does not need another PR just to fix the PR which is missing that.
@dmp1ce could you point us to some docs or so pointing out why _play_context
is problematic?
I am trying to ask someone on IRC #ansible, but as mentioned, I am no competent programmer, so my understanding is limited.
I also agree on adding/fixing the tests within this PR, as @andreasscherbaum said.
I had a conversation in the #ansible room on IRC.
@tadeboro told me a few things, I quote some lines here (and he told me to point out that he is "NOT part of the core Ansible team, I am just a developer who wrote too much ansible-related code ;)" ):
tadeboro: sgw1: get_option is what you want here.
(11:40:31) sgw1: could you point me at the "why" ? I have to bring arguments there ;-) thanks
(11:41:42) tadeboro: sgw1: You can start with https://github.com/ansible/ansible/blob/2cbfd1e350cbe1ca195d33306b5a9628667ddda8/lib/ansible/plugins/connection/__init__.py#L66-L68 ;)
(11:42:31) sgw1: I will try to, thanks. We should get on with that PR and I thought I ask "upstream" here
(11:43:35) tadeboro: Play context is not something new code should use and should move to get_option/set_options functions.
(11:44:06) sgw1: I just have to find out why the "get_option" way didn't work in the first place, the original issue was https://github.com/andreasscherbaum/ansible-lxc-ssh/issues/37
(11:44:58) sgw1: ssh.py was used as kind of a template for lxc_ssh.py ... / https://github.com/andreasscherbaum/ansible-lxc-ssh/issues/37#issuecomment-859130298
(11:45:56) sgw1: we face some problems with the user context, that's why someone came up with using play context.
(11:46:13) tadeboro: sgw1: Ugh, the code for that connection plugin is really old and does not call set_options() anywhere.
(11:46:35) sgw1: tadeboro: you mean the code of ssh.py is really old?
(11:48:05) tadeboro: Actually, both ;) The ssh connection plugin is really old and the lxc-ssh probably copied over the old stuff ;)
(11:48:47) sgw1: Isn't ssh.py the default way of connecting and should therefore be up to date and modern?
(11:49:32) tadeboro: It is a default way and it works. But it has also been there almost from the start of the ansible, so it is definitelly not the most modern piece of code there is.
(11:49:56) tadeboro: A bit of "do not fix things that are not broken" at work here ;)
(11:51:53) sgw1: tadeboro: I see and understand. But the "copied code" fails for the lxc_ssh.py plugin, so we have to find a way somehow.
(11:53:15) tadeboro: If you do not intend to modernize the codebase, then using the _play_content is way to go. This does mean that you are adding some technical debt, but if migrating things is not an options, it will have to do.
As a pointer to some modern way he mentioned:
https://steampunk.si/blog/let-us-give-ansible-a-rest/
Maybe https://steampunk.si/blog/let-us-give-ansible-a-rest/ cames in handy? The connection plugin example there (https://github.com/xlab-steampunk/ansible-pms-plugins/blob/master/connection/connection_plugins/pms.py) only uses modern constructs and you will see that there is no play_context being used anywhere.
I will have a look at that in the afternoon.
I hope this helps to decide ....
thanks @tadeboro
EDIT: sorry for the cutNpaste, not all URLs clickable ... but the tl;dr is clear, I assume
I found https://github.com/ansible/ansible/pull/70438 upstream and in
I noticed that lxc_ssh.py
is missing that u-prefix. Maybe not relevant to the current issue, but worth the fix anyway, I assume.
EDIT: I had removed that "u" myself in 1bbdf7910b6a4b3eae50e2df0a1b07d31377dbb2 .. sorry
I still don't get why you say that both ways work, only one is potentially correct, but yet this should be merged as it is without any confirmation.
For the version checks: I agree that the code for v1 should be removed, it's no longer supported. I tend to disagree to remove >all the checks, just to potentially add this later. But I don't have strong feelings about this. However the tests should have a >version check, and this can be added in the PR before merging it. This does not need another PR just to fix the PR which is >missing that.
ah by both solution works I meant it solves the problem currently at hand. One is dirty fix to fix the current problem at hand without much thought into what other issues might cause in the future and the other Doesn't modify much of the code and also solves the problem. By the looks of it people in #Ansible
also think this is the right way to go.
I don't have much experience with Ansible. So i couldn't predict what troubles play_context
might cause hence i just suggested it as "it works for now"
Fix.
About the test yes. But i think it could be improved and 17 seperate checks might be a overkill IMO.
https://github.com/andreasscherbaum/ansible-lxc-ssh/issues/37#issuecomment-864069967
Like this user suggests in case we need version checking for future use we would probably need to do that after the __init__
So is this PR ready to be merged in? It evolved quite a bit, and in the end it's not very long now.
@stefangweichinger?
@andreasscherbaum
I have to look through all this once again, I don't have the full overview anymore right now. And I still don't fully understand some statements from others.
IMO it fixes the issue, keeps in the version check (which is nice to have just to be safe), and uses get_option
and not that problematic play context
. So to me it looks good. Although I am not a competent python coder at all.
I don't fully understand the tests and if there are false positives etc ... If there are, I think there should be a new and separate PR for that.
@stefangweichinger Ok, then I wait for your feedback.
Why me? ;-)
My branch https://github.com/stefangweichinger/ansible-lxc-ssh/commits/alltogether was a try to come up with a combination of more or less all the suggested PRs at that time. IMO it should contain a working py-script.
I used it in a local test setup and it was OK for me. I don't know about other issues. I asked a customer to use it in a rollout this week.
I'm as confused as you because of the many PRs and changes.
@andreasscherbaum jitsi in the next days?
@stefangweichinger Yes, please!
@andreasscherbaum write to me at office @ oops co at ... maybe tmrw
@dmp1ce @AnvithLobo We plan to have a Jitsi call early next week, and discuss which PRs to merge, in which order. If you plan to join, please drop me an email at
I can probably be available any day between 9am - 4pm EST. I couldn't find your email @andreasscherbaum
I am just going to suggest #40 should be merged first, because it implements valid tests. The current tests are showing false positives!
Next I will recommend #41 because it is the most simple change to get tests passing.
@dmp1ce My email address is:
my first name
@ my last name
dot
la
And I see what GitHub did with the < and > in the previous text...
Scheduled a meeting for Monday, 13:00 UTC (15:00 CEST). Will post the Jitsi link here shortly before the meeting. The people where I know the email address got a calendar invite.
The meeting link is: https://meet.jit.si/lxc-ssh-merging
Course of action (from our discussion):
Merge #40
Re-test #38, then merge it if it works Question: does #40 work with 38, and vice versa
Remove #41, #42
Re-check #39
Ah, doesn't work as expected, as re-running the tests will use the same GITHUB_SHA and GITHUB_REF.
@stefangweichinger Can you please pull the changes from #40 into this branch, rebase, and then push this again? This should trigger the tests with the new code from #40. Thanks!
@andreasscherbaum I cherry-picked 94d2d0c628306de52cf159d5edb335064ebda891 from #40 ... / should work, right?
@stefangweichinger If you rebase to HEAD, it should work. The previous commits are the badges you added.
@andreasscherbaum "git rebase HEAD"; git push origin master" (in my branch "remote_user") doesn't change anything anymore. So I think above state is it .. unless I misunderstand you.
@stefangweichinger Looks good, I think.
@dmp1ce Can you please look into the GH Actions tests and see if that is what you expect from your tests? Thanks!
Compare the test for this PR with the tests from the latest commit.
The tests in the latest commit fail at the "Run test" step with ":
TASK [Gathering Facts] *********************************************************
fatal: [testhost]: FAILED! => {"msg": "Failed to connect to the host via ssh: runner@10.242.67.247: Permission denied (publickey,password).\r\n"}
In this PR run test passes!
In this PR run test passes!
Isn't that ... good?
In this PR run test passes!
Isn't that ... good?
Yes! All good. This is the result I expected too. 🙂
Yes! All good. This is the result I expected too.
great. Sounds as if our plan works.
@dmp1ce Which means I can go ahead and merge this PR here?
Yes, it is fine. I had to double check locally because Github shows a much bigger diff than there actually is.
This is the actual diff from this PR.
diff --git a/lxc_ssh.py b/lxc_ssh.py
index 37a062f..04684e7 100644
--- a/lxc_ssh.py
+++ b/lxc_ssh.py
@@ -517,8 +517,8 @@ class Connection(ConnectionBase):
self.user = self._play_context.remote_user
self.control_path = None
self.control_path_dir = None
- self.lxc_version = None
+ def _set_version(self):
# LXC v1 uses 'lxc-info', 'lxc-attach' and so on
# LXC v2 uses just 'lxc'
(returncode2, stdout2, stderr2) = self._exec_command("which lxc", None, False)
@@ -535,6 +535,10 @@ class Connection(ConnectionBase):
raise AnsibleConnectionFailure("Cannot identify LXC version")
sys.exit(1)
+ def set_options(self, *args, **kwargs):
+ super(Connection, self).set_options(*args, **kwargs)
+ self._set_version()
+
# The connection is created by running ssh/scp/sftp from the exec_command,
# put_file, and fetch_file methods, so we don't need to do any connection
# management here.
@@ -690,7 +694,7 @@ class Connection(ConnectionBase):
to_bytes(a, errors="surrogate_or_strict")
for a in self._split_ssh_args(ssh_args)
]
- self._add_args(b_command, b_args, "ansible.cfg set ssh_args")
+ self._add_args(b_command, b_args, u"ansible.cfg set ssh_args")
# Now we add various arguments that have their own specific settings
# defined in docs above.
@@ -750,6 +754,7 @@ class Connection(ConnectionBase):
)
self.user = self.get_option("remote_user")
+
if self.user:
self._add_args(
b_command,
:tada:
suggested by @AnvithLobo in https://github.com/andreasscherbaum/ansible-lxc-ssh/issues/37