Open zscholl opened 4 years ago
Installation/configuration behavior was an explicit item discussed with Amazon Linux, Canonical, and Red Hat. The exact decision, as signed off by Amazon leadership and our partners, is that if during install EC2 Instance Connect does not detect that the instance already has a custom AuthorizedKeysCommand then it will attempt to modify the system sshd as part of the installation process. The exact mechanism, however, was disagreed upon by each distro vendor.
The detection mechanism is pretty straightforward: during the post-install scriptlets, check whether AuthorizedKeysCommand and AuthorizedKeysCommandUser are either absent from /etc/ssh/sshd_config or present only as the commented-out defaults #AuthorizedKeysCommand none / #AuthorizedKeysCommandUser nobody.
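The check described above could be sketched roughly as follows. This is not the package's actual scriptlet code: the function name `has_custom_akc` is made up for illustration, and the pattern is a simplification that only looks for uncommented directives. It treats an uncommented `AuthorizedKeysCommand none` as "set" and does not interpret Match blocks or follow Include files.

```shell
#!/bin/sh
# Hypothetical helper (not taken from the ec2-instance-connect package).
# Returns 0 if the given sshd_config contains an uncommented
# AuthorizedKeysCommand or AuthorizedKeysCommandUser directive, 1 otherwise.
has_custom_akc() {
    config="${1:-/etc/ssh/sshd_config}"
    # sshd keywords are case-insensitive, hence -i.
    # Lines starting with '#' (the commented-out defaults) do not match.
    grep -Eiq '^[[:space:]]*AuthorizedKeysCommand(User)?[[:space:]]+[^[:space:]]' "$config"
}

# Example use in an install script (assumed flow, for illustration only):
#   if has_custom_akc /etc/ssh/sshd_config; then
#       echo "custom AuthorizedKeysCommand found, leaving sshd untouched"
#   fi
```

A check like this only sees the state of sshd_config at the moment it runs, so it cannot account for a directive that is added after the package is already installed.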
Since you have an AuthorizedKeysCommand configured, the post-installation scriptlet should not have installed our override. Can you check if the file /lib/systemd/system/ssh.service.d/ec2-instance-connect.conf exists? If it does, then something in the package install/upgrade logic failed.
Deleting this file or uninstalling the ec2-instance-connect package will remove the override.
If you still would like EIC supported alongside your own AuthorizedKeysCommand you will need to add a hook to our scriptlet somewhere in your own - see said override file for how to invoke.
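One way to read "add a hook" is a small wrapper that sshd calls as its single AuthorizedKeysCommand and that prints the keys from both sources. The sketch below assumes that reading. The custom command path `/usr/local/bin/my_authorized_keys` is a placeholder, the function name is invented, and the EIC invocation simply mirrors the arguments shown in the override file (user, then key fingerprint). Whether the configured AuthorizedKeysCommandUser is allowed to run both commands is not checked here.

```shell
#!/bin/sh
# Hypothetical wrapper; sshd_config would reference it roughly as:
#   AuthorizedKeysCommand /path/to/this/script %u %f
# Both paths can be overridden through the environment.
CUSTOM_KEYS_CMD="${CUSTOM_KEYS_CMD:-/usr/local/bin/my_authorized_keys}"  # placeholder path
EIC_KEYS_CMD="${EIC_KEYS_CMD:-/usr/share/ec2-instance-connect/eic_run_authorized_keys}"

combined_authorized_keys() {
    user="$1"
    fingerprint="$2"
    # Keys from the existing lookup. A failure here is ignored so that
    # the EIC lookup still runs.
    "$CUSTOM_KEYS_CMD" "$user" || true
    # Keys from EC2 Instance Connect, called with user and fingerprint.
    "$EIC_KEYS_CMD" "$user" "$fingerprint" || true
}

# In an installed script the last line would be:
#   combined_authorized_keys "$@"
```

sshd reads the authorized keys from the command's standard output, so concatenating the two outputs is enough for this sketch. Ignoring failures of either command is a design choice made here, not something the thread prescribes.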
As far as the upgrade itself, did you apply your custom AuthorizedKeysCommand before or after installing/upgrading EIC? Can you provide the apt-get output from when you both upgraded and downgraded?
Along the lines of your suggestion, there is already an explicit end-to-end test that should have captured this - the test specifically configures a custom AuthorizedKeysCommand in sshd_config, then installs EIC, and verifies that EIC did not apply the override. We have an explicit requirement that any proposed code change and package publication have full integration test output for all platforms attached as part of the approval process, so in theory any detection failure should have been caught prior to package release. If we can reproduce the circumstances where you triggered this gap in our detection I would like to refine this test to avoid a future regression.
Thanks Daniel for the context around the decision on how you're configuring sshd. It does appear that something failed in the post-installation scriptlet for us.
We had an AuthorizedKeysCommand and AuthorizedKeysCommandUser set in sshd_config prior to installing ec2-instance-connect, yet the ec2-instance-connect.conf file ended up in the systemd config directory for ssh.
user@ip-192-168-1-1:~$ cat /lib/systemd/system/ssh.service.d/ec2-instance-connect.conf
[Service]
ExecStart=
ExecStart=/usr/sbin/sshd -D -o "AuthorizedKeysCommand /usr/share/ec2-instance-connect/eic_run_authorized_keys %%u %%f" -o "AuthorizedKeysCommandUser ec2-instance-connect" $SSHD_OPTS
Here's the output from installing ec2-instance-connect with apt install:
Reading state information...
The following NEW packages will be installed:
ec2-instance-connect
0 upgraded, 1 newly installed, 0 to remove and 3 not upgraded.
Need to get 12.5 kB of archives.
After this operation, 56.3 kB of additional disk space will be used.
Get:1 http://us-east-1.ec2.archive.ubuntu.com/ubuntu xenial-updates/universe amd64 ec2-instance-connect all 1.1.12+dfsg1-0ubuntu3~16.04.0 [12.5 kB]
Fetched 12.5 kB in 0s (609 kB/s)
Selecting previously unselected package ec2-instance-connect.
(Reading database ... 99552 files and directories currently installed.)
Preparing to unpack .../ec2-instance-connect_1.1.12+dfsg1-0ubuntu3~16.04.0_all.deb ...
Created system user ec2-instance-connect
Unpacking ec2-instance-connect (1.1.12+dfsg1-0ubuntu3~16.04.0) ...
Setting up ec2-instance-connect (1.1.12+dfsg1-0ubuntu3~16.04.0) ...
ERROR: Not restarting ssh because /etc/ssh/sshd_config already sets
ERROR: AuthorizedKeysCommand*, which is also set by
ERROR: /lib/systemd/system/ssh.service.d/ec2-instance-connect.conf.
Please restart ssh manually if the configuration is correct.
I was not able to downgrade via apt directly. The old package was removed from the Ubuntu package repository, so I downloaded the old .deb from launchpad.net and installed it with dpkg. Here's that output:
Selecting previously unselected package ec2-instance-connect.
(Reading database ... 99552 files and directories currently installed.)
Preparing to unpack .../ec2-instance-connect_1.1.9-0ubuntu3~16.04.1_all.deb ...
Created system user ec2-instance-connect
Unpacking ec2-instance-connect (1.1.9-0ubuntu3~16.04.1) ...
Setting up ec2-instance-connect (1.1.9-0ubuntu3~16.04.1) ...
If I understand you correctly, you're suggesting that we just delete the ec2-instance-connect.conf file after installation and that will allow us to run the latest version. This would solve our immediate problem, but my concern would be that at any time a user could run apt upgrade, which pulls a new version of ec2-instance-connect that potentially locks them out of the system.
We could run a cron job to ensure that the file does not exist there, but ideally the package install works as intended and doesn't put the config file there if AuthorizedKeysCommand exists in sshd_config.
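For reference, the cron-style guard mentioned above might look something like this. It is only a sketch of the workaround, not a recommended fix. The function name is invented, both paths are parameters with defaults taken from this thread, and the sshd_config match is the same simplified pattern for uncommented directives (no Match or Include handling).

```shell
#!/bin/sh
# Hypothetical guard: remove the EIC systemd override when sshd_config
# already sets its own AuthorizedKeysCommand* directives.
# Usage: remove_eic_override_if_custom [SSHD_CONFIG] [OVERRIDE_FILE]
remove_eic_override_if_custom() {
    config="${1:-/etc/ssh/sshd_config}"
    override="${2:-/lib/systemd/system/ssh.service.d/ec2-instance-connect.conf}"
    if [ -f "$override" ] &&
       grep -Eiq '^[[:space:]]*AuthorizedKeysCommand(User)?[[:space:]]+[^[:space:]]' "$config"; then
        rm -f "$override"
        echo "removed $override"
    fi
}
```

After the file is removed, systemd still needs `systemctl daemon-reload` and an ssh restart before the change takes effect. Because the file ships inside the package, a later upgrade can put it back, which is the concern raised above.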
It looks like the ec2-instance-connect.conf file is being copied from the .deb package during install in v1.1.12, whereas it was not in v1.1.9.
I checked the modification time of the ec2-instance-connect.conf file on my system.
user@ip-192-168-1-1:~$ ls -lah /lib/systemd/system/ssh.service.d/
total 48K
drwxr-xr-x 2 root root 4.0K Mar 1 15:32 .
drwxr-xr-x 27 root root 36K Mar 1 15:32 ..
-rw-r--r-- 1 root root 203 Feb 10 20:26 ec2-instance-connect.conf
This matches the modification time of the file in the .deb package.
user@ip-192-168-1-1:~$ dpkg-deb -c ec2-instance-connect_1.1.12+dfsg1-0ubuntu3~16.04.0_all.deb
...
drwxr-xr-x root/root 0 2020-02-17 11:19 ./lib/systemd/
drwxr-xr-x root/root 0 2020-02-17 11:19 ./lib/systemd/system/
-rw-r--r-- root/root 244 2020-01-16 19:03 ./lib/systemd/system/ec2-instance-connect.service
drwxr-xr-x root/root 0 2020-02-17 11:19 ./lib/systemd/system/ssh.service.d/
-rw-r--r-- root/root 203 2020-02-10 20:26 ./lib/systemd/system/ssh.service.d/ec2-instance-connect.conf
...
This file is not in the v1.1.9 .deb package.
user@ip-192-168-1-1:~$ dpkg-deb -c ec2-instance-connect_1.1.9-0ubuntu3~16.04.1_all.deb
drwxr-xr-x root/root 0 2019-06-27 19:44 ./
drwxr-xr-x root/root 0 2019-06-27 19:44 ./usr/
drwxr-xr-x root/root 0 2019-06-27 19:44 ./usr/share/
drwxr-xr-x root/root 0 2019-06-27 19:44 ./usr/share/doc/
drwxr-xr-x root/root 0 2019-06-27 19:44 ./usr/share/doc/ec2-instance-connect/
-rw-r--r-- root/root 930 2019-06-27 19:44 ./usr/share/doc/ec2-instance-connect/changelog.Debian.gz
-rw-r--r-- root/root 968 2019-05-15 19:38 ./usr/share/doc/ec2-instance-connect/copyright
drwxr-xr-x root/root 0 2019-06-27 19:44 ./usr/share/ec2-instance-connect/
-rwxr-xr-x root/root 5169 2019-05-15 19:38 ./usr/share/ec2-instance-connect/eic_curl_authorized_keys
-rwxr-xr-x root/root 5625 2019-05-15 19:38 ./usr/share/ec2-instance-connect/eic_harvest_hostkeys
-rwxr-xr-x root/root 810 2019-05-15 19:38 ./usr/share/ec2-instance-connect/eic_run_authorized_keys
-rwxr-xr-x root/root 15001 2019-05-15 19:38 ./usr/share/ec2-instance-connect/eic_parse_authorized_keys
drwxr-xr-x root/root 0 2019-06-27 19:44 ./lib/
drwxr-xr-x root/root 0 2019-06-27 19:44 ./lib/systemd/
drwxr-xr-x root/root 0 2019-06-27 19:44 ./lib/systemd/system/
-rw-r--r-- root/root 246 2019-05-15 19:38 ./lib/systemd/system/ec2-instance-connect.service
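The same comparison can be done mechanically instead of by eye. The following is a rough sketch under a few assumptions: the function names are invented, and the path is taken to be the last whitespace-separated field of each `dpkg-deb -c` line, which would mishandle symlink entries and paths containing spaces.

```shell
#!/bin/sh
# Read `dpkg-deb -c` output on stdin and print the sorted path column.
# Assumption: the path is the last whitespace-separated field.
deb_paths() {
    awk '{ print $NF }' | sort
}

# Print paths that exist in NEW.deb but not in OLD.deb.
# Usage: new_only_paths OLD.deb NEW.deb
new_only_paths() {
    old_list=$(mktemp)
    new_list=$(mktemp)
    dpkg-deb -c "$1" | deb_paths > "$old_list"
    dpkg-deb -c "$2" | deb_paths > "$new_list"
    # comm -13 keeps only lines unique to the second (newer) list.
    comm -13 "$old_list" "$new_list"
    rm -f "$old_list" "$new_list"
}
```

Called with the two package files named above, `new_only_paths` would list every path that v1.1.12 ships and v1.1.9 does not.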
I am no expert in creating Debian packages, but I think that adding the systemd helper may have inadvertently caused this.
Is it possible the systemd helper is copying over every file in lib/systemd to systemd?
@LordAlfredo per the statement:
> RHEL+CentOS is still under discussion (see #2, we're still discussing implementation detail with Red Hat)
Is there a public BugZilla for this (I did a search of what seemed to be sensible terms but got null search-results)? Now that I'm authoring my customers' AMI-creation automation for RHEL 8 and CentOS 8, I'd love to be able to track the status of any issues the relevant OS vendors might have in their bug-trackers.
> The exact decision, as signed off by Amazon leadership and our partners, is that if during install EC2 Instance Connect does not detect that the instance already has a custom AuthorizedKeysCommand then it will attempt to modify the system sshd as part of the installation process.
This fails where an AMI already has the broken package installed and the change to AuthorizedKeysCommand happens after. But I guess this is just capturing people who are upgrading.
> Canonical disagreed with modifying another package's file and instead we agreed to implement a systemd override file
This is the wrong place to put the override. It is not immediately obvious that the sshd_config is not being used. Ubuntu already has /etc/ssh/sshd_config.d/ for configuring sshd overrides.
> Deleting this file or uninstalling the ec2-instance-connect package will remove the override.
What do we lose by doing this?
I agree with cregkly here. I just got burned by this in a similar fashion (where it's already installed in the AMI before modification) and it is unfathomable to me that Canonical thought it was a good idea to implement it this way, and even more so that AWS found it to be an agreeable solution.
I would like to add that removing the ec2-instance-connect package will fix this issue for now.
I have done some testing because I also wanted to know
> Deleting this file or uninstalling the ec2-instance-connect package will remove the override. What do we lose by doing this?
At least from my test
But if someone else from AWS would like to confirm, that would be good as well.
Sorry to resurrect this issue, but I think this is still causing an issue for my team and me. We are running Ubuntu 20.04 LTS from an AWS base image that already has ec2-instance-connect installed, and we provision on top of that. However, we have a custom AuthorizedKeysCommand that we add in our provisioning steps.
The issue we are facing happens when we uninstall ec2-instance-connect. We are using Packer, and uninstalling ec2-instance-connect intermittently causes a problem with restarting sshd, which makes some nightly builds fail. Is there no other, better way to deal with this issue?
Issue Summary
When we upgraded the ec2-instance-connect package from 1.1.9 to 1.1.12 on Ubuntu 16.04 and 18.04, ec2-instance-connect overrode our AuthorizedKeysCommand and AuthorizedKeysCommandUser.
For our use case we want to support two sets of AuthorizedKeysCommand
I believe it was this commit which corrected the systemd service file that introduced this issue for us.
Our config
Here is what our sshd_config looked like (some values replaced).
auth.log
When attempting our default login path (non-ec2i) we could see that sshd was attempting to use the ec2 instance connect commands to look up authorized keys.
Service details
Solution
We have downgraded and pinned version 1.1.9. This restored our ability to SSH via our directory-backed AuthorizedKeysCommand.
Downgrade service details
Suggestions
sshd through the service config
AuthorizedKeysCommand is respected.
AuthorizedKeysCommand on install. Instead print a message that tells users how to configure sshd_config for themselves.