aws / aws-ec2-instance-connect-config

This is the ssh daemon configuration and necessary EC2 instance scripting to enable EC2 Instance Connect. Also included are package manager configurations for packaging on various Linux distributions.
Apache License 2.0

ec2-instance-connect v1.1.12 broke our existing SSH Config #19

Open zscholl opened 4 years ago

zscholl commented 4 years ago

Issue Summary

When we upgraded the ec2-instance-connect package from 1.1.9 to 1.1.12 on Ubuntu 16.04 and 18.04, ec2-instance-connect overrode our AuthorizedKeysCommand and AuthorizedKeysCommandUser.

For our use case, we want to support two AuthorizedKeysCommand configurations:

  1. one for a directory-backed system and
  2. one for EC2 Instance Connect when certain users are logging in

I believe it was this commit which corrected the systemd service file that introduced this issue for us.

Our config

Here is what our sshd_config looked like (some values replaced).

AuthorizedKeysCommand /usr/bin/<some_other_key_command>
AuthorizedKeysCommandUser nobody
AllowGroups <our list of groups that are allowed with some_other_key_command>
AuthenticationMethods publickey

Match Group ec2i
        AllowGroups <ec2i>
        AuthorizedKeysCommand /usr/share/ec2-instance-connect/eic_run_authorized_keys %u %f
        AuthorizedKeysCommandUser ec2-instance-connect

auth.log

When attempting our default login path (non-ec2i) we could see that sshd was attempting to use the ec2 instance connect commands to look up authorized keys.

2020-03-02T16:34:40.896366+00:00 ip-192-168-1-1  sshd[1234]: Connection closed by authenticating user my_user 1.1.1.1 port 63090 [preauth]
2020-03-02T16:35:37.982098+00:00 ip-192-168-1-1  sshd[4567]: AuthorizedKeysCommand /usr/share/ec2-instance-connect/eic_run_authorized_keys my_user SHA256:Ahje+...

Service details

$ sudo service sshd status
● ssh.service - OpenBSD Secure Shell server
   Loaded: loaded (/lib/systemd/system/ssh.service; enabled; vendor preset: enabled)
  Drop-In: /lib/systemd/system/ssh.service.d
           └─ec2-instance-connect.conf
   Active: active (running) since Mon 2020-03-02 20:51:48 UTC; 1min 13s ago
 Main PID: 1744 (sshd)
    Tasks: 9
   Memory: 141.8M
      CPU: 2.828s
   CGroup: /system.slice/ssh.service
           ├─1744 /usr/sbin/sshd -D -o AuthorizedKeysCommand /usr/share/ec2-instance-connect/eic_run_authorized_keys %u %f -o AuthorizedKeysCommandUser ec2-instance-con
         ....

Solution

We have downgraded to version 1.1.9 and pinned it. This restored our ability to SSH via our directory-backed AuthorizedKeysCommand.

Downgrade service details

$ sudo service sshd status
● ssh.service - OpenBSD Secure Shell server
   Loaded: loaded (/lib/systemd/system/ssh.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2020-03-02 17:43:31 UTC; 3s ago
  Process: 8539 ExecStartPre=/usr/sbin/sshd -t (code=exited, status=0/SUCCESS)
 Main PID: 8540 (sshd)
    Tasks: 8 (limit: 4915)
   CGroup: /system.slice/ssh.service
           ├─8540 /usr/sbin/sshd -D
       ....

Suggestions

  1. Do not pass in explicit config settings to sshd through the service config
  2. Write a test to ensure that any existing AuthorizedKeysCommand is respected.
  3. Don't attempt to make any changes to the AuthorizedKeysCommand on install. Instead print a message that tells users how to configure sshd_config for themselves.

LordAlfredo commented 4 years ago

Installation/configuration behavior was an explicit item discussed with Amazon Linux, Canonical, and Red Hat. The exact decision, as signed off by Amazon leadership and our partners, is that if during install EC2 Instance Connect does not detect that the instance already has a custom AuthorizedKeysCommand then it will attempt to modify the system sshd as part of the installation process. The exact mechanism, however, was disagreed upon by each distro vendor.

The detection mechanism is pretty straightforward: during the post-install scriptlets, check whether AuthorizedKeysCommand and AuthorizedKeysCommandUser in /etc/ssh/sshd_config are either not present or left as the commented-out defaults (#AuthorizedKeysCommand none / #AuthorizedKeysCommandUser nobody).
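
The check described above can be sketched roughly as follows. This is only an illustration, not the package's actual scriptlet: the function name `custom_key_directives` is hypothetical, and the sketch assumes that any uncommented AuthorizedKeysCommand or AuthorizedKeysCommandUser line counts as custom, including one inside a Match block.

```python
import re

# Matches an active (uncommented) AuthorizedKeysCommand or
# AuthorizedKeysCommandUser line. sshd keywords are case-insensitive.
_DIRECTIVE = re.compile(
    r"^\s*(AuthorizedKeysCommand(?:User)?)[\s=]+(.+?)\s*$", re.IGNORECASE
)


def custom_key_directives(sshd_config_text):
    """Return {lowercased keyword: value} for active AuthorizedKeysCommand* lines.

    Hypothetical helper: an empty result means the directives are absent or
    only present as comments, i.e. no custom command was detected.
    """
    found = {}
    for line in sshd_config_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # commented-out defaults do not count
        match = _DIRECTIVE.match(line)
        if match:
            # Keep the first occurrence of each keyword.
            found.setdefault(match.group(1).lower(), match.group(2))
    return found
```

Under these assumptions, a config like the one posted at the top of this issue yields a non-empty result, so an installer following this logic would skip the override.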

Since you have an AuthorizedKeysCommand configured, the post-installation scriptlet should not have installed our override. Can you check if the file /lib/systemd/system/ssh.service.d/ec2-instance-connect.conf exists? If it does, then something in the package install/upgrade logic failed. Deleting this file or uninstalling the ec2-instance-connect package will remove the override. If you still would like EIC supported alongside your own AuthorizedKeysCommand, you will need to add a hook to our scriptlet somewhere in your own - see said override file for how to invoke.

As far as the upgrade itself, did you apply your custom AuthorizedKeysCommand before or after installing/upgrading EIC? Can you provide the apt-get output from when you both upgraded and downgraded?

Along the lines of your suggestion, there is already an explicit end-to-end test that should have captured this - the test specifically configures a custom AuthorizedKeysCommand in sshd_config, then installs EIC, and verifies that EIC did not apply the override. We have an explicit requirement that any proposed code change and package publication have full integration test output for all platforms attached as part of the approval process, so in theory any detection failure should have been caught prior to package release. If we can reproduce the circumstances where you triggered this gap in our detection I would like to refine this test to avoid a future regression.

zscholl commented 4 years ago

Thanks Daniel for the context around the decision on how you're configuring sshd. It does appear that something failed in the post-installation scriptlet for us.

We had an AuthorizedKeysCommand and AuthorizedKeysCommandUser set in sshd_config prior to installing ec2-instance-connect yet the ec2-instance-connect.conf file ended up in the systemd config directory for ssh.

user@ip-192-168-1-1:~$ cat /lib/systemd/system/ssh.service.d/ec2-instance-connect.conf 
[Service]
ExecStart=
ExecStart=/usr/sbin/sshd -D -o "AuthorizedKeysCommand /usr/share/ec2-instance-connect/eic_run_authorized_keys %%u %%f" -o "AuthorizedKeysCommandUser ec2-instance-connect" $SSHD_OPTS
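
For context on why this drop-in matters: per the sshd(8) manual, command-line options override values specified in the configuration file, so the `-o` arguments above win over the global settings in sshd_config. A minimal sketch for listing what a drop-in overrides is shown below. The function name `sshd_overrides_from_dropin` is hypothetical, and the sketch assumes shell-style quoting is close enough to systemd's for a line like this one.

```python
import shlex


def sshd_overrides_from_dropin(dropin_text):
    """Return {keyword: value} for each -o option on the last ExecStart= line.

    Hypothetical helper for inspecting a systemd drop-in such as
    ssh.service.d/ec2-instance-connect.conf.
    """
    overrides = {}
    for line in dropin_text.splitlines():
        line = line.strip()
        if not line.startswith("ExecStart="):
            continue
        # Each ExecStart= line (including the empty reset line) replaces
        # what was collected before, so only the last one is reported.
        overrides = {}
        argv = shlex.split(line[len("ExecStart="):])
        for flag, arg in zip(argv, argv[1:]):
            if flag == "-o":
                keyword, _, value = arg.partition(" ")
                # systemd unit files write a literal percent sign as %%.
                overrides[keyword] = value.replace("%%", "%")
    return overrides
```

Comparing the returned keywords with the directives set in sshd_config shows which of your own settings are being shadowed.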

Here's the output from installing ec2-instance-connect with apt install

Reading state information...
The following NEW packages will be installed:
  ec2-instance-connect
0 upgraded, 1 newly installed, 0 to remove and 3 not upgraded.
Need to get 12.5 kB of archives.
After this operation, 56.3 kB of additional disk space will be used.
Get:1 http://us-east-1.ec2.archive.ubuntu.com/ubuntu xenial-updates/universe amd64 ec2-instance-connect all 1.1.12+dfsg1-0ubuntu3~16.04.0 [12.5 kB]
Fetched 12.5 kB in 0s (609 kB/s)
Selecting previously unselected package ec2-instance-connect.
(Reading database ... 99552 files and directories currently installed.)
Preparing to unpack .../ec2-instance-connect_1.1.12+dfsg1-0ubuntu3~16.04.0_all.deb ...
Created system user ec2-instance-connect
Unpacking ec2-instance-connect (1.1.12+dfsg1-0ubuntu3~16.04.0) ...
Setting up ec2-instance-connect (1.1.12+dfsg1-0ubuntu3~16.04.0) ...
ERROR: Not restarting ssh because /etc/ssh/sshd_config already sets
ERROR: AuthorizedKeysCommand*, which is also set by
ERROR: /lib/systemd/system/ssh.service.d/ec2-instance-connect.conf.
Please restart ssh manually if the configuration is correct.

I was not able to downgrade via apt directly because the old package was removed from the Ubuntu package repository, so I downloaded the old .deb from launchpad.net and installed it with dpkg. Here's that output:

Selecting previously unselected package ec2-instance-connect.
(Reading database ... 99552 files and directories currently installed.)
Preparing to unpack .../ec2-instance-connect_1.1.9-0ubuntu3~16.04.1_all.deb ...
Created system user ec2-instance-connect
Unpacking ec2-instance-connect (1.1.9-0ubuntu3~16.04.1) ...
Setting up ec2-instance-connect (1.1.9-0ubuntu3~16.04.1) ...

If I understand you correctly, you're suggesting that we just delete the ec2-instance-connect.conf file after installation and that will allow us to run the latest version. This would solve our immediate problem, but my concern is that at any time a user could run apt upgrade and pull in a new version of ec2-instance-connect that potentially locks them out of the system.

We could run a cron job to ensure that the file does not exist there, but ideally the package install works as intended and doesn't put the config file there if AuthorizedKeysCommand exists in sshd_config.

zscholl commented 4 years ago

It looks like the ec2-instance-connect.conf is being copied from the .deb package during install in v1.1.12 where it was not in v1.1.9.

I checked the creation time of the ec2-instance-connect.conf file on my system.

user@ip-192-168-1-1:~$ ls -lah /lib/systemd/system/ssh.service.d/
total 48K
drwxr-xr-x  2 root root 4.0K Mar  1 15:32 .
drwxr-xr-x 27 root root  36K Mar  1 15:32 ..
-rw-r--r--  1 root root  203 Feb 10 20:26 ec2-instance-connect.conf

This matches the creation time of the file in the .deb package.

user@ip-192-168-1-1:~$ dpkg-deb -c ec2-instance-connect_1.1.12+dfsg1-0ubuntu3~16.04.0_all.deb 
...
drwxr-xr-x root/root         0 2020-02-17 11:19 ./lib/systemd/
drwxr-xr-x root/root         0 2020-02-17 11:19 ./lib/systemd/system/
-rw-r--r-- root/root       244 2020-01-16 19:03 ./lib/systemd/system/ec2-instance-connect.service
drwxr-xr-x root/root         0 2020-02-17 11:19 ./lib/systemd/system/ssh.service.d/
-rw-r--r-- root/root       203 2020-02-10 20:26 ./lib/systemd/system/ssh.service.d/ec2-instance-connect.conf
...

This file is not in the v1.1.9 .deb package.

user@ip-192-168-1-1:~$ dpkg-deb -c ec2-instance-connect_1.1.9-0ubuntu3~16.04.1_all.deb 
drwxr-xr-x root/root         0 2019-06-27 19:44 ./
drwxr-xr-x root/root         0 2019-06-27 19:44 ./usr/
drwxr-xr-x root/root         0 2019-06-27 19:44 ./usr/share/
drwxr-xr-x root/root         0 2019-06-27 19:44 ./usr/share/doc/
drwxr-xr-x root/root         0 2019-06-27 19:44 ./usr/share/doc/ec2-instance-connect/
-rw-r--r-- root/root       930 2019-06-27 19:44 ./usr/share/doc/ec2-instance-connect/changelog.Debian.gz
-rw-r--r-- root/root       968 2019-05-15 19:38 ./usr/share/doc/ec2-instance-connect/copyright
drwxr-xr-x root/root         0 2019-06-27 19:44 ./usr/share/ec2-instance-connect/
-rwxr-xr-x root/root      5169 2019-05-15 19:38 ./usr/share/ec2-instance-connect/eic_curl_authorized_keys
-rwxr-xr-x root/root      5625 2019-05-15 19:38 ./usr/share/ec2-instance-connect/eic_harvest_hostkeys
-rwxr-xr-x root/root       810 2019-05-15 19:38 ./usr/share/ec2-instance-connect/eic_run_authorized_keys
-rwxr-xr-x root/root     15001 2019-05-15 19:38 ./usr/share/ec2-instance-connect/eic_parse_authorized_keys
drwxr-xr-x root/root         0 2019-06-27 19:44 ./lib/
drwxr-xr-x root/root         0 2019-06-27 19:44 ./lib/systemd/
drwxr-xr-x root/root         0 2019-06-27 19:44 ./lib/systemd/system/
-rw-r--r-- root/root       246 2019-05-15 19:38 ./lib/systemd/system/ec2-instance-connect.service

I am no expert in creating debian packages, but I think that adding the systemd helper may have inadvertently caused this.

Is it possible the systemd helper is copying over every file in lib/systemd to systemd?
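
The comparison of the two package listings above can also be done mechanically. The following is a rough sketch with hypothetical helper names (`shipped_files`, `files_added`). It assumes the six-column layout seen in the `dpkg-deb -c` output above (mode, owner/group, size, date, time, path) and ignores directories and elided lines.

```python
def shipped_files(listing):
    """Extract non-directory paths from `dpkg-deb -c` style output."""
    paths = set()
    for line in listing.splitlines():
        fields = line.strip().split(None, 5)
        # Skip elided or malformed lines and directory entries.
        if len(fields) < 6 or fields[0].startswith("d"):
            continue
        paths.add(fields[5])
    return paths


def files_added(old_listing, new_listing):
    """Return paths present in the new listing but not in the old one."""
    return sorted(shipped_files(new_listing) - shipped_files(old_listing))
```

Fed the full listings of both .deb files, this reports files that only the newer package ships. On the excerpts quoted here the result is partial, since the 1.1.12 listing is truncated.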

ferricoxide commented 4 years ago

@LordAlfredo per the statement:

RHEL+CentOS is still under discussion (see #2, we're still discussing implementation detail with Red Hat)

Is there a public BugZilla for this (I did a search of what seemed to be sensible terms but got null search-results)? Now that I'm authoring my customers' AMI-creation automation for RHEL 8 and CentOS 8, I'd love to be able to track the status of any issues the relevant OS vendors might have in their bug-trackers.

cregkly commented 4 years ago

The exact decision, as signed off by Amazon leadership and our partners, is that if during install EC2 Instance Connect does not detect that the instance already has a custom AuthorizedKeysCommand then it will attempt to modify the system sshd as part of the installation process.

This fails where an AMI already has the broken package installed and the change to AuthorizedKeysCommand happens after. But I guess this is just capturing people who are upgrading.

Canonical disagreed with modifying another package's file and instead we agreed to implement a systemd override file

This is the wrong place to put the override. It is not immediately obvious that the sshd_config is not being used. Ubuntu already has /etc/ssh/sshd_config.d/ for configuring sshd overrides.

Deleting this file or uninstalling the ec2-instance-connect package will remove the override.

What do we lose by doing this?

mbainter commented 3 years ago

I agree with cregkly here. I just got burned by this in a similar fashion (where it's already installed in the AMI before modification) and it is unfathomable to me that Canonical thought it was a good idea to implement it this way, and even more so that AWS found it to be an agreeable solution.

mechastorm commented 3 years ago

I would like to add that removing the ec2-instance-connect package will fix this issue for now.

I have done some testing because I also wanted to know

Deleting this file or uninstalling the ec2-instance-connect package will remove the override. What do we lose by doing this?

At least from my test

But if someone else from AWS would like to confirm, that would be good as well.

kylet21 commented 1 year ago

Sorry to resurrect this issue, but I think this is still causing a problem for my team and me. We use Ubuntu 20.04 LTS as our AWS base image, which already has ec2-instance-connect installed, and provision from that. However, we have a custom AuthorizedKeysCommand that we add in our provisioning steps.

The issue we are facing happens when we uninstall ec2-instance-connect. We are using Packer, and uninstalling ec2-instance-connect intermittently causes problems restarting sshd, which makes some nightly builds fail. Is there no better way to deal with this issue?