sshd checks sometimes fail for reasons other than the rule

bernstei commented 1 year ago

On a cleanly installed macOS 12 machine tests that use /usr/sbin/sshd -T fail, and the reason appears to be that sshd isn't properly configured or something. Rather than giving any configuration information that can be checked for the correct values, that command returns

> sudo /usr/sbin/sshd -T
sshd: no hostkeys available -- exiting.
>

The machine is stock, not supposed to be (and is in fact not) running sshd, so nothing related to it has been configured.

Here's one example of such a rule, but 7 are failing for (I believe) the same reason. https://github.com/usnistgov/macos_security/blob/83f1c21b685d6f2daa4743169b617af5988dcd9a/rules/os/os_sshd_client_alive_count_max_configure.yaml#L8

robertgendler commented 1 year ago

Unfortunately if remote login is never enabled, then the sshd -T will fail with this error. It has to generate the hotkeys once.

The sshd -T is our best method to check the configuration

georgalis commented 1 year ago

Pardon me for jumping in without fully understanding the context of this issue. @bernstei are you asking or telling? It would seem to me /usr/sbin/sshd -T is operating properly. Presumably you are reporting a false negative test, because the check phase is missing an "If SSHD is enabled" prerequisite. Perhaps "ps ax | grep 'sshd$" is a valid pre-qualifying test?

robertgendler commented 1 year ago

@georgalis not bad of an idea to add some kind of check like that.

georgalis commented 1 year ago

@robertgendler if you can wait half dozen hours, I'll try my hand in this repo, and craft a PR this evening...

georgalis commented 1 year ago

...

Here's one example of such a rule, but 7 are failing for (I believe) the same reason.

@bernstei if you forward me the 7, I will include them in my patch...

robertgendler commented 1 year ago

@georgalis the check to do is not fail potentially if the hostkey isn't found. Because checking for sshd running then only checks if sshd is running.

georgalis commented 1 year ago

@georgalis the check to do is not fail potentially if the hostkey isn't found. Because checking for sshd running then only checks if sshd is running.

@robertgendler okay, good. that is a lot simpler than the rabbit hole of FIPS.140-3 not defining "SSHD" ie is it /usr/bin/sshd, the port, the protocol, something else, etc.

bernstei commented 1 year ago

I'll note the 7 next time I'm in the office. Note that these are not necessarily that those are all of the ones that could fail this way - I only know about the 7 that are included in the STIG baseline, which is what I was applying.

bernstei commented 1 year ago

I see this issue for these 7 items

os_sshd_client_alive_count_max_configure
os_sshd_client_alive_interval_configure
os_sshd_fips_140_ciphers
os_ssh_fips_140_macs
os_sshd_key_exchange_algorithm_configure
os_sshd_login_grace_time_configute
os_ssh_permit_root_login_configure

georgalis commented 1 year ago

@bernstei thanks, patch #248 should cover ossshd rules, @robertgendler I'll craft a separate PR for the osssh rules, this evening.

georgalis commented 1 year ago

Hi @bernstei I do not see os_ssh_fips_140_macs or os_ssh_permit_root_login_configure in the repo, where are they from? Did not check the other names but I presume they are corrected in #248 because I looked at rules/os/os_sshd_* in my revision?

bernstei commented 1 year ago

Good question - let me check.

[edited]

Those are typos, because it wasn't easy to cut and paste. I dropped off the "d" from sshd, i.e. they are actually os_sshd_fips_140_macs and os_sshd_permit_root_login_configure. Sorry for the confusion.

bernstei commented 1 year ago

Also, FWIW, I started "remote login" from the sharing control panel and then stopped it, and sshd -T still says "No hostkeys available", so it having been configured in the past isn't sufficient apparently. In fact, even running it while sshd is running gives the same error message - I don't know what's different about our setup, but it's not a robust way of getting the current sshd configuration.

georgalis commented 1 year ago

@bernstei that is not consistent with my experience with sshd and/or mac. I could speculate causes, but that would be something of a random walk. I'm certain some aspect of your system has deviated from a vendor baseline. Have you actually connected by ssh after enabling "remote login" from the sharing control panel? I do not know what triggers host key generation on mac, but the process has been automatic in my experience, they are necessary for the protocol, the OS generally runs key generation commands exactly once, and it would be highly unusual for them to ever be removed. The checks do indeed seem very effective at precisely determining the runtime configuration of /usr/sbin/sshd so, there must be some other factors involved?

robertgendler commented 1 year ago

Since opening this issue and pull request we had 2 thoughts on how to handle the check

if sshd -T 2> /dev/null; then
....check...
else
...could give a pass...
fi

in the compliance script, we do a check to see if there's any keys in /etc/ssh - if not, generate them using ssh-keygen -q -N "" -t rsa -b 4096 -f /etc/ssh/ssh_host_rsa_key - track that we generated keys, then delete the keys post script run. This would allow sshd -T to run. It would not enable ssh access.

Thoughts on the 2 methods @georgalis and @bernstei ?

bernstei commented 1 year ago

@bernstei that is not consistent with my experience with sshd and/or mac. I could speculate causes, but that would be something of a random walk. I'm certain some aspect of your system has deviated from a vendor baseline. Have you actually connected by ssh after enabling "remote login" from the sharing control panel? I do not know what triggers host key generation on mac, but the process has been automatic in my experience, they are necessary for the protocol, the OS generally runs key generation commands exactly once, and it would be highly unusual for them to ever be removed. The checks do indeed seem very effective at precisely determining the runtime configuration of /usr/sbin/sshd so, there must be some other factors involved?

I have done essentially no change from Apple's OS as freshly reinstalled except running the remediation script (and installing the corresponding mobileconfigs) for the Monterey STIG (basically only some additional user software installed, and a firmware password I guess). I didn't get as far as a successful ssh login, but I did initiate a (local) ssh far enough to get the login prompt and see ssh in the output of ps, and then tried to run sshd -T.

bernstei commented 1 year ago

in the compliance script, we do a check to see if there's any keys in /etc/ssh - if not, generate them using ssh-keygen -q -N "" -t rsa -b 4096 -f /etc/ssh/ssh_host_rsa_key - track that we generated keys, then delete the keys post script run. This would allow sshd -T to run. It would not enable ssh access.

Thoughts on the 2 methods @georgalis and @bernstei ?

I think I prefer 2. That way if someone does enable sshd later, the necessary configuration options are still there in the relevant files. Obviously it doesn't remove the need to re-run the check script after changing the configuration to enable sshd, but at least it reduces the chances of accidentally enabling a non-secured sshd config.

georgalis commented 1 year ago

@robertgendler regarding (1) this would seem a stylistic variation of my PR #248 checks, is there a style guide? The PR style embraces the original syntax, with concise addition of logical operators and braces to fortify the logic, as I would prefer. Regarding (2), creation of host keys contradicts the "If SSHD is enabled" aspect of the rule. I would not object to generating host keys for the purpose of checking sshd config "if sshd were enabled," although I would bias toward the vendor key generation method (whatever that happens to be---I've enabled sshd on mac but I've never manually created mac keys), least confounding vendor managed aspects. More critically, keys are used to uniquely, crypto-graphically, identify the host. If they are deleted, doesn't that bring into question the legitimacy of their purpose for the check, and the keys themselves if there is a reason to delete them?

Modifying my PR to generate keys with invoke systemsetup -setremotelogin on and subsequently turn it off after the checks might be effective, but potentially starts a non-complaint service. There is a dilemma of vendor tools vs fips compliance. For a fips 140 check/fix that fits into this framework, I would create a new rule rules/os/os_sshd_hostkeys to check/fix valid host keys, create complaint keys with ssh-keygen, if needed, and leave them in place. Then let the other sshd rules depend on the hostkeys rule. Although I do not immediately see how rule dependency is implemented.

That would also allow users without sudo to validate sshd compliance too---If that is a requirement, there are more regressions to consider. For example, in the absence of hostkeys a regex rewrite of config files for compliance, would touch less for non-sshd configured hosts, and improve the likelihood of compliance, if sshd were later enabled. Brute force config better captures the spirit of the rule, but sshd -T is a better test, I digress.

My vote is for a hostkeys generation (via ssh-keygen) check/fix rule that the other sshd configs depend on.

georgalis commented 1 year ago

@robertgendler BTW the system I've used for many years to bring a vendor OS config to my standard is here: https://github.com/georgalis/pub/blob/master/boot/nbsd/hostroot it makes pki (typically ed25519) a requirement for ssh, and moves the whole key management aspect out of user control and into a root administration task, with a new AuthorizedKeysFile path. It's my brute force fix sshd config method.

georgalis commented 1 year ago

@robertgendler considering an alternate method for the sshd check/fix rules, a new approach is if sshd -T fails the check, use the include_dir and write a 00-ossshd* file for each fix (less logic). A side effect of this approach is if there are no host keys, just write the rules into their respective include files, rely on the first match config parsing, and no need to generate host keys.

Several questions came up. 1) in os_sshd_client_alive_count_max_configure.yaml, what is "${include_dir}01-mscp-sshd.conf" and why rename to ${include_dir}20-${file}? ---I am not following the logic of the loop and file renaming. 2) in that file, the discussion doesn't exactly align to the config option, the config option is confusing, but the parameter does not refer to seconds it refers to retries. Where is the upstream for that discussion, can we clean this up? Where is $ODV set and why is it not simply hardcoded to zero in the check? (odv: below says hint "Number of seconds." recommended: 0) 3) other rule discussions refer to FIPS 140-2 when FIPS 140 would probably be better, other aspects of the standard text could benefit from normalization across the rules?

georgalis commented 1 year ago

As POC for my last comment, here is how I would implement respective sections of os_sshd_permit_root_login_configure.yaml

id: os_sshd_permit_root_login_configure
check: |
  /usr/sbin/sshd -T | /usr/bin/awk '/permitrootlogin/{print $2}'
result:
  string: "no"
fix: |
  [source,bash]
  ----
  include_dir="$(/usr/bin/awk '/^Include/ {print $2}' /etc/ssh/sshd_config | /usr/bin/tr -d '*')"

  test    "$include_dir" || include_dir="/etc/ssh/sshd_config.d"
  test -d "$include_dir" || mkdir -p "$include_dir"

  cat >"$include_dir/os_sshd_permit_root_login_configure" <<eof
# Disable Root Login for SSH
# https://github.com/usnistgov/macos_security/blob/ventura/rules/os/os_sshd_permit_root_login_configure.yaml
permitrootlogin no
eof

  grep "^$include_dir" /etc/ssh/sshd_config >/devnull || {
    echo "Include /etc/ssh/sshd_config.d/*" >/etc/ssh/sshd_config~ \
    && cat /etc/ssh/sshd_config            >>/etc/ssh/sshd_config~ \
    && mv  /etc/ssh/sshd_config~             /etc/ssh/sshd_config ;}
  ----

This is to embrace a one-include-file-per-rule approach. In the spirit of, "if it's not in a vendor state, or prior fixed state, make an effort to remediate anyway," I've set include_dir and created one if it doesn't exist. That may not be the best approach because who knows what another state might be and how would it be vetted?

Regarding brace and logic operators vs if then else statements, my motive is more comprehensible functions with complicated logic. When the logic flow doesn't fit on a screen, I've found this notation is far easier to debug, once familiarity is gained. Sometimes, I use declare -f function_with_brace_logic to hunt in shell standard formatting. Unfortunately, shell does not have an if then elseif statement, achieving that pattern led me to brace/operators.

It may be advantageous to alter the sshd_config file directly so the rule check/fix can be used directly on Linux or another OS that doesn't support the MacOS include patch? I'm ambivalent about that (this is easier, and awking blocks of code into the config will be hard to read), but has the question of one include-file-per-rule been discussed? What about the reasoning of permitrootlogin no what is the best forum to discuss that? The only explanation I can find is an individual identity must authenticate, before group privilege tasks, which is precisely the foundation by which I would use permitrootlogin prohibit-password and configure appropriate identity keys for root access---vetting sudo, pam, and ldap data is more problematic than installing the individual keys of users authorized for root access. The PKI identity fingerprint is in the log. I don't connect to macs via ssh so meh, but I do believe this is important, after all, how many orgs have error free LDAP data aligned with their HR data and security policy? If the point is control and auditability, pam, sudo, and ldap configs have a much greater threat surface area than limiting (root) ssh connection to pki. If the rule is no identity ambiguity therefore no root logins, there seems to have been implementation assumptions that don't apply to an alternate threat model.

robertgendler commented 1 year ago

@georgalis and @bernstei check out dev_ventura_issue245

georgalis commented 1 year ago

@robertgendler and @bernstei I have just finished some busy time. As next steps, I was planning to comment on drafts for https://www.nist.gov/identity-access-management and possibly https://www.nist.gov/cyberframework/updating-nist-cybersecurity-framework-journey-csf-20 As I review those drafts, I'll also see about a PR to factor the needful in dev_ventura_issue245 to address #245, without a dev ventura I'll be depending on you for testing/validation of the PR.

georgalis commented 1 year ago

@robertgendler I am not following the workflow to contrib to this project. Normally, for a project/repo I do not have write perms, I would fork to a repo I own, make changes, and create a PR back to the main branch of the original. Here we have branches for each major OS version. Okay, but what is the main branch for? After my original PR to main, per request, I deleted it and refactored the changes to the Ventura branch (and made additional comments). Two days ago the comment above indicates a new branch dev_ventura_issue245 so I proceeded to sync that branch to my fork, with the intention of coding resolution or improvement of #245 issues, to that branch in my fork---which I would then create a new PR from. Apparently the sync operation squashed changes I was not privy to, performed some confusing merges, and revised my PR... all very confusing.

I though this would be an easy contrib, but I clearly need a better understanding on how the 3rd party contrib workflow, works. At this point my effort would be easier to reproduce in new forks and PR, than unwinding what happened. I've only not deleted the PR (yet) as an artifact of the problem.

@robertgendler did you intend me to review dev_ventura_issue245 as if it was a PR (to what branch), or was that for my PR to go into? I did not review the Contributing section prior to responding to the #245, but in any event it may be helpful to revise the steps in https://github.com/usnistgov/macos_security/wiki/Contributing#contributing-code to reflect the branching, release, and contrib workflows here?

robertgendler commented 1 year ago

@georgalis I understand the confusion on how this Git project works because it isn't the usual traditional workflow. So we have the OS branches and main mirrors the currently shipping OS from Apple. However, we only update main when we do a full release (2 or 3 times a year). This is why we tell people work off of the OS branches as it'll be the most up to date since we pull dev branches into them somewhat regularly. If you do a PR, do it off of an OS branch.

Your PR was 1 way to handle the sshd issue, after talking it over with the other folks, we decided to go in this direction of generating the ssh keys required, then destroying them, if they don't exist. So I just wanted you all to test it since this has been good discussion

Not being the git expert I have no idea how to have properly sync'd things so it wouldn't break things you potentially already did.

bernstei commented 1 year ago

@georgalis and @bernstei check out dev_ventura_issue245

Finally trying it by merging it into the dev_ventura_stig branch. I'll let you know what happens,

bernstei commented 1 year ago

I just ran the compliance script, and it immediately printed

sshd: no hostkeys available -- exiting.

and it's been hanging for quite a few minutes (at least 5-10 so far). How long is it expected to take? Is this a sign of something going wrong?

robertgendler commented 1 year ago

@bernstei if things merged properly....after you build the compliance script, check line 28-34 (and there's some at the end to clean up). Basically these are the lines that get added

ssh_key_check=0
if /usr/sbin/sshd -T &2> /dev/null; then
    ssh_key_check=0
else
    /usr/bin/ssh-keygen -q -N "" -t rsa -b 4096 -f /etc/ssh/ssh_host_rsa_key
    ssh_key_check=1
fi

I think we have branchitis going on and too many. So it may be time to merge a few together to make all this a smidge easier.

bernstei commented 1 year ago

The lines are present in the compliance script. Let me go in and debug where exactly it's hanging, and see if I can figure out why.

robertgendler commented 1 year ago

If you hold down the SHIFT key when you hit enter to execute the --check. It run in debug mode and show lots of output.

bernstei commented 1 year ago

Here is the debug output

-----DEBUG-----

###################  COMMANDS START BELOW THIS LINE  ###################

## Must be run as root
if [[ $EUID -ne 0 ]]; then
    echo "This script must be run as root"
    exit 1
fi
+./stig_OS_13.6393_compliance.sh:23> [[ 0 -ne 0 ]]

ssh_key_check=0
+./stig_OS_13.6393_compliance.sh:28> ssh_key_check=0 
if /usr/sbin/sshd -T &2> /dev/null; then
    ssh_key_check=0
else
    /usr/bin/ssh-keygen -q -N "" -t rsa -b 4096 -f /etc/ssh/ssh_host_rsa_key
    ssh_key_check=1
fi
+./stig_OS_13.6393_compliance.sh:29> /usr/sbin/sshd -T
+./stig_OS_13.6393_compliance.sh:29> cat
sshd: no hostkeys available -- exiting.
^C

When I add old-fashioned "echo" statements between every line in that section the last one that prints it the one before the if block. No echos inside the if or else print, nor does one after the fi. When I run the sshd -T command manually it does not hang, but it also looks like it's interpreting the & as running in the background, rather than the usual shell file number redirect. Is there a chance that's just the wrong syntax, and it should be 2&> /dev/null or something like that?

robertgendler commented 1 year ago

That's weird...try replacing that if statement with if sshd -T &>/dev/null; then

Lets see if that works.

georgalis commented 1 year ago

try like this... if sshd -T >/dev/null 2>&1; then echo t ; else echo n ; fi

bernstei commented 1 year ago

Replacing the &2> with 2&> definitely works. Replacing it with your suggestion, &>, also works.

robertgendler commented 1 year ago

Fixed that boo boo.

bernstei commented 1 year ago

The actual checks work now, too, so as far as I'm concerned you can close this issue.

robertgendler commented 1 year ago

All merged into the OS branches.

bernstei commented 1 year ago

I assume that for ventura this is only merged into ventura. What's the best way to deal with that if we want to use Ventura with the STIG, which is in dev_ventura_stig? Merge the ventura into dev_ventura_stig?

robertgendler commented 1 year ago

@bernstei just merged it up to dev_ventura_stig

usnistgov / macos_security

sshd checks sometimes fail for reasons other than the rule #245