rancher / fleet

Deploy workloads from Git to large fleets of Kubernetes clusters
https://fleet.rancher.io/
Apache License 2.0

No matching host key type found. Their offer: ssh-rsa #750

Closed: dtrouillet closed this issue 2 years ago

dtrouillet commented 2 years ago

Hello,

When we updated Rancher from 2.6.3 to 2.6.4 (which upgrades Fleet to 0.3.9), we ran into an issue with Fleet. Everything worked on 2.6.3, but since the update every GitRepo reports the following message:

git ls-remote ssh://git@bitbucket.mydomain.fr:8888/PP0/mygit.git refs/heads/master error: exit status 128, detail: Unable to negotiate with x.x.x.x port 8888: no matching host key type found. Their offer: ssh-rsa
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

Regards

Martin-Weiss commented 2 years ago

We have seen the same issue and, as a workaround, switched to HTTPS-based Git repo configs.

dtrouillet commented 2 years ago

Thanks for your feedback. I agree with you, this workaround works, but we want to use SSH 😕

gravufo commented 2 years ago

We have the same issue pointing to an Azure DevOps Git repo with an SSH key. This is pretty bad. Are there no tests to validate changes? This is a pretty basic use case.

gravufo commented 2 years ago

Even going to HTTPS using a PAT doesn't work. This is seriously critical.

dtrouillet commented 2 years ago

Hello, why has this issue been postponed to 2.6.7? It has been blocking all updates since 2.6.4...

aiyengar2 commented 2 years ago

Hmm, based on the original error message, it's possible this issue is because the ssh:// should be omitted.

Pulling in a GitRepository is executed by https://github.com/rancher/gitjob, which uses https://github.com/rancher/wrangler to execute the Git command in this operation: https://github.com/rancher/wrangler/blob/bf1502dba94db9cf5259e2894ba42dcbc63e5646/pkg/git/git.go#L71-L73

This, in turn, is just a command that gets run within the container using the prepackaged git: https://github.com/rancher/wrangler/blob/bf1502dba94db9cf5259e2894ba42dcbc63e5646/pkg/git/git.go#L321

As a result, the expected value for the hostname should be git@bitbucket.mydomain.fr:8888/PP0/mygit.git, not ssh://git@bitbucket.mydomain.fr:8888/PP0/mygit.git.

arvindiyengar: ~/Rancher/fleet/src/github.com/rancher/fleet
$ git ls-remote  git@github.com:aiyengar2/fleet-examples.git
aed1efaf4b918afae3551d85891468ace5ea031c    HEAD
2526d1719a516feb8e83c968276808c3e8e30632    refs/heads/dev
914488fcecc42dde315308298bef9ee9656e85b8    refs/heads/issue_36947
aed1efaf4b918afae3551d85891468ace5ea031c    refs/heads/master
56bca25f648a951c2f8fd6db4981e4a4f040ca4e    refs/tags/example
arvindiyengar: ~/Rancher/fleet/src/github.com/rancher/fleet
$ git ls-remote  ssh://git@github.com:aiyengar2/fleet-examples.git
ssh: Could not resolve hostname github.com:aiyengar2: nodename nor servname provided, or not known
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

On using the repository URL git@github.com:aiyengar2/fleet-examples.git and specifying Git Authentication -> Create a SSH Key Secret with a valid public and private key, such that the public key is recognized by your system, I am able to use SSH-based host keys successfully.

So from a backend perspective, this doesn't seem to be an issue (unless users can provide some more steps to reproduce?) or this should be opened up as an issue in github.com/rancher/dashboard in order to automatically sanitize the ssh:// or https:// out of links before adding them to the GitRepo CR.
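For reference, Git's documentation describes two SSH address forms: ssh://[user@]host[:port]/path and the scp-like [user@]host:path. In the scp-like form, whatever follows the colon is a path, so a port such as 8888 can only be expressed in the ssh:// form. Below is a rough sketch of that split. describe_remote is a hypothetical helper written for illustration (it is not part of Git, gitjob, or Fleet) and it ignores IPv6 literals and other edge cases.

```shell
# describe_remote: hypothetical helper that shows how the two SSH address
# forms documented by Git split into host, port, and path.
describe_remote() {
  url=$1
  case $url in
    ssh://*)
      rest=${url#ssh://}         # drop the scheme
      hostport=${rest%%/*}       # user@host[:port]
      hostport=${hostport#*@}    # host[:port]
      case $hostport in
        *:*)
          host=${hostport%%:*}
          port=${hostport##*:}
          case $port in
            ''|*[!0-9]*) echo "url-form host=$host invalid-port=$port" ;;
            *) echo "url-form host=$host port=$port" ;;
          esac
          ;;
        *) echo "url-form host=$hostport port=default" ;;
      esac
      ;;
    *://*) echo "other-scheme" ;;
    *:*)
      rest=${url#*@}             # host:path
      echo "scp-form host=${rest%%:*} path=${rest#*:}"
      ;;
    *) echo "unrecognized" ;;
  esac
}

describe_remote 'ssh://git@bitbucket.mydomain.fr:8888/PP0/mygit.git'
describe_remote 'git@bitbucket.mydomain.fr:8888/PP0/mygit.git'
describe_remote 'ssh://git@github.com:aiyengar2/fleet-examples.git'
```

Under these assumptions, the first call reports port 8888, the second treats 8888/PP0/mygit.git as the path, and the third flags aiyengar2 as a non-numeric port, which is consistent with the hostname resolution failure shown in the transcript above.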

aiyengar2 commented 2 years ago

@gravufo is it possible that you either:

  1. have not added the SSH key to the GitRepo that provides it with a public/private key pair to perform SSH through
  2. have not added the public key from the key pair to your Git hosting solution to allow it to perform the SSH request

gravufo commented 2 years ago

I don't believe any of your suggestions apply. It works with the previous version and doesn't work with the update with the exact same settings and setup. There is clearly an issue.

dtrouillet commented 2 years ago

Note that the problem is related to the algorithm (ssh-rsa) used by the server for its SSH host key, so it may not be reproducible with GitHub. I also confirm that the configuration has not changed at all since Rancher 2.6.3.
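One way to narrow this down, sketched here as an assumption rather than a verified procedure, is to pull the offered host key types out of the error text and compare them with what the client accepts by default. offered_host_key_types is a hypothetical helper. The commented commands use standard OpenSSH tools but were not run against the server from this issue.

```shell
# offered_host_key_types: hypothetical helper that reads ssh/git error text
# on stdin and prints the algorithm list that follows "Their offer:".
offered_host_key_types() {
  sed -n 's/.*no matching host key type found\. Their offer: *//p'
}

echo 'Unable to negotiate with x.x.x.x port 8888: no matching host key type found. Their offer: ssh-rsa' \
  | offered_host_key_types

# Not run here; host and port are taken from the error message above.
# Host keys the server presents:
#   ssh-keyscan -p 8888 bitbucket.mydomain.fr
# Host key algorithms the local client would accept for that host:
#   ssh -G bitbucket.mydomain.fr | grep -i '^hostkeyalgorithms'
```

If the only offered type is ssh-rsa and it is missing from the client's accepted list, the negotiation fails before any user key is tried.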

izaac commented 2 years ago

Found another issue trying to reproduce this one.

aiyengar2 commented 2 years ago

@gravufo I don't disagree with your reasoning that Fleet shouldn't be broken on an upgrade (assuming Fleet is the root cause and not some other environmental issue on your end...).

But there are no steps on this ticket that allow me to reproduce the issue that you are experiencing to be able to ascertain what the fix should be. Can you point me in a direction with some steps on how I can reproduce it? Based on the current ticket description, I do not see the issue that you are describing on the latest Fleet release.

aiyengar2 commented 2 years ago

@dtrouillet based on the issue described in the error, I agree with you that the root cause here seems to be related to the fact that the (Git) server that Fleet is talking to is not accepting the cipher / algorithm that Fleet is reaching out to it with (ssh-rsa). This is why the error states Unable to negotiate... no matching host key type found... Their offer: ssh-rsa.

However, wouldn't the choice of algorithm be based on the contents of the SSH public and private keys provided to the Fleet GitRepo? If your Git Server does not accept ssh-rsa anymore, I would assume the fix here should be to swap the SSH keys provided to Fleet as part of configuring the GitRepo to an acceptable, more secure cipher (e.g. ecdsa-sk, ed25519-sk, etc.) whose public key is uploaded to your Git server.

aiyengar2 commented 2 years ago

I've been looking into this more and I still can't find any change in Fleet between versions 0.3.8 and 0.3.9 that would cause this behavior.

https://github.com/rancher/fleet/compare/v0.3.8...v0.3.9

The only noticeable / relevant changes here appear to be the changes to the underlying libraries executing the Git Repository operation, namely:

gitjob:
  repository: rancher/gitjob
  tag: v0.1.26

tekton:
  repository: rancher/tekton-utils
  tag: v0.1.5 

The github.com/hashicorp/go-getter v1.5.11 bump seems very unlikely to be the cause, and the same goes for the alpine changes. The changes to rancher/gitjob and to tekton-utils also only move the base image to a higher version of alpine.

I also did try using a BitBucket based private repository (as described in the ticket), but it appears like the OpenSSH 8.8 bump (which is where ssh-rsa seems to have been dropped, re: https://www.linuxadictos.com/en/openssh-8-8-arrives-saying-goodbye-to-ssh-rsa-support-bug-fixes-and-more.html) may not be affecting public BitBucket instances yet, only maybe BitBucket Cloud (enterprise).

Since there's no clear direction of where to proceed with this issue unless we have steps to reproduce the issue and identify what the root cause is, I will move this issue to Need Info and remove the milestone from it (cc: @MKlimuszka).

If we can get steps to reproduce, I'd be happy to take a look once again.

manno commented 2 years ago

It's likely that the server didn't support a more advanced signature algorithm, just like the Azure issue in https://github.com/rancher/fleet/issues/773#issuecomment-1128324790

If I understand correctly, all OpenSSH clients since 7.2 should be able to upgrade the algorithm.

While there are different ssh versions in play, they should all be new enough:

We have ssh clients in several images: rancher/gitjob, rancher/tekton-utils

@dtrouillet like @aiyengar2 said, it's hard to understand why Rancher 2.6.3 would have worked with such a server. Can you provide more information on that server: the OpenSSH version and any special configuration for the algorithms? Maybe this Bitbucket issue is relevant: https://jira.atlassian.com/browse/BSERV-10175

I also noticed the imagescan feature uses the go-git module and they have a similar issue https://github.com/go-git/go-git/issues/516

dtrouillet commented 2 years ago

It seems the new Fleet version 0.3.10 fixes this issue.

harryssuperman commented 2 years ago

Hi Folks,

@dtrouillet please check the ssh version of your Bitbucket Repository if you can.

I had a similar problem after a Git update (the latest version on my client PC was 2.37.2), which comes with OpenSSH > 8.8. In the release notes at https://www.openssh.com/txt/release-8.8 I read the "Potentially-incompatible changes" section, and we got it working by adding the following properties to the config file in the .ssh folder:

Host XXX
    HostkeyAlgorithms +ssh-rsa
    PubkeyAuthentication yes
    PubkeyAcceptedKeyTypes=+ssh-rsa
    IdentityFile ~/.ssh/id_rsa_old

Keep in mind that this is a workaround. Keys and OpenSSH should be updated on both sides of the SSH connection.

Please read this article about a similar issue, especially the benchmark at the end:
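For a one-off local reproduction, the same options can be passed on the command line instead of editing the config file. This is a sketch under assumptions: legacy_rsa_ssh_command is a hypothetical helper, GIT_SSH_COMMAND is Git's standard environment variable for overriding the ssh invocation, and nothing here changes what the Fleet or gitjob containers do.

```shell
# legacy_rsa_ssh_command: hypothetical helper that prints an ssh command line
# re-enabling ssh-rsa for host keys and for public key authentication.
# OpenSSH 8.5 renamed PubkeyAcceptedKeyTypes to PubkeyAcceptedAlgorithms
# and kept the old name as an alias, so the older spelling is used here.
legacy_rsa_ssh_command() {
  echo "ssh -o HostKeyAlgorithms=+ssh-rsa -o PubkeyAcceptedKeyTypes=+ssh-rsa"
}

legacy_rsa_ssh_command

# Usage, not run here (repository URL taken from the original report):
#   GIT_SSH_COMMAND="$(legacy_rsa_ssh_command)" \
#     git ls-remote ssh://git@bitbucket.mydomain.fr:8888/PP0/mygit.git refs/heads/master
```

As with the config file, this only re-enables a deprecated algorithm on the client side, so it is useful for confirming the diagnosis but not as a long-term fix.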

https://ikarus.sg/rsa-is-not-dead/

thardeck commented 2 years ago

"It seems the new Fleet version 0.3.10 fixes this issue."

Thanks for the update @dtrouillet. Can you confirm that the issue is fixed for you? If so, I think we can close this issue.

dtrouillet commented 2 years ago

Hi, I confirm that! You can close this issue.

Regards