canonical / cloud-init

Official upstream for the cloud-init: cloud instance initialization
https://cloud-init.io/

enlisting of nodes: seed_random fails due to self signed certificate #2526

Closed ubuntu-server-builder closed 1 year ago

ubuntu-server-builder commented 1 year ago

This bug was originally filed in Launchpad as LP: #1424549

Launchpad details
affected_projects = ['maas']
assignee = None
assignee_name = None
date_closed = 2019-02-19T16:09:59.135828+00:00
date_created = 2015-02-23T08:38:22.765951+00:00
date_fix_committed = None
date_fix_released = None
id = 1424549
importance = undecided
is_complete = True
lp_url = https://bugs.launchpad.net/cloud-init/+bug/1424549
milestone = None
owner = mpontillo
owner_name = Mike Pontillo
private = False
status = invalid
submitter = martin-nowack
submitter_name = Martin Nowack
tags = []
duplicates = []

Launchpad user Martin Nowack(martin-nowack) wrote on 2015-02-23T08:38:22.765951+00:00

Using MAAS 1.7.1 on trusty, the following error message appears in the MAAS-provided ephemeral image when the pollinate step is executed:

curl: SSL certificate problem: self signed certificate in certificate chain.

As a result, the random number generator is not initialized correctly.

ubuntu-server-builder commented 1 year ago

Launchpad user Blake Rouse(blake-rouse) wrote on 2015-02-26T14:06:30.793981+00:00

Does your node have full access to the internet?

Can you provide the dmesg output for this error?

ubuntu-server-builder commented 1 year ago

Launchpad user Martin Nowack(martin-nowack) wrote on 2015-02-26T14:36:59.751776+00:00

Yes, the node has full access to the internet:

By explicitly executing: sudo pollinate

I get:

    Feb 26 14:35:23 stream1 pollinate[26489]: client sent challenge to [https://entropy.ubuntu.com/]
    Feb 26 14:35:24 stream1 pollinate[26513]: ERROR: Network communication failed [60]\n14:35:24.298494 Hostname was NOT found in DNS cache
    % Total % Received % Xferd Average Speed Time Time Time Current
    Dload Upload Total Spent Left Speed
    0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
    14:35:24.299301 Trying 91.189.94.50...
    14:35:24.324169 Connected to entropy.ubuntu.com (91.189.94.50) port 443 (#0)
    14:35:24.325201 successfully set certificate verify locations:
    14:35:24.325254 CAfile: /etc/pollinate/entropy.ubuntu.com.pem
    CApath: /dev/null
    14:35:24.325410 SSLv3, TLS handshake, Client hello (1):
    14:35:24.325460 } [data not shown]
    14:35:24.350528 SSLv3, TLS handshake, Server hello (2):
    14:35:24.350592 { [data not shown]
    14:35:24.363801 SSLv3, TLS handshake, CERT (11):
    14:35:24.363852 { [data not shown]
    14:35:24.364434 SSLv3, TLS alert, Server hello (2):
    14:35:24.364486 } [data not shown]
    14:35:24.364643 SSL certificate problem: self signed certificate in certificate chain
    0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
    14:35:24.364884 * Closing connection 0
    curl: (60) SSL certificate problem: self signed certificate in certificate chain
    More details here: http://curl.haxx.se/docs/sslcerts.html

curl performs SSL certificate verification by default, using a "bundle" of Certificate Authority (CA) public keys (CA certs). If the default bundle file isn't adequate, you can specify an alternate file using the --cacert option. If this HTTPS server uses a certificate signed by a CA represented in the bundle, the certificate verification probably failed due to a problem with the certificate (it might be expired, or the name might not match the domain name in the URL). If you'd like to turn off curl's verification of the certificate, use the -k (or --insecure) option.

ubuntu-server-builder commented 1 year ago

Launchpad user Martin Nowack(martin-nowack) wrote on 2015-02-26T14:39:10.878118+00:00

One side remark: this was executed from the deployed image, so the problem is not limited to enlisting.

ubuntu-server-builder commented 1 year ago

Launchpad user Launchpad Janitor(janitor) wrote on 2015-04-28T04:18:16.541557+00:00

[Expired for MAAS because there has been no activity for 60 days.]

ubuntu-server-builder commented 1 year ago

Launchpad user Guy Halfon(s-gh) wrote on 2015-07-05T07:52:50.269394+00:00

I'm getting the same behavior in Vivid.

ubuntu-server-builder commented 1 year ago

Launchpad user Karl(karl-martin2) wrote on 2015-10-14T14:07:23.915578+00:00

This is still happening, and I can't seem to find a workaround:

    backdoor@maas-enlisting-node:~$ sudo pollinate
    sudo: unable to resolve host maas-enlisting-node
    Oct 14 14:06:51 maas-enlisting-node pollinate[1776]: client sent challenge to [https://entropy.ubuntu.com/]
    Oct 14 14:06:51 maas-enlisting-node pollinate[1800]: ERROR: Network communication failed [60]\n14:06:51.133088 Hostname was NOT found in DNS cache
    % Total % Received % Xferd Average Speed Time Time Time Current
    Dload Upload Total Spent Left Speed
    0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
    14:06:51.137568 Trying 91.189.94.53...
    14:06:51.271750 Connected to entropy.ubuntu.com (91.189.94.53) port 443 (#0)
    14:06:51.272710 successfully set certificate verify locations:
    14:06:51.272731 CAfile: /etc/pollinate/entropy.ubuntu.com.pem
    CApath: /dev/null
    14:06:51.272849 SSLv3, TLS handshake, Client hello (1):
    14:06:51.272884 } [data not shown]
    14:06:51.404391 SSLv3, TLS handshake, Server hello (2):
    14:06:51.404432 { [data not shown]
    14:06:51.417184 SSLv3, TLS handshake, CERT (11):
    14:06:51.417235 { [data not shown]
    14:06:51.417754 SSLv3, TLS alert, Server hello (2):
    14:06:51.417776 } [data not shown]
    14:06:51.417853 SSL certificate problem: self signed certificate in certificate chain
    0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
    14:06:51.417928 * Closing connection 0
    curl: (60) SSL certificate problem: self signed certificate in certificate chain
    More details here: http://curl.haxx.se/docs/sslcerts.html

curl performs SSL certificate verification by default, using a "bundle" of Certificate Authority (CA) public keys (CA certs). If the default bundle file isn't adequate, you can specify an alternate file using the --cacert option. If this HTTPS server uses a certificate signed by a CA represented in the bundle, the certificate verification probably failed due to a problem with the certificate (it might be expired, or the name might not match the domain name in the URL). If you'd like to turn off curl's verification of the certificate, use the -k (or --insecure) option.

ubuntu-server-builder commented 1 year ago

Launchpad user Mike Pontillo(mpontillo) wrote on 2015-10-14T16:30:51.374526+00:00

Are you behind an HTTP proxy, or a network security device that could be substituting officially-signed X.509 certificates with X.509 certificates signed by your organization?

From a machine on the same network as the node being deployed, can you pastebin the output of the following:

    openssl s_client -connect entropy.ubuntu.com:443 \
        -showcerts \
        -CApath /etc/ssl/certs < /dev/null

ubuntu-server-builder commented 1 year ago

Launchpad user Karl(karl-martin2) wrote on 2015-10-14T16:39:38.252495+00:00

Here you go. Launchpad attachments: openssl-output.txt

ubuntu-server-builder commented 1 year ago

Launchpad user Karl(karl-martin2) wrote on 2015-10-14T16:46:30.945955+00:00

I am not behind an HTTP proxy. The node being deployed accesses the internet through a pfSense router. The nodes being deployed are VMs.

ubuntu-server-builder commented 1 year ago

Launchpad user Mike Pontillo(mpontillo) wrote on 2015-10-14T17:32:21.943191+00:00

Hmm, strange. The certificates you posted in the openssl trace match the ones in /etc/pollinate/entropy.ubuntu.com.pem. For it to not validate, I would have expected them to be different.

Is it possible that the system time is incorrect on the VM, which in turn causes the certificates to not validate for some reason? (from what I've seen in your debug output, it's probably correct, but I'm running out of theories now.)

From the same node where you ran 'openssl s_client', I'm curious if there is a difference between the output of the following two commands:

    pollinate -t > /dev/null
    pollinate -i -t > /dev/null

Are you certain that the pfsense router is not acting as a man-in-the-middle for some types of traffic? (Again, though - if it is, I'm just not sure why we wouldn't have seen signs of that in the OpenSSL output.)

ubuntu-server-builder commented 1 year ago

Launchpad user Karl(karl-martin2) wrote on 2015-10-14T17:41:12.010691+00:00

I attached the output. Earlier this morning, while troubleshooting, I noticed that I could use pollinate's insecure flag and it would seed correctly. The pfSense router shouldn't be modifying anything; I went through the configs a few times and could not find anything. It is a minimal installation that only routes traffic to the internet, but I have been spending time researching whether it could be modifying anything. The openssl s_client commands were run through it as well.

Launchpad attachments: pollinate-output.txt

ubuntu-server-builder commented 1 year ago

Launchpad user Karl(karl-martin2) wrote on 2015-10-14T18:20:33.862770+00:00

I ensured the MAAS and VM times match and both are set to UTC. I noticed when the image is booting up, one service fails and it had to do with entropy seeding on first boot. It scrolled too fast to quite catch it all and I don't see it in the logs, but it said failed to start pseudo random number generator. Something might be going wonky affecting entropy?

ubuntu-server-builder commented 1 year ago

Launchpad user Mike Pontillo(mpontillo) wrote on 2015-10-14T18:46:33.906789+00:00

OK. I'll set this bug to "Confirmed" since it affects multiple users. But we won't be able to move forward on this bug unless we can triage this enough to determine why [in some circumstances] pollinate can't validate what appears to be a perfectly good certificate.

I suppose it could be a 'curl' issue. Perhaps if we can find out the exact 'curl' command pollinate is running, we can narrow it down. From what I understood from the logs, could you please try comparing the output from the following two commands:

    curl -v --cacert /etc/pollinate/entropy.ubuntu.com.pem \
        --capath /dev/null https://entropy.ubuntu.com/

    curl --insecure -v --cacert /etc/pollinate/entropy.ubuntu.com.pem \
        --capath /dev/null https://entropy.ubuntu.com/

It could be useful to get a packet capture from curl and/or pollinate (so we can see the certificates present in the TLS headers), if we have reason to believe they would be different from your OpenSSL output.

The other question to ask is: what images URL are you using, and which subset of images are you working with? (I assumed you were using the default URL and deploying amd64 images.)

I can only reproduce this bug if I edit /etc/pollinate/entropy.ubuntu.com.pem and remove a subset of the trusted certificates.

But here's the other curious part [in my output]:

It looks like the certificate is due to expire tomorrow. Which might mean two things:

ubuntu-server-builder commented 1 year ago

Launchpad user Karl(karl-martin2) wrote on 2015-10-14T19:34:43.952537+00:00

MAAS Version 1.8.2+bzr4041-0ubuntu1 (trusty1) - This is a stock install from the repos, nothing extra.

I am using the 14.04 LTS image - AMD64

    http://archive.ubuntu.com/ubuntu
    http://ports.ubuntu.com/ubuntu-ports
    http://maas.ubuntu.com/images/ephemeral-v2/releases/

I took dumps of the curl commands, normal vs. insecure. I really didn't see anything in Wireshark that stood out.

Let me know if you want me to gather any other info. Launchpad attachments: pcaps.tar.gz

ubuntu-server-builder commented 1 year ago

Launchpad user Mike Pontillo(mpontillo) wrote on 2015-10-14T19:59:15.833761+00:00

Sorry for all the questions. I looked at the Wireshark output, and it looks okay. (it confirms that there is no clock skew, and that the certificates have the expected validity dates, but I didn't dig any deeper than that.) Two more questions:

(1) What are the contents of /etc/pollinate/entropy.ubuntu.com.pem on your system? (perhaps yours has been updated in advance of a pending certificate change, and mine hasn't? ideally you'd want there to be some overlap for a certain time, to prevent this race condition, if that's true.)

(2) What is the output of the following command: curl -v --capath /etc/ssl/certs https://entropy.ubuntu.com

To explain, (1) above is a sanity check to determine if you're using the same trust roots that I'm seeing locally. (2) checks if using the default trust roots on the system leads to success. (Perhaps that should be the default, since the way we've "pinned" this certificate seems to be problematic.)
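For sanity check (1), two people can compare their pinned bundles without pasting the whole file. The following is only a minimal sketch: the function name `bundle_digest` is hypothetical, and it assumes `sha256sum` from GNU coreutils is available. It normalizes line endings and blank lines before hashing, so copies that differ only in formatting produce the same digest.

```shell
#!/bin/sh
# bundle_digest FILE
# Hypothetical helper: prints a SHA-256 digest of a PEM bundle after removing
# carriage returns and blank lines, so copies that differ only in line endings
# or spacing compare equal. Assumes sha256sum from GNU coreutils.
bundle_digest() {
    tr -d '\r' < "$1" | grep -v '^[[:space:]]*$' | sha256sum | cut -d' ' -f1
}

# Example usage: run on both machines and compare the printed values.
#   bundle_digest /etc/pollinate/entropy.ubuntu.com.pem
```

Matching digests mean both machines pin the same chain; a mismatch only says the files differ, not which one is current.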

ubuntu-server-builder commented 1 year ago

Launchpad user Karl(karl-martin2) wrote on 2015-10-14T20:07:42.663639+00:00

backdoor@os-test:~$ curl -v --capath /etc/ssl/certs https://entropy.ubuntu.com

Launchpad attachments: pem-output.txt

ubuntu-server-builder commented 1 year ago

Launchpad user Mike Pontillo(mpontillo) wrote on 2015-10-14T20:35:26.834916+00:00

It seems that the problem is (1). (but it isn't quite what I expected) The certificates in your file are completely different from what I would expect, in order to properly validate. The leaf certificate in your file (per "openssl x509 -inform pem -in -text", after placing the individual certificate into ) is the following:

    Issuer: C=US, ST=Arizona, L=Scottsdale, O=Starfield Technologies, Inc., OU=http://certs.starfieldtech.com/repository/, CN=Starfield Secure Certificate Authority - G2
    Validity
        Not Before: Apr  8 08:26:03 2014 GMT
        Not After : Oct 15 16:10:53 2014 GMT
    Subject: OU=Domain Control Validated, CN=entropy.ubuntu.com

The remainder of the certificates in the file are the CA and intermediate certificates.
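Inspecting each certificate in the pinned file individually, as described above, can be scripted. This is a rough sketch rather than the exact procedure used here: the function name `split_pem_bundle` and the `cert-N.pem` output names are hypothetical, and it assumes a POSIX shell with awk. It writes every certificate in a bundle to its own file so that each one can be passed to `openssl x509`.

```shell
#!/bin/sh
# split_pem_bundle BUNDLE OUTDIR
# Hypothetical helper: writes each certificate found in BUNDLE to
# OUTDIR/cert-1.pem, OUTDIR/cert-2.pem, ... and prints how many were written.
split_pem_bundle() {
    bundle=$1
    outdir=$2
    mkdir -p "$outdir" || return 1
    awk -v dir="$outdir" '
        /-----BEGIN CERTIFICATE-----/ { n++; out = dir "/cert-" n ".pem"; inside = 1 }
        inside { print > out }
        /-----END CERTIFICATE-----/   { inside = 0; close(out) }
        END { print n + 0 }
    ' "$bundle"
}

# Example usage (the output directory is illustrative):
#   split_pem_bundle /etc/pollinate/entropy.ubuntu.com.pem /tmp/pinned
#   for f in /tmp/pinned/cert-*.pem; do
#       openssl x509 -in "$f" -noout -subject -issuer -dates
#   done
```

Printing only subject, issuer and dates keeps the output short enough to spot a stale leaf certificate at a glance.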

Maybe out of date MAAS images are at fault? (though if the packages get updated, you shouldn't see this problem, since you'll get a new "pinned" certificate chain.) You could try updating the MAAS images, or even try using the 'daily' URL (which is updated for security updates and/or every couple of weeks with the latest updated packages):

https://maas.ubuntu.com/images/ephemeral-v2/daily/

Perhaps the daily images contain the appropriate certificates. And I hope that's still the case in 20 hours. ;-) I just checked, and the following certificate is actually in my pinned trust store:

    Issuer: C=US, O=DigiCert Inc, CN=DigiCert SHA2 Secure Server CA
    Validity
        Not Before: Aug  7 00:00:00 2015 GMT
        Not After : Aug 11 12:00:00 2016 GMT
    Subject: C=GB, ST=Southwark, L=London, O=Canonical Group Ltd, CN=entropy.ubuntu.com

So my conclusion is that everything should work fine, provided that you have the most up-to-date MAAS images.
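The difference between the two leaf certificates quoted above comes down to their "Not After" dates, which can be checked mechanically. The sketch below is illustrative only: the function name `cert_date_expired` is hypothetical, and it assumes GNU `date` (for the `-d` option). It compares a notAfter timestamp, in the format openssl prints, against a reference time.

```shell
#!/bin/sh
# cert_date_expired "NOT_AFTER" [NOW_EPOCH]
# Hypothetical helper: succeeds (exit 0) if the given notAfter timestamp,
# in the format printed by openssl, is earlier than NOW_EPOCH
# (default: the current time). Returns 2 if the date cannot be parsed.
# Assumes GNU date for the -d option.
cert_date_expired() {
    not_after=$1
    now=${2:-$(date -u +%s)}
    expiry=$(date -u -d "$not_after" +%s) || return 2
    [ "$expiry" -lt "$now" ]
}

# Example with the two validity dates quoted in this thread, evaluated
# at the time of this comment:
#   now=$(date -u -d "2015-10-14 20:35:26" +%s)
#   cert_date_expired "Oct 15 16:10:53 2014 GMT" "$now" && echo "old pinned leaf: expired"
#   cert_date_expired "Aug 11 12:00:00 2016 GMT" "$now" || echo "new pinned leaf: valid"
#
# To read the date from a certificate file (leaf.pem is an illustrative name):
#   end=$(openssl x509 -in leaf.pem -noout -enddate | cut -d= -f2)
#   cert_date_expired "$end" && echo "expired"
```

`openssl x509 -checkend 0 -noout -in leaf.pem` performs a similar check against the current time directly, if no reference time is needed.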

ubuntu-server-builder commented 1 year ago

Launchpad user Karl(karl-martin2) wrote on 2015-10-14T21:05:16.628631+00:00

OK, the curious thing is how I would have received a cert from 2014 when I installed the MAAS server fresh on 10/10/2015.

I just now ran: maas admin node-groups import-boot-images after switching to the daily URL that you provided, but when I restarted the node the certificate had not changed.

These are the repos I added when performing the installation:

    sudo add-apt-repository ppa:juju/stable
    sudo add-apt-repository ppa:maas-maintainers/stable
    sudo add-apt-repository ppa:cloud-installer/stable
    sudo apt update

Source: http://www.ubuntu.com/download/cloud/install-ubuntu-openstack

ubuntu-server-builder commented 1 year ago

Launchpad user Karl(karl-martin2) wrote on 2015-10-15T14:33:17.192145+00:00

The auto-updater for the ephemeral images doesn't seem to be updating every 60 mins.

I went to: https://maas.ubuntu.com/images/ephemeral-v2/daily/trusty/amd64/20150930/ and downloaded them to: /var/lib/maas/boot-resources/current/ubuntu/amd64/generic/trusty/release

Updated these files from ubuntu:

    -rw-r--r-- 1 maas maas 1.4G Oct 5 23:09 root-image
    -rw-r--r-- 1 maas maas 5.6M Oct 5 23:09 boot-kernel
    -rw-r--r-- 1 maas maas 24M Oct 5 23:09 boot-initrd

After the image booted up the cert still read:

    Validity
        Not Before: Apr 8 08:26:03 2014 GMT
        Not After : Oct 15 16:10:53 2014 GMT

Are those the correct files to update and the correct location?

ubuntu-server-builder commented 1 year ago

Launchpad user Mike Pontillo(mpontillo) wrote on 2015-10-16T00:55:49.107692+00:00

Rather than downloading the images and replacing them in /var, can you change your image sync URL to https://maas.ubuntu.com/images/ephemeral-v2/daily/ and redeploy the node?

Images are kept in the database in the region and periodically synchronized to the clusters; manually changing them on the cluster is not supported.

ubuntu-server-builder commented 1 year ago

Launchpad user Mike Pontillo(mpontillo) wrote on 2015-10-16T00:56:30.606950+00:00

Also (just curious) - which hypervisor are you using?

ubuntu-server-builder commented 1 year ago

Launchpad user Karl(karl-martin2) wrote on 2015-10-19T16:51:45.381648+00:00

Mike,

I am using ESXi 6 and have a fresh install with the image URL pointing to the dailies.

MAAS Version 1.8.2+bzr4041-0ubuntu1 (trusty1)

Thanks

ubuntu-server-builder commented 1 year ago

Launchpad user Andres Rodriguez(andreserl) wrote on 2015-10-19T17:58:44.404017+00:00

Quick questions:

  1. Are you using MAAS DNS?
  2. If you are using MAAS DNS, are you using an upstream DNS server?
  3. Are you enabling DNSSEC validation?

Thanks

ubuntu-server-builder commented 1 year ago

Launchpad user Andres Rodriguez(andreserl) wrote on 2015-10-19T18:06:45.003909+00:00

Also, the big question is... why is this running at all? Why would this cause a failure? When working in offline environments, where we don't have access to the internet at all, this doesn't represent an issue.

ubuntu-server-builder commented 1 year ago

Launchpad user Andres Rodriguez(andreserl) wrote on 2015-10-19T18:14:41.578486+00:00

Ok, so looking into this further, this may be because of using an older version of pollinate. This, however, shouldn't really represent an issue at all.

The latest pollinate might shed some light. So, the big question now is: have you tried the latest image (changing from 'releases' to 'daily' for the streams)?

ubuntu-server-builder commented 1 year ago

Launchpad user Karl(karl-martin2) wrote on 2015-10-19T19:23:41.758737+00:00

I changed the URL from releases to daily, and it now boots up with 14.04.3 LTS. However, I am unable to ssh into the box, and eventually I see connection failures to archive.ubuntu.com, so I am guessing it still is not working.

It was my understanding from the docs, that after enlisting and shutting down, you commission the node, it boots up and needs to communicate to archive.ubuntu.com to download packages and install them prior to finishing the commissioning of the nodes to ready state? How does this work for you in an environment that does not have internet connection?

Previously, I had used the code below to create a backdoor login account. But since re-installing MAAS, I wanted to leave it stock and was not sure if this code modified the image, because when I had changed from releases to daily on the previous build it didn't seem like the images updated.

    =-=-=-=-
    sudo apt-get install --assume-yes bzr
    bzr branch lp:~maas-maintainers/maas/backdoor-image backdoor-image

    # Keep a pristine copy of each root image before modifying it.
    imgs=$(echo /var/lib/maas/boot-resources/*/*/*/*/*/*/root-image)
    for img in $imgs; do
        [ -f "$img.dist" ] || sudo cp -a --sparse=always "$img" "$img.dist"
    done

    # Add the backdoor login account to each image.
    for img in $imgs; do
        sudo ./backdoor-image/backdoor-image -v --user=backdoor --password-auth --password=ubuntu "$img"
    done
    =-=-=-=-

ubuntu-server-builder commented 1 year ago

Launchpad user Mike Pontillo(mpontillo) wrote on 2015-10-19T20:11:03.356327+00:00

After looking at the code, I believe the commissioning is failing for a different reason. Reasoning:

(1) cloud-init specifies "required=False" for the random_seed configuration option when calling pollinate [1]

(2) MAAS does not currently send the random_seed option when calling cloud-init.[2]

So while I think it's true that we could handle this better, I think the symptom in this bug may be a red herring.

[1]: https://bazaar.launchpad.net/~cloud-init-dev/cloud-init/trunk/view/head:/cloudinit/config/cc_seed_random.py

[2]: https://bazaar.launchpad.net/~maas-committers/maas/1.8/view/head:/src/maasserver/compose_preseed.py

ubuntu-server-builder commented 1 year ago

Launchpad user Mike Pontillo(mpontillo) wrote on 2015-10-19T20:14:27.208211+00:00

Actually, I'll go ahead and mark this "Triaged"; it is a real bug, it just isn't as critical as we assumed.

To fix this bug, we should configure cloud-init to NOT call pollinate during enlistment (to avoid this spurious error).

As a follow-on fix, it might be a good idea for cloud-init to fall back to 'insecure' mode (or simply use the public CA roots in /etc/ssl/certs rather than a pinned chain) and log this as a warning, if the pinned certificate could not be validated.
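To make the second variant of that proposal concrete, here is a rough sketch of what such a fallback could look like. It is not existing pollinate or cloud-init behaviour; the function name `fetch_with_ca_fallback` is hypothetical. It relies only on the curl options and the exit code 60 already seen in this bug, and it retries against the system CA directory rather than disabling verification.

```shell
#!/bin/sh
# fetch_with_ca_fallback URL PINNED_PEM
# Hypothetical sketch of the proposed fallback, not existing pollinate or
# cloud-init behaviour: try the pinned chain first; if curl reports a
# certificate verification failure (exit code 60), warn on stderr and retry
# against the system CA directory instead of disabling verification.
fetch_with_ca_fallback() {
    url=$1
    pinned=$2
    curl --silent --fail --cacert "$pinned" --capath /dev/null "$url"
    rc=$?
    if [ "$rc" -eq 60 ]; then
        echo "WARNING: pinned chain $pinned did not validate $url;" \
             "retrying with /etc/ssl/certs" >&2
        curl --silent --fail --capath /etc/ssl/certs "$url"
        rc=$?
    fi
    return "$rc"
}

# Example usage:
#   fetch_with_ca_fallback https://entropy.ubuntu.com/ /etc/pollinate/entropy.ubuntu.com.pem
```

Any other curl failure (DNS, connection refused, and so on) is returned unchanged, so the fallback only masks the stale-pin case and still leaves a warning in the log.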

ubuntu-server-builder commented 1 year ago

Launchpad user Karl(karl-martin2) wrote on 2015-10-19T20:24:07.082794+00:00

To get this working in my environment are there any suggestions to move forward?

ubuntu-server-builder commented 1 year ago

Launchpad user Scott Moser(smoser) wrote on 2015-10-20T16:17:35.697744+00:00

well, this will probably mess things up, but i'll attempt to explain a few things. The summary is that I'm almost certain this does not affect your maas enlistment or commissioning.

a.) maas images in 'released' are old. this is quite unfortunate, but the images there are out of date and need updating. We're looking into ways we can produce up to date images without risk of regression to users.

since this is old, 'pollinate' inside is old. And since it uses its own certificate, that fails to work. Any unpatched ubuntu image will show that error. It is "just" a warning though. This is why updating to daily got rid of the red-herring problem for you.

b.) cloud-init really has nothing to do with pollinate. It calls it, is all. MAAS can instruct it not to call pollinate, but that may defeat the purpose that pollinate is serving. Note, seed is sometimes believed to be more useful in VMs which have less entropy, and maas is targeting hardware. So, in the case where maas is pointed at "real hardware", disabling pollinate may be less harmful. (note, i'm not speaking as a qualified security engineer here).

c.) we should probably add to maas metadata service some random seed. This would alleviate 'b' as then we would be getting a random seed from somewhere.

d.) I've submitted merge proposal to document random_seed better at https://code.launchpad.net/~smoser/cloud-init/trunk.doc-seedrandom/+merge/275062

ubuntu-server-builder commented 1 year ago

Launchpad user Andres Rodriguez(andreserl) wrote on 2015-10-20T17:57:09.179036+00:00

Hi Karl,

In one of your comments I see "eventually I see connection failures to archive.ubuntu.com". It seems that the reason it may be failing is actually that it cannot connect to the archive.

  1. Are you sure maas-proxy is running? Logs are in /var/log/maas/proxy/*.log.
  2. The reason it may be failing to access the proxy is a missing upstream DNS. Have you set an upstream DNS?

ubuntu-server-builder commented 1 year ago

Launchpad user Andres Rodriguez(andreserl) wrote on 2017-11-02T21:39:02.925684+00:00

I believe that this is no longer an issue. I'm going to mark this bug as Invalid. Please re-open if you believe the issue still exists or file a new one.