turtle0x1 / LxdMosaic

Web interface to manage multiple instances of LXD
http://lxdmosaic.com
GNU General Public License v3.0

CentOS 7 or curl 7.29.0 breaks LxdMosaic #84

Closed trenb closed 5 years ago

trenb commented 5 years ago

Describe the bug: Adding a second LXD host to the interface breaks it.

To Reproduce Steps to reproduce the behavior:

  1. Add a second host to LXD mosaic

Screenshots

[Screenshot: Screen Shot 2019-05-07 at 11 57 10 AM]

This is reproducible for me. If I look at the pm2 logs, all I see is this:

{"host":"https:\/\/lxd-a01.XXXX.lan:8443","offline":true}
{"host":"https:\/\/lxd-a01.XXXX.lan:8443","offline":false}

And there's no mention of the second host I added. Looking at the mysql database, both hosts are entered correctly.

root@localhost:LXD_Manager> select * from Hosts;
+---------+-------------------------------+---------------------------+-----------------------+----------------------+-------------+------------+
| Host_ID | Host_Url_And_Port             | Host_Cert_Path            | Host_Cert_Only_File   | Host_Key_File        | Host_Online | Host_Alias |
+---------+-------------------------------+---------------------------+-----------------------+----------------------+-------------+------------+
|       1 | https://lxd-a01.XXXX.lan:8443 | lxd-a01.XXXX.lan.combined | lxd-a01.XXXX.lan.cert | lxd-a01.airg.lan.key |           1 | NULL       |
|       2 | https://lxd-b01.XXXX.lan:8443 | lxd-b01.XXXX.lan.combined | lxd-b01.XXXX.lan.cert | lxd-b01.airg.lan.key |           1 | NULL       |
+---------+-------------------------------+---------------------------+-----------------------+----------------------+-------------+------------+

I ran into this issue on a previous version of this software, but with the latest version, I'm still having this problem. Any suggestions on debugging this further?

trenb commented 5 years ago

And if I remove the second host I added, the interface works fine again:

root@localhost:LXD_Manager> delete from Hosts where Host_ID=2;
Query OK, 1 row affected (0.002 sec)

turtle0x1 commented 5 years ago

Can you refresh the page with dev tools open and show me the result of the request to GetOverviewController? I haven't got my other server online right now.

Also, are both hosts online?

trenb commented 5 years ago

Actually, I have a bit more debug for you. If I try to run the fleet analytics cron, I get the following error:

# php /var/www/LxdMosaic/src/cronJobs/fleetAnalytics.php
PHP Fatal error:  Uncaught GuzzleHttp\Exception\ClientException: Client error: `GET https://lxd-b01.XXXX.lan:8443/1.0/storage-pools` resulted in a `403 Forbidden` response:
{"error":"not authorized","error_code":403,"type":"error"}

 in /var/www/LxdMosaic/vendor/guzzlehttp/guzzle/src/Exception/RequestException.php:113
Stack trace:
#0 /var/www/LxdMosaic/vendor/guzzlehttp/guzzle/src/Middleware.php(66): GuzzleHttp\Exception\RequestException::create(Object(GuzzleHttp\Psr7\Request), Object(GuzzleHttp\Psr7\Response))
#1 /var/www/LxdMosaic/vendor/guzzlehttp/promises/src/Promise.php(203): GuzzleHttp\Middleware::GuzzleHttp\{closure}(Object(GuzzleHttp\Psr7\Response))
#2 /var/www/LxdMosaic/vendor/guzzlehttp/promises/src/Promise.php(156): GuzzleHttp\Promise\Promise::callHandler(1, Object(GuzzleHttp\Psr7\Response), Array)
#3 /var/www/LxdMosaic/vendor/guzzlehttp/promises/src/TaskQueue.php(47): GuzzleHttp\Promise\Promise::GuzzleHttp\Promise\{closure}()
#4 /var/www/LxdMosaic/vendor/guzzlehttp/guzzle/src/Handler/CurlMultiHandler.php(98): Guzz in /var/www/LxdMosaic/vendor/dhope0000/lxd/src/HttpClient/Plugin/LxdExceptionThower.php on line 38

I'm using the correct password. And yes, both hosts are online.

turtle0x1 commented 5 years ago

Interesting. What version of LXD are the hosts using?

Have you modified any certificate locations, etc.?

I'm going to boot up my other server now.

trenb commented 5 years ago

LXD 3.11 on both hosts. And no, I haven't modified any of the certs. All I have to do to break things is just add a second LXD host.

trenb commented 5 years ago

Sorry, how do I get you the results of GetOverviewController? I'm not an expert in the dev console in Chrome.

trenb commented 5 years ago

As well, as soon as I remove the second host, fleetAnalytics.php no longer errors out.

turtle0x1 commented 5 years ago

Hey, so I can't duplicate this. Can you open developer tools in your browser (Shift + Ctrl + I), click Network, and show me the response for GetOverviewController? (You can filter by that name and it will bring up the request :)

Also, on lxd-b01, can you run sudo lxc config trust list and check that the certificate was added?
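One way to do that check is to compare fingerprints rather than eyeball the table. This is a minimal sketch, assuming the FINGERPRINT column of lxc config trust list shows the first 12 hex characters of the certificate's SHA-256 fingerprint. short_fingerprint is a hypothetical helper name, and the certificate path in the usage comment is only an example.

```shell
#!/bin/sh
# Hypothetical helper: reduce the output of
#   openssl x509 -noout -fingerprint -sha256 -in <cert>
# (e.g. "SHA256 Fingerprint=AB:6E:81:...") to 12 lowercase hex characters,
# which is assumed to be the form shown by `lxc config trust list`.
short_fingerprint() {
    printf '%s\n' "$1" \
        | sed 's/^.*=//' \
        | tr -d ':' \
        | tr 'A-F' 'a-f' \
        | cut -c1-12
}

# Example usage on the LxdMosaic host (path is an assumption):
# fp=$(openssl x509 -noout -fingerprint -sha256 \
#        -in /var/www/LxdMosaic/src/sensitiveData/certs/lxd-b01.XXXX.lan.cert)
# short_fingerprint "$fp"
#
# Then, on lxd-b01, look for that value:
# sudo lxc config trust list | grep "<value printed above>"
```

If the value does not appear in the trust list of the host that returns 403, the certificate being sent is not the one that host trusts.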

trenb commented 5 years ago

Here's the response for GetOverviewController when I add the second host:

{"state":"error","message":"LXD client certificate is not trusted."}

trenb commented 5 years ago

Here's the config trust for lxd-a01:

+--------------+-------------+-----------------------------+-----------------------------+
| FINGERPRINT  | COMMON NAME |         ISSUE DATE          |         EXPIRY DATE         |
+--------------+-------------+-----------------------------+-----------------------------+
| 7869ee9c30bd | 127.0.0.1   | May 7, 2019 at 7:06pm (UTC) | May 6, 2020 at 7:06pm (UTC) |
+--------------+-------------+-----------------------------+-----------------------------+
| ab6e8136a784 | 127.0.0.1   | May 7, 2019 at 7:27pm (UTC) | May 6, 2020 at 7:27pm (UTC) |
+--------------+-------------+-----------------------------+-----------------------------+
| b1db2ca72369 | 127.0.0.1   | May 6, 2019 at 3:45pm (UTC) | May 5, 2020 at 3:45pm (UTC) |
+--------------+-------------+-----------------------------+-----------------------------+

and for b01:

+--------------+-----------------------+-----------------------------+-----------------------------+
| FINGERPRINT  |      COMMON NAME      |         ISSUE DATE          |         EXPIRY DATE         |
+--------------+-----------------------+-----------------------------+-----------------------------+
| 86e21c088e70 | 127.0.0.1             | May 7, 2019 at 7:05pm (UTC) | May 6, 2020 at 7:05pm (UTC) |
+--------------+-----------------------+-----------------------------+-----------------------------+
| 8d2eb7bb6ac0 | 127.0.0.1             | May 7, 2019 at 6:54pm (UTC) | May 6, 2020 at 6:54pm (UTC) |
+--------------+-----------------------+-----------------------------+-----------------------------+
| c2802fd2ff69 | root@lxd-a01.XXXX.lan | May 7, 2019 at 6:01pm (UTC) | May 4, 2029 at 6:01pm (UTC) |
+--------------+-----------------------+-----------------------------+-----------------------------+

trenb commented 5 years ago

So why does adding a host result in the certificate not being trusted?

turtle0x1 commented 5 years ago

Can you try adding the host by IP instead of DNS name?

trenb commented 5 years ago

Adding via IP breaks exactly the same way. And if I add either one of these on its own, everything works. It only breaks when I have more than one host defined.

trenb commented 5 years ago

It looks like the certs are being copied to the mosaic host:

# ls -la
total 56
drwxr-xr-x 2 apache apache 4096 May  7 19:36 .
drwxr-xr-x 3 root   root   4096 May  6 14:39 ..
-rw-r--r-- 1 apache apache 1452 May  7 19:36 192.168.100.221.cert
-rw-r--r-- 1 apache apache 3156 May  7 19:36 192.168.100.221.combined
-rw-r--r-- 1 apache apache 1704 May  7 19:36 192.168.100.221.key
-rw-r--r-- 1 apache apache 1452 May  7 19:35 192.168.107.209.cert
-rw-r--r-- 1 apache apache 3156 May  7 19:35 192.168.107.209.combined
-rw-r--r-- 1 apache apache 1704 May  7 19:35 192.168.107.209.key
-rw-r--r-- 1 apache apache 1452 May  7 19:27 lxd-a01.XXXX.lan.cert
-rw-r--r-- 1 apache apache 3156 May  7 19:27 lxd-a01.XXXX.lan.combined
-rw-r--r-- 1 apache apache 1704 May  7 19:27 lxd-a01.XXXX.lan.key
-rw-r--r-- 1 apache apache 1452 May  7 19:05 lxd-b01.XXXX.lan.cert
-rw-r--r-- 1 apache apache 3156 May  7 19:05 lxd-b01.XXXX.lan.combined
-rw-r--r-- 1 apache apache 1704 May  7 19:05 lxd-b01.XXXX.lan.key
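A quick way to confirm that nothing is missing from a listing like this, for each host, is sketched below. It assumes LxdMosaic writes three files per host (<host>.cert, <host>.combined and <host>.key), which is only what the listing above suggests. missing_cert_files is a hypothetical helper, not part of LxdMosaic.

```shell
#!/bin/sh
# Hypothetical helper: print the name of each expected certificate file
# that is missing for one host. Prints nothing when all three exist.
#   $1: certificate directory
#   $2: host name or IP, as used in the file names above
missing_cert_files() {
    for ext in cert combined key; do
        [ -f "$1/$2.$ext" ] || printf '%s\n' "$2.$ext"
    done
}

# Example usage, run from the certs directory:
# missing_cert_files . lxd-a01.XXXX.lan
# missing_cert_files . lxd-b01.XXXX.lan
```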
turtle0x1 commented 5 years ago

LxdMosaic generates the certificate and deploys it to the LXD instance.

This is going to sound silly, but are you sure there is no way the DNS could accidentally route to the wrong host? It's proving difficult to replicate this.

When you did it by IP, did you clear out the Hosts table completely?

trenb commented 5 years ago

No, DNS is functioning correctly here. Like I said, I can add either host INDIVIDUALLY, and the interface works. As soon as I add a second host, the interface breaks.

And yes, I cleared out the Hosts table in MySQL before re-adding the hosts as IPs.

turtle0x1 commented 5 years ago

Can you edit src/classes/Model/Client/LxdClient.php and completely remove these lines:

        if ($checkCache && isset($this->clientBag[$hostDetails["Host_Url_And_Port"]])) {
            return $this->clientBag[$hostDetails["Host_Url_And_Port"]];
        }

trenb commented 5 years ago

No difference.

turtle0x1 commented 5 years ago

Can you try:

curl --cert lxd-b01.XXXX.lan.combined -skv https://lxd-b01.XXXX.lan/1.0/storage-pools

trenb commented 5 years ago

Okay, so this is a bit strange. After updating the URL to specify port 8443 (default LXD port), I get not authorized:

# curl --cert lxd-b01.XXXX.lan.combined -skv https://lxd-b01.XXXX.lan:8443/1.0/storage-pools
* About to connect() to lxd-b01.XXXX.lan port 8443 (#0)
*   Trying 192.168.100.221...
* Connected to lxd-b01.XXXX.lan (192.168.100.221) port 8443 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
* warning: certificate file name "lxd-b01.XXXX.lan.combined" handled as nickname; please use "./lxd-b01.XXXX.lan.combined" to force file name
* skipping SSL peer certificate verification
* NSS: client certificate not found: lxd-b01.XXXX.lan.combined
* SSL connection using TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
* Server certificate:
*       subject: CN=root@lxd-b01.XXXX.lan,O=linuxcontainers.org
*       start date: May 07 16:57:42 2019 GMT
*       expire date: May 04 16:57:42 2029 GMT
*       common name: root@lxd-b01.XXXX.lan
*       issuer: CN=root@lxd-b01.XXXX.lan,O=linuxcontainers.org
> GET /1.0/storage-pools HTTP/1.1
> User-Agent: curl/7.29.0
> Host: lxd-b01.XXXX.lan:8443
> Accept: */*
>
< HTTP/1.1 403 Forbidden
< Content-Type: application/json
< X-Content-Type-Options: nosniff
< Date: Tue, 07 May 2019 20:01:06 GMT
< Content-Length: 60
<
{"error":"not authorized","error_code":403,"type":"error"}

But if I specify the path for the certificate it works:

# curl --cert ./lxd-b01.XXXX.lan.combined -skv https://lxd-b01.XXXX.lan:8443/1.0/storage-pools
* About to connect() to lxd-b01.XXXX.lan port 8443 (#0)
*   Trying 192.168.100.221...
* Connected to lxd-b01.XXXX.lan (192.168.100.221) port 8443 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
* skipping SSL peer certificate verification
* NSS: client certificate from file
*       subject: E=info@opensauce.systems,CN=127.0.0.1,OU=Dev,O=Open Sauce Systems,L=Cowes,ST=Isle Of Wight,C=UK
*       start date: May 07 19:05:12 2019 GMT
*       expire date: May 06 19:05:12 2020 GMT
*       common name: 127.0.0.1
*       issuer: E=info@opensauce.systems,CN=127.0.0.1,OU=Dev,O=Open Sauce Systems,L=Cowes,ST=Isle Of Wight,C=UK
* SSL connection using TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
* Server certificate:
*       subject: CN=root@lxd-b01.XXXX.lan,O=linuxcontainers.org
*       start date: May 07 16:57:42 2019 GMT
*       expire date: May 04 16:57:42 2029 GMT
*       common name: root@lxd-b01.XXXX.lan
*       issuer: CN=root@lxd-b01.XXXX.lan,O=linuxcontainers.org
> GET /1.0/storage-pools HTTP/1.1
> User-Agent: curl/7.29.0
> Host: lxd-b01.XXXX.lan:8443
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: application/json
< Date: Tue, 07 May 2019 20:03:41 GMT
< Content-Length: 132
<
{"type":"sync","status":"Success","status_code":200,"operation":"","error_code":0,"error":"","metadata":["/1.0/storage-pools/zfs"]}

trenb commented 5 years ago

This also works when I use IP:

# curl --cert ./192.168.100.221.combined -skv https://192.168.100.221:8443/1.0/storage-pools
* About to connect() to 192.168.100.221 port 8443 (#0)
*   Trying 192.168.100.221...
* Connected to 192.168.100.221 (192.168.100.221) port 8443 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
* skipping SSL peer certificate verification
* NSS: client certificate from file
*       subject: E=info@opensauce.systems,CN=127.0.0.1,OU=Dev,O=Open Sauce Systems,L=Cowes,ST=Isle Of Wight,C=UK
*       start date: May 07 19:36:15 2019 GMT
*       expire date: May 06 19:36:15 2020 GMT
*       common name: 127.0.0.1
*       issuer: E=info@opensauce.systems,CN=127.0.0.1,OU=Dev,O=Open Sauce Systems,L=Cowes,ST=Isle Of Wight,C=UK
* SSL connection using TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
* Server certificate:
*       subject: CN=root@lxd-b01.XXXX.lan,O=linuxcontainers.org
*       start date: May 07 16:57:42 2019 GMT
*       expire date: May 04 16:57:42 2029 GMT
*       common name: root@lxd-b01.XXXX.lan
*       issuer: CN=root@lxd-b01.XXXX.lan,O=linuxcontainers.org
> GET /1.0/storage-pools HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 192.168.100.221:8443
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: application/json
< Date: Tue, 07 May 2019 20:15:07 GMT
< Content-Length: 132
<
{"type":"sync","status":"Success","status_code":200,"operation":"","error_code":0,"error":"","metadata":["/1.0/storage-pools/zfs"]}
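The failing and working runs above differ only in how the --cert argument is written. Per curl's own warning, this NSS-based curl 7.29.0 build treats a bare file name as a certificate nickname unless it contains a path. Below is a small sketch of a wrapper that avoids that trap. cert_file_arg is a hypothetical helper, not part of curl or LxdMosaic.

```shell
#!/bin/sh
# Hypothetical helper: make sure a certificate argument looks like a path,
# so an NSS-based curl reads it as a file instead of a nickname.
# Arguments that already contain a slash are returned unchanged.
cert_file_arg() {
    case "$1" in
        */*) printf '%s\n' "$1" ;;
        *)   printf './%s\n' "$1" ;;
    esac
}

# Example usage (same request as above):
# curl --cert "$(cert_file_arg lxd-b01.XXXX.lan.combined)" -skv \
#     https://lxd-b01.XXXX.lan:8443/1.0/storage-pools
```

This only affects the manual curl test. The var_dump output later in the thread shows LxdMosaic passing an absolute certificate path, so the nickname behaviour is unlikely to be what breaks the interface.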
turtle0x1 commented 5 years ago

I temporarily replicated this, but can't do it again. Still trying.

trenb commented 5 years ago

Let me know if you need any more information!

turtle0x1 commented 5 years ago

See, the only way I could duplicate this is with a DNS (or conflicting IP) error; no amount of killing hosts or combinations of domain names could break it. Are you 100% sure that the IPs don't conflict, or that some DNS cache isn't causing the servers to be confused?

Because it's adding the host successfully and then trying the wrong server with the wrong certificate.

Could you change src/classes/Model/Client/LxdClient.php and replace the contents of getANewClient so it looks like this:

    public function getANewClient($hostId, $checkCache = true, $setProject = true)
    {
        $hostDetails = $this->getDetails->getAll($hostId);

        if (empty($hostDetails)) {
            throw new \Exception("Couldn't find info for this host", 1);
        }

        if ($checkCache && isset($this->clientBag[$hostDetails["Host_Url_And_Port"]])) {
            return $this->clientBag[$hostDetails["Host_Url_And_Port"]];
        }

        $certPath = $this->createFullcertPath($hostDetails["Host_Cert_Path"]);
        $config = $this->createConfigArray($certPath);
        var_dump($config);
        var_dump($hostDetails["Host_Url_And_Port"]);
        $client = $this->createNewClient($hostDetails["Host_Url_And_Port"], $config);
        $client->setProject($this->session->get("host/$hostId/project", "default"));
        return $client;
    }

Then refresh the page and, from the network tools again, show the contents of the response for GetOverviewController.

trenb commented 5 years ago

Here you go!

array(2) {
  ["verify"]=>
  bool(false)
  ["cert"]=>
  array(2) {
    [0]=>
    string(67) "/var/www/LxdMosaic/src/sensitiveData/certs/192.168.100.221.combined"
    [1]=>
    string(0) ""
  }
}
string(28) "https://192.168.100.221:8443"
array(2) {
  ["verify"]=>
  bool(false)
  ["cert"]=>
  array(2) {
    [0]=>
    string(67) "/var/www/LxdMosaic/src/sensitiveData/certs/192.168.107.209.combined"
    [1]=>
    string(0) ""
  }
}
string(28) "https://192.168.107.209:8443"
{"state":"error","message":"LXD client certificate is not trusted."}

trenb commented 5 years ago

And no, there are no conflicting IPs. These hosts were just built in the past week. The IPs are unique, as are the DNS names. The fact that it doesn't work via IP or DNS seems to rule out DNS as the issue. As well, either host works fine when added first. Things only break once a second host is entered.

turtle0x1 commented 5 years ago

Can we arrange privately to have me dial into your network (perhaps TeamViewer?) in the interest of tracking down this bug and improving the library?

trenb commented 5 years ago

No, but I can give you access to my private server where I have the same problem. :)

turtle0x1 commented 5 years ago

Cool, either works. Can you private message me on https://discuss.linuxcontainers.org/ with connection details?

trenb commented 5 years ago

I'm feeling stupid, but I can't figure out where to send a private message. I've created an account. Can you point me in the right direction please? :)

I can view messages, but I have no option to send them.

turtle0x1 commented 5 years ago

If you click your user icon, then click the envelope, then click new message; my username is turtle0x1 :)

turtle0x1 commented 5 years ago

Ah, you might not have permission to, as you're a new user. Can you email me at goaway321@protonmail.com?

trenb commented 5 years ago

Got it! email sent!

turtle0x1 commented 5 years ago

It's a curl bug with CentOS or curl version 7.29.0 (or maybe even PHP), but closing the connection fixes the issue.

Apply the fix to src/classes/Model/Client/LxdClient.php and change createConfigArray to look like this as a temporary workaround:

    public function createConfigArray($certLocation)
    {
        $certPath = realpath($certLocation);

        if ($certPath === false) {
            throw new \Exception("Certificate has gone walk abouts", 1);
        }

        return [
            'verify' => false,
            'cert' => [
                $certPath,
                ''
            ],
            // Ask for the connection to be closed after each request so it is not reused
            'headers' => ['Connection' => 'close'], // Or simply add this to the request object
        ];
    }

trenb commented 5 years ago

I copied the file from my system and it works. I must be tired, because this diff doesn't make sense to me; I can't see why it wouldn't work with my file.

diff LxdClient.php LxdClient.php.TREN

34a35,36
> var_dump($config);
> var_dump($hostDetails["Host_Url_And_Port"]);
59,63c61
< // 'debug'=>true,
< // 'Connection' => 'close',
< // CURLOPT_FORBID_REUSE => true,
< // CURLOPT_FRESH_CONNECT => true,
< 'headers' => ['Connection' => 'close'], // Or simply add this to the request object
---
> 'headers' => ['Connection' => 'close'], // Or simply add this to the request object

turtle0x1 commented 5 years ago

Any var_dumps or any of the other debugging steps still in your file?

trenb commented 5 years ago

I'm going to try updating curl tomorrow. Thanks for your help today! I'll update the ticket with the results of my test!

turtle0x1 commented 5 years ago

Remove the var_dump() calls; that's why it's failing.

trenb commented 5 years ago

I upgraded to curl 7.64.1 (and supporting libraries) with no change. The patch is still required. Is there any negative to this patch being applied to Ubuntu as well as CentOS?

turtle0x1 commented 5 years ago

Potentially much slower access. I'll work up a patch tonight. Will you be providing a CentOS install script? If not, it's a pretty risky patch (as I'm sure a small number of people will manually install this on CentOS), and I can't test it without a CentOS install script.

trenb commented 5 years ago

Gotcha. Let me work on that then.

trenb commented 5 years ago

Alright, I forked your repo and created the centos7 install script. I do not have your patch installed to fix the issue with multiple hosts.

Please clone https://github.com/trenb/LxdMosaic and use examples/install_with_clone_centos7.sh. I spun up my CentOS 7 container via lxc launch images:centos/7/amd64 testmosaic1, curled the file, and ran it via bash.

Once you're happy with it, I will change the script to call your repo for the git clone and send a pull request.

turtle0x1 commented 5 years ago

Okay, downloading the image now. I've thought of how to selectively apply the fix to CentOS only.
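The thread does not show how the fix was scoped to CentOS. Purely as an illustration, here is one way a script could detect the distribution, assuming an os-release file with an ID field (CentOS 7 ships /etc/os-release with ID="centos"). os_id and is_centos are hypothetical names.

```shell
#!/bin/sh
# Hypothetical helpers for distribution detection.

# Print the ID field of an os-release file.
#   $1: path to the file (defaults to /etc/os-release)
os_id() {
    sed -n 's/^ID=//p' "${1:-/etc/os-release}" | tr -d '"'
}

# Succeed only when the ID field is exactly "centos".
is_centos() {
    [ "$(os_id "$1")" = "centos" ]
}

# Example usage:
# if is_centos; then
#     echo "CentOS detected: enable the Connection: close workaround"
# fi
```

A check like this keys on the distribution rather than the curl version, which matches the later finding in this thread that upgrading curl alone did not remove the need for the patch.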

turtle0x1 commented 5 years ago

Can you open a PR so I can comment, please?

trenb commented 5 years ago

https://github.com/turtle0x1/LxdMosaic/pull/85 created

turtle0x1 commented 5 years ago

I have pushed a fix for this. Would you mind testing it?

trenb commented 5 years ago

I reverted my LxdClient.php and pulled. I can confirm that things are working. Nice fix! Thank you!