ionos-cloud / docker-machine-driver

IONOS Cloud Docker Machine Driver
Apache License 2.0

Datacenter deleted by driver #6

Closed ando47 closed 3 years ago

ando47 commented 3 years ago

Description

I have been following the guide https://rancher.com/docs/rancher/v2.x/en/quick-start-guide/deployment/quickstart-manual-setup/. I added the Rancher driver as a node driver. I created a node template that contained the datacenterID of an existing datacenter and afterwards created a cluster. Once I started the process, the datacenter was deleted.

Expected behavior

Rancher worker nodes should have been created in the datacenter specified in the node template.

How to Reproduce

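  1. Install Rancher and add the driver as a node driver
  2. Create a node template that contains the datacenterID of an existing datacenter
  3. Create a Rancher cluster from that node template

ana-moga commented 3 years ago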

Hello,

Thank you for opening this issue. The bug has been fixed in the latest version 5.0.2.

benschmi commented 3 years ago

Not fixed in 5.0.2. Same behaviour as described by @ando47. Looking at the commits for this version, I don't see how this could have been fixed. Please describe.

ana-moga commented 3 years ago

The commit "Fix bug: Data Center deleted by the driver" is included in 5.0.2. It synchronizes the location of the datacenter with the location of the IP block. Before this change, if the two locations were not kept in sync via the flags, node creation failed with an error. With auto-replace enabled on nodes (a Rancher setting), Rancher then automatically deleted the existing worker node and created another one.
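To make the location synchronization concrete, here is a minimal sketch in Go of one way such a rule could look. It is an illustration under assumptions, not the driver's actual code: the function name `effectiveLocation` and its parameters are hypothetical, and the location strings are example values only.

```go
package main

import (
	"errors"
	"fmt"
)

// effectiveLocation decides where the IP block should be reserved.
// Hypothetical helper: it is NOT taken from the driver's source.
func effectiveLocation(flagLocation, datacenterID, datacenterLocation string) (string, error) {
	// No existing datacenter requested: the location flag is authoritative.
	if datacenterID == "" {
		if flagLocation == "" {
			return "", errors.New("no location flag and no datacenter to derive it from")
		}
		return flagLocation, nil
	}
	// An existing datacenter was requested: its location overrides the flag,
	// so the IP block and the datacenter can never end up in different places.
	if datacenterLocation == "" {
		return "", errors.New("existing datacenter has no known location")
	}
	return datacenterLocation, nil
}

func main() {
	// Example values only; the flag deliberately disagrees with the datacenter.
	loc, err := effectiveLocation("us/las", "0b76ed72-f850-4eec-86ce-212b3cce2144", "de/txl")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("reserve IP block in:", loc) // prints "reserve IP block in: de/txl"
}
```

The design choice in this sketch is that a pre-existing datacenter always wins over the flag, which removes the mismatch that led to the failed creation and the subsequent auto-replace.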

I created a rancher cluster with v5.0.2 following the steps described here with an existing datacenter. If the cluster is Active and manually removed, the datacenter is not deleted.

To investigate more, please provide more details:

benschmi commented 3 years ago

Node Template settings:

Screenshot 2021-03-31 at 09 34 05

I assume an error occurs during the buildup of the cluster. But instead of deleting the individual component (master or worker) the whole data center is deleted. Rancher should NEVER be able to delete a data center. The data center ID is provided with the node template so it is preexisting. Therefore there is absolutely no use case for Rancher to ever delete the data center.
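The rule argued for above (never delete a datacenter that was supplied through the node template) can be sketched as an ownership flag checked during removal. This is a minimal illustration in Go under stated assumptions: `cloudAPI`, `fakeAPI`, `Driver`, and the `DatacenterCreated` field are hypothetical names invented for this sketch and are not the driver's real types or the IONOS SDK.

```go
package main

import "fmt"

// cloudAPI is a hypothetical, minimal stand-in for the provider client.
type cloudAPI interface {
	DeleteServer(datacenterID, serverID string) error
	DeleteDatacenter(datacenterID string) error
}

// fakeAPI records the calls it receives so the behaviour can be inspected.
type fakeAPI struct{ calls []string }

func (f *fakeAPI) DeleteServer(datacenterID, serverID string) error {
	f.calls = append(f.calls, "server:"+serverID)
	return nil
}

func (f *fakeAPI) DeleteDatacenter(datacenterID string) error {
	f.calls = append(f.calls, "datacenter:"+datacenterID)
	return nil
}

// Driver holds only the fields needed for this illustration.
type Driver struct {
	DatacenterID      string
	ServerID          string
	DatacenterCreated bool // true only if this driver created the datacenter itself
	api               cloudAPI
}

// Remove deletes node-level resources and touches the datacenter only when
// the driver created it; a pre-existing datacenter is left alone.
func (d *Driver) Remove() error {
	var errs []error
	if err := d.api.DeleteServer(d.DatacenterID, d.ServerID); err != nil {
		errs = append(errs, fmt.Errorf("error deleting server: %w", err))
	}
	if d.DatacenterCreated {
		if err := d.api.DeleteDatacenter(d.DatacenterID); err != nil {
			errs = append(errs, fmt.Errorf("error deleting datacenter: %w", err))
		}
	}
	if len(errs) == 0 {
		return nil
	}
	return fmt.Errorf("%d errors occurred: %v", len(errs), errs)
}

func main() {
	api := &fakeAPI{}
	// The datacenter ID came from the node template, so DatacenterCreated stays false.
	d := &Driver{DatacenterID: "existing-dc", ServerID: "worker1", api: api}
	if err := d.Remove(); err != nil {
		fmt.Println("remove failed:", err)
		return
	}
	fmt.Println(api.calls) // prints "[server:worker1]"
}
```

With a guard like this, a failed node that gets auto-replaced can only clean up its own server, volume, NIC, and similar resources, regardless of which errors occurred during creation.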

ana-moga commented 3 years ago

I was able to create an Active rancher cluster with the settings above - in the existing datacenter. Does the cluster have multiple nodes?

Please provide the Docker container logs so that I can investigate.

benschmi commented 3 years ago

2021/03/30 12:56:54 [INFO] Download https://github.com/ionos-cloud/docker-machine-driver/releases/download/v5.0.2/docker-machine-driver-5.0.2-linux-386.tar.gz
...
2021/03/30 13:05:15 [INFO] Provisioning node worker1
2021/03/30 13:05:15 [INFO] Provisioning node master1
2021/03/30 13:05:15 [INFO] [node-controller-rancher-machine] Creating CA: /management-state/node/nodes/worker1/certs/ca.pem
2021/03/30 13:05:15 [INFO] [node-controller-rancher-machine] Creating CA: /management-state/node/nodes/master1/certs/ca.pem
2021/03/30 13:05:15 [INFO] [node-controller-rancher-machine] Creating client certificate: /management-state/node/nodes/master1/certs/cert.pem
2021/03/30 13:05:15 [INFO] [node-controller-rancher-machine] Creating client certificate: /management-state/node/nodes/worker1/certs/cert.pem
2021/03/30 13:05:15 [INFO] [node-controller-rancher-machine] Running pre-create checks...
2021/03/30 13:05:15 [INFO] [node-controller-rancher-machine] Running pre-create checks...
2021/03/30 13:05:16 [INFO] [node-controller-rancher-machine] (worker1) Creating machine under rancher node driver test datacenter
2021/03/30 13:05:16 [INFO] [node-controller-rancher-machine] Creating machine...
2021/03/30 13:05:16 [INFO] [node-controller-rancher-machine] (worker1) Creating SSH key...
2021/03/30 13:05:16 [INFO] [node-controller-rancher-machine] (master1) Creating machine under rancher node driver test datacenter
2021/03/30 13:05:16 [INFO] [node-controller-rancher-machine] Creating machine...
2021/03/30 13:05:16 [INFO] [node-controller-rancher-machine] (master1) Creating SSH key...
2021/03/30 13:05:18 [INFO] [node-controller-rancher-machine] (master1) LAN Created
2021/03/30 13:05:20 [INFO] [node-controller-rancher-machine] The default lines below are for a sh/bash shell, you can specify the shell you're using, with the --shell flag.
2021/03/30 13:05:20 [INFO] [node-controller-rancher-machine]
2021/03/30 13:05:20 [INFO] Generating and uploading node config worker1
2021/03/30 13:05:20 [ERROR] [node] enqueing node pool c-trklh:np-4pnlk
2021/03/30 13:05:20 [ERROR] error syncing 'c-trklh/m-lf986': handler node-controller: Error creating machine: Error in driver during machine creation: error creating ipblock: Post "https://api.ionos.com/cloudapi/v5/ipblocks?depth=10": EOF, requeuing
2021/03/30 13:05:20 [INFO] Creating jail for c-trklh
2021/03/30 13:05:20 [INFO] Generating and uploading node config
2021/03/30 13:05:28 [INFO] [node-controller-rancher-machine] (master1) Server Created
2021/03/30 13:05:35 [ERROR] [node] enqueing node pool c-trklh:np-4pnlk
2021/03/30 13:05:38 [INFO] [node-controller-rancher-machine] (master1) Volume Attached to Server
2021/03/30 13:07:20 [INFO] [node-controller-rancher-machine] (master1) NIC Attached to Server
2021/03/30 13:07:20 [INFO] Removing node worker1
2021/03/30 13:07:21 [INFO] [node-controller-rancher-machine] About to remove worker1
2021/03/30 13:07:21 [INFO] [node-controller-rancher-machine] WARNING: This action will delete both local reference and remote instance.
2021/03/30 13:07:21 [INFO] [node-controller-rancher-machine] (worker1) Starting deleting resources...
2021/03/30 13:07:21 [INFO] Creating jail for c-trklh
2021/03/30 13:07:21 [INFO] Provisioning node worker2
2021/03/30 13:07:21 [INFO] [node-controller-rancher-machine] Creating CA: /management-state/node/nodes/worker2/certs/ca.pem
2021/03/30 13:07:21 [INFO] [node-controller-rancher-machine] Creating client certificate: /management-state/node/nodes/worker2/certs/cert.pem
2021/03/30 13:07:22 [INFO] [node-controller-rancher-machine] Running pre-create checks...
2021/03/30 13:07:23 [INFO] [node-controller-rancher-machine] (worker2) Creating machine under rancher node driver test datacenter
2021/03/30 13:07:23 [INFO] [node-controller-rancher-machine] Creating machine...
2021/03/30 13:07:23 [INFO] [node-controller-rancher-machine] (worker2) Creating SSH key...
2021/03/30 13:07:26 [INFO] [node-controller-rancher-machine] The default lines below are for a sh/bash shell, you can specify the shell you're using, with the --shell flag.
2021/03/30 13:07:26 [INFO] [node-controller-rancher-machine]
2021/03/30 13:07:26 [INFO] Generating and uploading node config worker2
2021/03/30 13:07:26 [ERROR] [node] enqueing node pool c-trklh:np-4pnlk
2021/03/30 13:07:26 [ERROR] error syncing 'c-trklh/m-tmj85': handler node-controller: Error creating machine: Error in driver during machine creation: error creating ipblock: Post "https://api.ionos.com/cloudapi/v5/ipblocks?depth=10": EOF, requeuing
2021/03/30 13:07:41 [INFO] [node-controller-rancher-machine] (master1) 85.215.232.38
2021/03/30 13:07:41 [INFO] [node-controller-rancher-machine] Waiting for machine to be running, this may take a few minutes...
2021/03/30 13:08:03 [INFO] [node-controller-rancher-machine] (worker1) DataCenter Deleted
2021/03/30 13:08:03 [INFO] [node-controller-rancher-machine] (worker1) IPBlock Deleted
2021/03/30 13:08:03 [INFO] [node-controller-rancher-machine] Successfully removed worker1
2021/03/30 13:08:03 [WARNING] [node-controller-rancher-machine] Error removing host "worker1": 4 errors occurred:
2021/03/30 13:08:03 [WARNING] [node-controller-rancher-machine]         * error deleting NIC: 404 Not Found
2021/03/30 13:08:03 [WARNING] [node-controller-rancher-machine]         * error deleting volume: 405 Method Not Allowed
2021/03/30 13:08:03 [WARNING] [node-controller-rancher-machine]         * error deleting server: Delete "https://api.ionos.com/cloudapi/v5/datacenters/0b76ed72-f850-4eec-86ce-212b3cce2144/servers/?depth=10": EOF
2021/03/30 13:08:03 [WARNING] [node-controller-rancher-machine]         * error deleting LAN: 405 Method Not Allowed
2021/03/30 13:08:03 [WARNING] [node-controller-rancher-machine]
2021/03/30 13:08:03 [WARNING] [node-controller-rancher-machine]
2021/03/30 13:08:03 [INFO] Removing node worker1 done
2021/03/30 13:10:03 [INFO] Removing node worker2
2021/03/30 13:10:03 [INFO] [node-controller-rancher-machine] About to remove worker2
2021/03/30 13:10:03 [INFO] [node-controller-rancher-machine] WARNING: This action will delete both local reference and remote instance.
2021/03/30 13:10:03 [INFO] [node-controller-rancher-machine] (worker2) Starting deleting resources...
2021/03/30 13:10:03 [INFO] Creating jail for c-trklh
2021/03/30 13:10:03 [INFO] Provisioning node worker1
2021/03/30 13:10:03 [INFO] [node-controller-rancher-machine] Creating CA: /management-state/node/nodes/worker1/certs/ca.pem
2021/03/30 13:10:03 [INFO] [node-controller-rancher-machine] Creating client certificate: /management-state/node/nodes/worker1/certs/cert.pem
2021/03/30 13:10:03 [INFO] [node-controller-rancher-machine] Running pre-create checks...
2021/03/30 13:10:04 [INFO] [node-controller-rancher-machine] (worker2) IPBlock Deleted
2021/03/30 13:10:04 [INFO] [node-controller-rancher-machine] Successfully removed worker2
2021/03/30 13:10:04 [WARNING] [node-controller-rancher-machine] Error removing host "worker2": 5 errors occurred:
2021/03/30 13:10:04 [WARNING] [node-controller-rancher-machine]         * error deleting NIC: 404 Not Found
2021/03/30 13:10:04 [WARNING] [node-controller-rancher-machine]         * error deleting volume: 405 Method Not Allowed
2021/03/30 13:10:04 [WARNING] [node-controller-rancher-machine]         * error deleting server: 405 Method Not Allowed
2021/03/30 13:10:04 [WARNING] [node-controller-rancher-machine]         * error deleting LAN: 405 Method Not Allowed
2021/03/30 13:10:04 [WARNING] [node-controller-rancher-machine]         * error deleting datacenter: 404 Not Found
2021/03/30 13:10:04 [WARNING] [node-controller-rancher-machine]
2021/03/30 13:10:04 [WARNING] [node-controller-rancher-machine]
2021/03/30 13:10:04 [INFO] Removing node worker2 done
2021/03/30 13:10:06 [INFO] [node-controller-rancher-machine] The default lines below are for a sh/bash shell, you can specify the shell you're using, with the --shell flag.
2021/03/30 13:10:06 [INFO] [node-controller-rancher-machine]
2021/03/30 13:10:06 [INFO] Generating and uploading node config worker1
2021/03/30 13:10:21 [ERROR] [node] enqueing node pool c-trklh:np-4pnlk
2021/03/30 13:10:21 [ERROR] error syncing 'c-trklh/m-fqqvt': handler node-controller: Error with pre-create check: "error getting datacenter: 404 Not Found", requeuing
2021/03/30 13:10:49 [INFO] [node-controller-rancher-machine] The default lines below are for a sh/bash shell, you can specify the shell you're using, with the --shell flag.
2021/03/30 13:10:49 [INFO] [node-controller-rancher-machine]
2021/03/30 13:10:49 [INFO] Generating and uploading node config master1
2021/03/30 13:10:49 [ERROR] [node] enqueing node pool c-trklh:np-jpl42
2021/03/30 13:10:49 [ERROR] error syncing 'c-trklh/m-wjrzd': handler node-controller: Error creating machine: Error waiting for machine to be running: Maximum number of retries (60) exceeded, requeuing
2021/03/30 13:12:21 [INFO] Creating jail for c-trklh
2021/03/30 13:12:21 [INFO] Provisioning node worker2
2021/03/30 13:12:21 [INFO] [node-controller-rancher-machine] Creating CA: /management-state/node/nodes/worker2/certs/ca.pem
2021/03/30 13:12:21 [INFO] [node-controller-rancher-machine] Creating client certificate: /management-state/node/nodes/worker2/certs/cert.pem
2021/03/30 13:12:22 [INFO] [node-controller-rancher-machine] Running pre-create checks...
2021/03/30 13:12:24 [INFO] [node-controller-rancher-machine] The default lines below are for a sh/bash shell, you can specify the shell you're using, with the --shell flag.
2021/03/30 13:12:24 [INFO] [node-controller-rancher-machine]
2021/03/30 13:12:24 [INFO] Generating and uploading node config worker2
2021/03/30 13:12:39 [ERROR] [node] enqueing node pool c-trklh:np-4pnlk
2021/03/30 13:12:39 [ERROR] error syncing 'c-trklh/m-zx8l4': handler node-controller: Error with pre-create check: "error getting datacenter: 404 Not Found", requeuing
2021/03/30 13:14:21 [INFO] Removing node master1
2021/03/30 13:14:21 [INFO] [node-controller-rancher-machine] About to remove master1
2021/03/30 13:14:21 [INFO] [node-controller-rancher-machine] WARNING: This action will delete both local reference and remote instance.
2021/03/30 13:14:21 [INFO] [node-controller-rancher-machine] (master1) Starting deleting resources...
2021/03/30 13:14:21 [INFO] Creating jail for c-trklh
2021/03/30 13:14:21 [INFO] Provisioning node master2
2021/03/30 13:14:21 [INFO] [node-controller-rancher-machine] Creating CA: /management-state/node/nodes/master2/certs/ca.pem
2021/03/30 13:14:21 [INFO] [node-controller-rancher-machine] Creating client certificate: /management-state/node/nodes/master2/certs/cert.pem
2021/03/30 13:14:21 [INFO] [node-controller-rancher-machine] Running pre-create checks...
2021/03/30 13:14:22 [INFO] [node-controller-rancher-machine] (master1) IPBlock Deleted
2021/03/30 13:14:22 [INFO] [node-controller-rancher-machine] Successfully removed master1
2021/03/30 13:14:22 [WARNING] [node-controller-rancher-machine] Error removing host "master1": 4 errors occurred:
2021/03/30 13:14:22 [WARNING] [node-controller-rancher-machine]         * error deleting NIC: 404 Not Found
2021/03/30 13:14:22 [WARNING] [node-controller-rancher-machine]         * error deleting volume: 404 Not Found
2021/03/30 13:14:22 [WARNING] [node-controller-rancher-machine]         * error deleting server: 404 Not Found
2021/03/30 13:14:22 [WARNING] [node-controller-rancher-machine]         * error deleting LAN: 404 Not Found
2021/03/30 13:14:22 [WARNING] [node-controller-rancher-machine]
2021/03/30 13:14:22 [WARNING] [node-controller-rancher-machine]
2021/03/30 13:14:22 [INFO] Removing node master1 done
2021/03/30 13:14:24 [INFO] [node-controller-rancher-machine] The default lines below are for a sh/bash shell, you can specify the shell you're using, with the --shell flag.
2021/03/30 13:14:24 [INFO] [node-controller-rancher-machine]
2021/03/30 13:14:24 [INFO] Generating and uploading node config master2
2021/03/30 13:14:39 [ERROR] [node] enqueing node pool c-trklh:np-jpl42
2021/03/30 13:14:39 [ERROR] error syncing 'c-trklh/m-n6ns6': handler node-controller: Error with pre-create check: "error getting datacenter: 404 Not Found", requeuing

ana-moga commented 3 years ago

Just released 5.0.3 - which includes changes so Rancher does not delete an existing Data Center. Thanks for pointing this out.

benschmi commented 3 years ago

Tested successfully. 5.0.3 fixes the issue. Thank you very much! :)