Closed ando47 closed 3 years ago
Hello,
Thank you for opening this issue. The bug has been fixed in the latest version 5.0.2.
Not fixed in 5.0.2. Same behaviour as described by @ando47. Looking at the commits for this version, I don't see how this could have been fixed. Please describe.
The commit "Fix bug: Data Center deleted by the driver" is included in 5.0.2. It synchronizes the location of the datacenter with the location of the IP block. Previously, if the two locations were not synced from the flags, machine creation failed with an error, and with auto-replace enabled on the nodes (a Rancher setting), Rancher automatically deleted the existing worker node and created another one.
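To illustrate the idea of syncing the two locations, here is a minimal sketch and not the driver's actual code. The helper `resolveLocation` and the example location strings are assumptions made for illustration: when an existing datacenter is supplied, its location takes precedence over the location flag, so the IP block is requested in the same location as the datacenter.

```go
package main

import "fmt"

// resolveLocation is a hypothetical helper, not taken from the driver.
// If an existing datacenter is used, its location wins over the flag value,
// so the IP block is always requested where the datacenter lives.
func resolveLocation(flagLocation, datacenterLocation string) string {
	if datacenterLocation != "" {
		return datacenterLocation
	}
	// No existing datacenter: fall back to the location given by the flag.
	return flagLocation
}

func main() {
	// Example values only: the flag and the existing datacenter disagree.
	fmt.Println(resolveLocation("us/las", "de/fra"))
	// No existing datacenter was supplied.
	fmt.Println(resolveLocation("us/las", ""))
}
```

With a mismatch between flag and datacenter, the sketch picks the datacenter's location, which is the situation described above as previously ending in an error.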
I created a Rancher cluster with v5.0.2 following the steps described here with an existing datacenter. If the cluster is Active and is then manually removed, the datacenter is not deleted.
To investigate more, please provide more details:
Node Template settings:
I assume an error occurs during the buildup of the cluster. But instead of only the individual component (master or worker) being deleted, the whole data center is deleted. Rancher should NEVER be able to delete a data center. The data center ID is provided with the node template, so the data center is preexisting. Therefore there is absolutely no use case for Rancher to ever delete it.
I was able to create an Active Rancher cluster with the settings above, in the existing datacenter. Does the cluster have multiple nodes?
Please provide the docker container logs to be able to investigate.
2021/03/30 12:56:54 [INFO] Download https://github.com/ionos-cloud/docker-machine-driver/releases/download/v5.0.2/docker-machine-driver-5.0.2-linux-386.tar.gz
...
2021/03/30 13:05:15 [INFO] Provisioning node worker1
2021/03/30 13:05:15 [INFO] Provisioning node master1
2021/03/30 13:05:15 [INFO] [node-controller-rancher-machine] Creating CA: /management-state/node/nodes/worker1/certs/ca.pem
2021/03/30 13:05:15 [INFO] [node-controller-rancher-machine] Creating CA: /management-state/node/nodes/master1/certs/ca.pem
2021/03/30 13:05:15 [INFO] [node-controller-rancher-machine] Creating client certificate: /management-state/node/nodes/master1/certs/cert.pem
2021/03/30 13:05:15 [INFO] [node-controller-rancher-machine] Creating client certificate: /management-state/node/nodes/worker1/certs/cert.pem
2021/03/30 13:05:15 [INFO] [node-controller-rancher-machine] Running pre-create checks...
2021/03/30 13:05:15 [INFO] [node-controller-rancher-machine] Running pre-create checks...
2021/03/30 13:05:16 [INFO] [node-controller-rancher-machine] (worker1) Creating machine under rancher node driver test datacenter
2021/03/30 13:05:16 [INFO] [node-controller-rancher-machine] Creating machine...
2021/03/30 13:05:16 [INFO] [node-controller-rancher-machine] (worker1) Creating SSH key...
2021/03/30 13:05:16 [INFO] [node-controller-rancher-machine] (master1) Creating machine under rancher node driver test datacenter
2021/03/30 13:05:16 [INFO] [node-controller-rancher-machine] Creating machine...
2021/03/30 13:05:16 [INFO] [node-controller-rancher-machine] (master1) Creating SSH key...
2021/03/30 13:05:18 [INFO] [node-controller-rancher-machine] (master1) LAN Created
2021/03/30 13:05:20 [INFO] [node-controller-rancher-machine] The default lines below are for a sh/bash shell, you can specify the shell you're using, with the --shell flag.
2021/03/30 13:05:20 [INFO] [node-controller-rancher-machine]
2021/03/30 13:05:20 [INFO] Generating and uploading node config worker1
2021/03/30 13:05:20 [ERROR] [node] enqueing node pool c-trklh:np-4pnlk
2021/03/30 13:05:20 [ERROR] error syncing 'c-trklh/m-lf986': handler node-controller: Error creating machine: Error in driver during machine creation: error creating ipblock: Post "https://api.ionos.com/cloudapi/v5/ipblocks?depth=10": EOF, requeuing
2021/03/30 13:05:20 [INFO] Creating jail for c-trklh
2021/03/30 13:05:20 [INFO] Generating and uploading node config
2021/03/30 13:05:28 [INFO] [node-controller-rancher-machine] (master1) Server Created
2021/03/30 13:05:35 [ERROR] [node] enqueing node pool c-trklh:np-4pnlk
2021/03/30 13:05:38 [INFO] [node-controller-rancher-machine] (master1) Volume Attached to Server
2021/03/30 13:07:20 [INFO] [node-controller-rancher-machine] (master1) NIC Attached to Server
2021/03/30 13:07:20 [INFO] Removing node worker1
2021/03/30 13:07:21 [INFO] [node-controller-rancher-machine] About to remove worker1
2021/03/30 13:07:21 [INFO] [node-controller-rancher-machine] WARNING: This action will delete both local reference and remote instance.
2021/03/30 13:07:21 [INFO] [node-controller-rancher-machine] (worker1) Starting deleting resources...
2021/03/30 13:07:21 [INFO] Creating jail for c-trklh
2021/03/30 13:07:21 [INFO] Provisioning node worker2
2021/03/30 13:07:21 [INFO] [node-controller-rancher-machine] Creating CA: /management-state/node/nodes/worker2/certs/ca.pem
2021/03/30 13:07:21 [INFO] [node-controller-rancher-machine] Creating client certificate: /management-state/node/nodes/worker2/certs/cert.pem
2021/03/30 13:07:22 [INFO] [node-controller-rancher-machine] Running pre-create checks...
2021/03/30 13:07:23 [INFO] [node-controller-rancher-machine] (worker2) Creating machine under rancher node driver test datacenter
2021/03/30 13:07:23 [INFO] [node-controller-rancher-machine] Creating machine...
2021/03/30 13:07:23 [INFO] [node-controller-rancher-machine] (worker2) Creating SSH key...
2021/03/30 13:07:26 [INFO] [node-controller-rancher-machine] The default lines below are for a sh/bash shell, you can specify the shell you're using, with the --shell flag.
2021/03/30 13:07:26 [INFO] [node-controller-rancher-machine]
2021/03/30 13:07:26 [INFO] Generating and uploading node config worker2
2021/03/30 13:07:26 [ERROR] [node] enqueing node pool c-trklh:np-4pnlk
2021/03/30 13:07:26 [ERROR] error syncing 'c-trklh/m-tmj85': handler node-controller: Error creating machine: Error in driver during machine creation: error creating ipblock: Post "https://api.ionos.com/cloudapi/v5/ipblocks?depth=10": EOF, requeuing
2021/03/30 13:07:41 [INFO] [node-controller-rancher-machine] (master1) 85.215.232.38
2021/03/30 13:07:41 [INFO] [node-controller-rancher-machine] Waiting for machine to be running, this may take a few minutes...
2021/03/30 13:08:03 [INFO] [node-controller-rancher-machine] (worker1) DataCenter Deleted
2021/03/30 13:08:03 [INFO] [node-controller-rancher-machine] (worker1) IPBlock Deleted
2021/03/30 13:08:03 [INFO] [node-controller-rancher-machine] Successfully removed worker1
2021/03/30 13:08:03 [WARNING] [node-controller-rancher-machine] Error removing host "worker1": 4 errors occurred:
2021/03/30 13:08:03 [WARNING] [node-controller-rancher-machine] * error deleting NIC: 404 Not Found
2021/03/30 13:08:03 [WARNING] [node-controller-rancher-machine] * error deleting volume: 405 Method Not Allowed
2021/03/30 13:08:03 [WARNING] [node-controller-rancher-machine] * error deleting server: Delete "https://api.ionos.com/cloudapi/v5/datacenters/0b76ed72-f850-4eec-86ce-212b3cce2144/servers/?depth=10": EOF
2021/03/30 13:08:03 [WARNING] [node-controller-rancher-machine] * error deleting LAN: 405 Method Not Allowed
2021/03/30 13:08:03 [WARNING] [node-controller-rancher-machine]
2021/03/30 13:08:03 [WARNING] [node-controller-rancher-machine]
2021/03/30 13:08:03 [INFO] Removing node worker1 done
2021/03/30 13:10:03 [INFO] Removing node worker2
2021/03/30 13:10:03 [INFO] [node-controller-rancher-machine] About to remove worker2
2021/03/30 13:10:03 [INFO] [node-controller-rancher-machine] WARNING: This action will delete both local reference and remote instance.
2021/03/30 13:10:03 [INFO] [node-controller-rancher-machine] (worker2) Starting deleting resources...
2021/03/30 13:10:03 [INFO] Creating jail for c-trklh
2021/03/30 13:10:03 [INFO] Provisioning node worker1
2021/03/30 13:10:03 [INFO] [node-controller-rancher-machine] Creating CA: /management-state/node/nodes/worker1/certs/ca.pem
2021/03/30 13:10:03 [INFO] [node-controller-rancher-machine] Creating client certificate: /management-state/node/nodes/worker1/certs/cert.pem
2021/03/30 13:10:03 [INFO] [node-controller-rancher-machine] Running pre-create checks...
2021/03/30 13:10:04 [INFO] [node-controller-rancher-machine] (worker2) IPBlock Deleted
2021/03/30 13:10:04 [INFO] [node-controller-rancher-machine] Successfully removed worker2
2021/03/30 13:10:04 [WARNING] [node-controller-rancher-machine] Error removing host "worker2": 5 errors occurred:
2021/03/30 13:10:04 [WARNING] [node-controller-rancher-machine] * error deleting NIC: 404 Not Found
2021/03/30 13:10:04 [WARNING] [node-controller-rancher-machine] * error deleting volume: 405 Method Not Allowed
2021/03/30 13:10:04 [WARNING] [node-controller-rancher-machine] * error deleting server: 405 Method Not Allowed
2021/03/30 13:10:04 [WARNING] [node-controller-rancher-machine] * error deleting LAN: 405 Method Not Allowed
2021/03/30 13:10:04 [WARNING] [node-controller-rancher-machine] * error deleting datacenter: 404 Not Found
2021/03/30 13:10:04 [WARNING] [node-controller-rancher-machine]
2021/03/30 13:10:04 [WARNING] [node-controller-rancher-machine]
2021/03/30 13:10:04 [INFO] Removing node worker2 done
2021/03/30 13:10:06 [INFO] [node-controller-rancher-machine] The default lines below are for a sh/bash shell, you can specify the shell you're using, with the --shell flag.
2021/03/30 13:10:06 [INFO] [node-controller-rancher-machine]
2021/03/30 13:10:06 [INFO] Generating and uploading node config worker1
2021/03/30 13:10:21 [ERROR] [node] enqueing node pool c-trklh:np-4pnlk
2021/03/30 13:10:21 [ERROR] error syncing 'c-trklh/m-fqqvt': handler node-controller: Error with pre-create check: "error getting datacenter: 404 Not Found", requeuing
2021/03/30 13:10:49 [INFO] [node-controller-rancher-machine] The default lines below are for a sh/bash shell, you can specify the shell you're using, with the --shell flag.
2021/03/30 13:10:49 [INFO] [node-controller-rancher-machine]
2021/03/30 13:10:49 [INFO] Generating and uploading node config master1
2021/03/30 13:10:49 [ERROR] [node] enqueing node pool c-trklh:np-jpl42
2021/03/30 13:10:49 [ERROR] error syncing 'c-trklh/m-wjrzd': handler node-controller: Error creating machine: Error waiting for machine to be running: Maximum number of retries (60) exceeded, requeuing
2021/03/30 13:12:21 [INFO] Creating jail for c-trklh
2021/03/30 13:12:21 [INFO] Provisioning node worker2
2021/03/30 13:12:21 [INFO] [node-controller-rancher-machine] Creating CA: /management-state/node/nodes/worker2/certs/ca.pem
2021/03/30 13:12:21 [INFO] [node-controller-rancher-machine] Creating client certificate: /management-state/node/nodes/worker2/certs/cert.pem
2021/03/30 13:12:22 [INFO] [node-controller-rancher-machine] Running pre-create checks...
2021/03/30 13:12:24 [INFO] [node-controller-rancher-machine] The default lines below are for a sh/bash shell, you can specify the shell you're using, with the --shell flag.
2021/03/30 13:12:24 [INFO] [node-controller-rancher-machine]
2021/03/30 13:12:24 [INFO] Generating and uploading node config worker2
2021/03/30 13:12:39 [ERROR] [node] enqueing node pool c-trklh:np-4pnlk
2021/03/30 13:12:39 [ERROR] error syncing 'c-trklh/m-zx8l4': handler node-controller: Error with pre-create check: "error getting datacenter: 404 Not Found", requeuing
2021/03/30 13:14:21 [INFO] Removing node master1
2021/03/30 13:14:21 [INFO] [node-controller-rancher-machine] About to remove master1
2021/03/30 13:14:21 [INFO] [node-controller-rancher-machine] WARNING: This action will delete both local reference and remote instance.
2021/03/30 13:14:21 [INFO] [node-controller-rancher-machine] (master1) Starting deleting resources...
2021/03/30 13:14:21 [INFO] Creating jail for c-trklh
2021/03/30 13:14:21 [INFO] Provisioning node master2
2021/03/30 13:14:21 [INFO] [node-controller-rancher-machine] Creating CA: /management-state/node/nodes/master2/certs/ca.pem
2021/03/30 13:14:21 [INFO] [node-controller-rancher-machine] Creating client certificate: /management-state/node/nodes/master2/certs/cert.pem
2021/03/30 13:14:21 [INFO] [node-controller-rancher-machine] Running pre-create checks...
2021/03/30 13:14:22 [INFO] [node-controller-rancher-machine] (master1) IPBlock Deleted
2021/03/30 13:14:22 [INFO] [node-controller-rancher-machine] Successfully removed master1
2021/03/30 13:14:22 [WARNING] [node-controller-rancher-machine] Error removing host "master1": 4 errors occurred:
2021/03/30 13:14:22 [WARNING] [node-controller-rancher-machine] * error deleting NIC: 404 Not Found
2021/03/30 13:14:22 [WARNING] [node-controller-rancher-machine] * error deleting volume: 404 Not Found
2021/03/30 13:14:22 [WARNING] [node-controller-rancher-machine] * error deleting server: 404 Not Found
2021/03/30 13:14:22 [WARNING] [node-controller-rancher-machine] * error deleting LAN: 404 Not Found
2021/03/30 13:14:22 [WARNING] [node-controller-rancher-machine]
2021/03/30 13:14:22 [WARNING] [node-controller-rancher-machine]
2021/03/30 13:14:22 [INFO] Removing node master1 done
2021/03/30 13:14:24 [INFO] [node-controller-rancher-machine] The default lines below are for a sh/bash shell, you can specify the shell you're using, with the --shell flag.
2021/03/30 13:14:24 [INFO] [node-controller-rancher-machine]
2021/03/30 13:14:24 [INFO] Generating and uploading node config master2
2021/03/30 13:14:39 [ERROR] [node] enqueing node pool c-trklh:np-jpl42
2021/03/30 13:14:39 [ERROR] error syncing 'c-trklh/m-n6ns6': handler node-controller: Error with pre-create check: "error getting datacenter: 404 Not Found", requeuing
Just released 5.0.3, which includes changes so that Rancher does not delete an existing Data Center. Thanks for pointing this out.
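One common way to get this behaviour is an ownership guard: the driver remembers whether it created the datacenter itself and only deletes it in that case. The following is a minimal sketch of that pattern, not the actual 5.0.3 change; the `Driver` type, its fields, `NewDriver`, and `ShouldDeleteDatacenter` are all hypothetical names, and no real API calls are made.

```go
package main

import "fmt"

// Driver is a hypothetical, stripped-down stand-in for a node driver's state.
type Driver struct {
	DatacenterID      string
	CreatedDatacenter bool // true only if this driver created the datacenter
}

// NewDriver records whether a datacenter ID was supplied up front
// (for example through the node template) or had to be created.
func NewDriver(existingID string) *Driver {
	if existingID != "" {
		// Preexisting datacenter: the driver does not own it.
		return &Driver{DatacenterID: existingID}
	}
	// Placeholder for a real "create datacenter" call.
	return &Driver{DatacenterID: "generated-id", CreatedDatacenter: true}
}

// ShouldDeleteDatacenter reports whether cleanup may remove the datacenter.
func (d *Driver) ShouldDeleteDatacenter() bool {
	return d.CreatedDatacenter
}

func main() {
	existing := NewDriver("0b76ed72-f850-4eec-86ce-212b3cce2144")
	fmt.Println(existing.ShouldDeleteDatacenter()) // preexisting: keep it

	created := NewDriver("")
	fmt.Println(created.ShouldDeleteDatacenter()) // driver-created: may delete
}
```

In a real driver the ownership flag would also have to survive between the create and remove steps, so it would need to be stored with the rest of the machine's saved state.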
Tested successfully. 5.0.3 fixes the issue. Thank you very much! :)
Description
I have been following the guide https://rancher.com/docs/rancher/v2.x/en/quick-start-guide/deployment/quickstart-manual-setup/. I added the rancher driver as a node driver. I created a node template which contained the datacenterID of an existing datacenter and afterwards created a cluster. Once I began the process, the datacenter was deleted.
Expected behavior
Rancher worker nodes should have been created in the datacenter specified in the node template.
How to Reproduce