Closed — sanjaysrikakulam closed this 1 year ago
Q: we do have a GPU image with prebuilt CUDA, don't we?
```
Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
-/+ destroy and then create replacement

Terraform will perform the following actions:

  # openstack_compute_instance_v2.dilmurat must be replaced
-/+ resource "openstack_compute_instance_v2" "dilmurat" {
      ~ access_ip_v4      = "192.52.42.241" -> (known after apply)
      + access_ip_v6      = (known after apply)
      ~ all_metadata      = {} -> (known after apply)
      ~ all_tags          = [] -> (known after apply)
      ~ availability_zone = "nova" -> (known after apply)
      ~ flavor_id         = "14e797a2-3146-42ab-aa78-218e247cad7a" -> (known after apply)
      ~ flavor_name       = "g1.c8m20g1" -> "g1.c8m20g1d50"
      ~ id                = "980400a5-7d12-4da8-b7aa-46a259707a5f" -> (known after apply)
      ~ image_id          = "5f3fc2b3-0803-44cc-abe5-40335a6e6bd6" -> "f5b82cb0-03b4-44f0-8ce5-33f15c53f89b" # forces replacement
      ~ image_name        = "vggp-gpu-v60-j310-1fad751e0150-main" -> (known after apply)
        name              = "dilmurat dedicated VM"
      ~ region            = "Freiburg" -> (known after apply)
      - tags              = [] -> null
        # (6 unchanged attributes hidden)

      ~ network {
          ~ fixed_ip_v4 = "192.52.42.241" -> (known after apply)
          + fixed_ip_v6 = (known after apply)
          + floating_ip = (known after apply)
          ~ mac         = "fa:16:3e:57:9a:a4" -> (known after apply)
            name        = "public"
          + port        = (known after apply)
          ~ uuid        = "60775850-0c04-4a6d-b607-ad1d75ee2900" -> (known after apply)
            # (1 unchanged attribute hidden)
        }
    }

  # openstack_compute_volume_attach_v2.dilmurat-va must be replaced
-/+ resource "openstack_compute_volume_attach_v2" "dilmurat-va" {
      ~ device      = "/dev/vdb" -> (known after apply)
      ~ id          = "980400a5-7d12-4da8-b7aa-46a259707a5f/981a98dc-cc05-4ed9-9a2b-e3018ccd627d" -> (known after apply)
      ~ instance_id = "980400a5-7d12-4da8-b7aa-46a259707a5f" -> (known after apply) # forces replacement
      ~ region      = "Freiburg" -> (known after apply)
        # (1 unchanged attribute hidden)
    }

Plan: 2 to add, 0 to change, 2 to destroy.

─────────────────────────────────────────────────────────────────────────────

Saved the plan to: tf.plan

To perform exactly these actions, run the following command to apply:
    terraform apply "tf.plan"
```
> Q: we do have a GPU image with prebuilt CUDA, don't we?
The VGCN repo shows that the CUDA installation is turned off: https://github.com/usegalaxy-eu/vgcn/blob/00829423b35b5da3b7fbe4d49dfacea85d554d57/ansible-roles/group_vars/gpu.yml#L2-L5
Not sure why, though. So we inject the installation through cloud-init in our TF files.
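For illustration, the cloud-init injection can be sketched roughly like this (a hypothetical fragment using the flavor, image, and network values from the plan above; the actual install commands in our TF files may differ):

```hcl
# Sketch: pass a cloud-init user_data script to the instance so the
# NVIDIA driver and CUDA are installed on first boot, since the image
# itself ships with that installation disabled.
resource "openstack_compute_instance_v2" "dilmurat" {
  name        = "dilmurat dedicated VM"
  flavor_name = "g1.c8m20g1d50"
  image_id    = "f5b82cb0-03b4-44f0-8ce5-33f15c53f89b"

  user_data = <<-EOF
    #cloud-config
    runcmd:
      # Hypothetical install steps; the real ones live in our TF files.
      - dnf install -y nvidia-driver cuda
  EOF

  network {
    name = "public"
  }
}
```

Because `user_data` only runs on first boot, changing it also forces the instance to be replaced, which fits the destroy-and-recreate plan shown above.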
The previous image has issues with installing CUDA and the NVIDIA drivers due to a broken CUDA repo GPG key. Manual debugging and installation attempts do not fix this because of the lack of storage on the root disk, so this PR updates the flavor along with the GPU image (using the same image as the one we use for our worker nodes).