ComputeCanada / magic_castle

Terraform modules to replicate the HPC user experience in the cloud

Extra network interface for a Cephfs setup #248

Open poquirion opened 1 year ago

poquirion commented 1 year ago

For performance/traffic optimization, we will have a CephFS deployment on a separate network in OpenStack. This means that a second network will need to be attached to the login, compute and dtn nodes. I guess that this part all needs to be done in this repo, not the puppet one.

I envision this as a config that would look like

subnet_id = {
main = <id>
secondary = <id>
}

or

subnet_id =  <id>
secondary_subnet_id =  <id>

or something else....

There is less refactoring in the second case, but the first config looks better to me.
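As a rough sketch of the first option, the variable could be declared as an object with an optional second subnet. This is only an assumption about how it might be wired, not what the module does today; the attribute names `main` and `secondary` come from the proposal above, and `optional(...)` requires Terraform 1.3 or later.

```hcl
# Hypothetical declaration for the map-style option; not part of the current module.
variable "subnet_id" {
  description = "Subnets to attach to the instances: main network plus an optional secondary (e.g. CephFS) network"
  type = object({
    main      = string
    secondary = optional(string) # null when no second network is requested
  })
}
```

With this shape, existing code would read `var.subnet_id.main`, and the extra interface would only be created when `var.subnet_id.secondary` is not null.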

poquirion commented 1 year ago

Once that is done, the puppet part of mounting cephfs will be the same, since selecting the right internal IP address will make the cephfs client use the right interface.

cmd-ntrf commented 6 months ago

A potential solution for this issue would be to drop in the openstack folder a Terraform file that contains the following:

resource "openstack_networking_port_v2" "nic_ceph" {
  for_each           = module.design.instances
  name               = format("%s-%s-ceph-port", var.cluster_name, each.key)
  network_id         = "00b327b4-4fb2-4ed8-a7f2-6ff49e3b7e7c"
  security_group_ids = concat(
    [
      openstack_networking_secgroup_v2.global.id
    ],
    [
      for tag, value in openstack_networking_secgroup_v2.external: value.id if contains(each.value.tags, tag)
    ]
  )
}

resource "openstack_compute_interface_attach_v2" "extra_network" {
  for_each    = module.design.instances_to_build
  instance_id = openstack_compute_instance_v2.instances[each.key].id
  port_id     = "openstack_networking_port_v2.nic_ceph[each.key].id"
}

The file has to be put next to the other openstack module files because the Magic Castle provider modules do not output the ids of the instances nor the ids of the security groups. Otherwise, you would have been able to create these resources outside of the openstack module.
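For illustration only, this is roughly what such outputs could look like if they were added to the openstack module. The output names are hypothetical; the resource addresses are the ones used in the snippet above.

```hcl
# Hypothetical outputs; the openstack module does not currently define them.
output "instance_ids" {
  description = "Map of instance name to OpenStack instance id"
  value       = { for key, instance in openstack_compute_instance_v2.instances : key => instance.id }
}

output "global_security_group_id" {
  description = "Id of the security group shared by all instances"
  value       = openstack_networking_secgroup_v2.global.id
}

output "external_security_group_ids" {
  description = "Map of tag to the id of the matching external security group"
  value       = { for tag, group in openstack_networking_secgroup_v2.external : tag => group.id }
}
```

With outputs like these, the port and interface-attach resources could live in the user's own `main.tf` and refer to `module.openstack.instance_ids[...]` instead of being dropped inside the module folder.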

poquirion commented 6 months ago

I added this:

     1  resource "openstack_networking_port_v2" "nic_ceph" {
     2    for_each           = module.design.instances
     3    name               = format("%s-%s-ceph-port", var.cluster_name, each.key)
     4    network_id         = "00b327b4-4fb2-4ed8-a7f2-6ff49e3b7e7c"
     5    security_group_ids = concat(
     6      [
     7        openstack_networking_secgroup_v2.global.id
     8      ],
     9      [
    10        for tag, value in openstack_networking_secgroup_v2.external: value.id if contains(each.value.tags, tag)
    11      ]
    12    )
    13  }
    14  
    15  resource "openstack_compute_interface_attach_v2" "extra_network" {
    16    for_each    = module.design.instances_to_build
    17    instance_id = openstack_compute_instance_v2.instances[each.key].id
    18    port_id     = "openstack_networking_port_v2.nic_ceph[each.key].id"
    19  }

and here is the error I get

╷
│ Error: Bad request with: [POST https://juno.calculquebec.ca:8774/v2.1/servers/d4bc3b5b-5c09-4cf7-9f49-41937a254cb6/os-interface], error message: {"badRequest": {"code": 400, "message": "Invalid input for field/attribute port_id. Value: openstack_networking_port_v2.nic_ceph[each.key].id. 'openstack_networking_port_v2.nic_ceph[each.key].id' is not a 'uuid'"}}
│ 
│   with module.openstack.openstack_compute_interface_attach_v2.extra_network["corbeau-1"],
│   on .terraform/modules/openstack/openstack/cephfs_net.tf line 15, in resource "openstack_compute_interface_attach_v2" "extra_network":
│   15: resource "openstack_compute_interface_attach_v2" "extra_network" {
│ 
╵
╷
│ Error: Bad request with: [POST https://juno.calculquebec.ca:8774/v2.1/servers/e2dbf3f9-02ef-4bd3-b45d-9a232b5de3da/os-interface], error message: {"badRequest": {"code": 400, "message": "Invalid input for field/attribute port_id. Value: openstack_networking_port_v2.nic_ceph[each.key].id. 'openstack_networking_port_v2.nic_ceph[each.key].id' is not a 'uuid'"}}
│ 
│   with module.openstack.openstack_compute_interface_attach_v2.extra_network["cc-1"],
│   on .terraform/modules/openstack/openstack/cephfs_net.tf line 15, in resource "openstack_compute_interface_attach_v2" "extra_network":
│   15: resource "openstack_compute_interface_attach_v2" "extra_network" {
│ 
╵
╷
│ Error: Bad request with: [POST https://juno.calculquebec.ca:8774/v2.1/servers/063df52a-722c-49f9-808c-c9eac5cc78c0/os-interface], error message: {"badRequest": {"code": 400, "message": "Invalid input for field/attribute port_id. Value: openstack_networking_port_v2.nic_ceph[each.key].id. 'openstack_networking_port_v2.nic_ceph[each.key].id' is not a 'uuid'"}}
│ 
│   with module.openstack.openstack_compute_interface_attach_v2.extra_network["cmgmt1"],
│   on .terraform/modules/openstack/openstack/cephfs_net.tf line 15, in resource "openstack_compute_interface_attach_v2" "extra_network":
│   15: resource "openstack_compute_interface_attach_v2" "extra_network" {
│ 
╵

mmm, I see the weirdly placed quotes now.
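For reference, the quotes around the `port_id` value make Terraform send the expression as a literal string, which is what the API rejects as "not a 'uuid'". A corrected version of the second resource, with everything else left as posted, would presumably be:

```hcl
resource "openstack_compute_interface_attach_v2" "extra_network" {
  for_each    = module.design.instances_to_build
  instance_id = openstack_compute_instance_v2.instances[each.key].id
  # Unquoted reference: Terraform resolves it to the id of the port created above
  port_id     = openstack_networking_port_v2.nic_ceph[each.key].id
}
```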