oxidecomputer / omicron

Omicron: Oxide control plane
Mozilla Public License 2.0

Could automatically apply switch-port settings to newly created links #6562

Open bnaecker opened 2 months ago

bnaecker commented 2 months ago

I'm attempting to create links on the london environment through the CLI for the first time. The process is a bit confusing, because one needs to both create a link and apply some switch-port settings to it before the link is actually created and enabled. It looks like the settings object gets created implicitly. Just doing:

bnaecker@flint : ~/file-cabinet/oxide/oxide.rs $ ./target/release/oxide --profile london system networking link add --rack b373c1a2-8cd1-457f-be2a-1eec43bdd866 --switch switch1 --port qsfp3 --fec rs --speed 100g

Results in:

bnaecker@flint : ~/file-cabinet/oxide/oxide.rs $ ./target/release/oxide --profile london system networking switch-port-settings list
[
  {
    "description": "initial uplink configuration",
    "id": "b3998a2e-95cb-40ff-919a-67c3771edb80",
    "name": "default-uplink0",
    "time_created": "2024-09-11T19:49:33.302277Z",
    "time_modified": "2024-09-11T19:49:33.302277Z"
  }, {
    "description": "initial uplink configuration",
    "id": "0b8a1b26-d357-4248-b1f9-dc695134cf4d",
    "name": "default-uplink1",
    "time_created": "2024-09-11T19:49:33.560949Z",
    "time_modified": "2024-09-11T19:49:33.560949Z"
  }, {
    "description": "",
    "id": "6800e1a8-bd73-4d5d-8a5f-20733d8751e1",
    "name": "switch0-qsfp3",
    "time_created": "2024-09-12T17:28:45.518309Z",
    "time_modified": "2024-09-12T17:28:45.518309Z"
  }, {
    "description": "",
    "id": "7c60f2e4-f457-4b9f-828d-cc57ad39af0c",
    "name": "switch1-qsfp3",
    "time_created": "2024-09-12T17:29:24.525219Z",
    "time_modified": "2024-09-12T17:29:24.525219Z"
  }
]

The switch1-qsfp3 settings object was created automatically. Now, to enable the link, those settings have to be applied, with:

bnaecker@flint : ~/file-cabinet/oxide/oxide.rs $ ./target/release/oxide --profile london system hardware switch-port apply-settings --port qsfp3 --rack-id b373c1a2-8cd1-457f-be2a-1eec43bdd866 --switch-location switch0 --port-settings 7c60f2e4-f457-4b9f-828d-cc57ad39af0c

Since we're creating the settings automatically, it'd be nice if we could automatically apply them as well. OTOH, if we don't want to do that, we should prompt the user to do so at their discretion, and make especially clear that the link will not exist until they do so.
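For anyone doing the manual step in the meantime, here is a minimal sketch of looking up the ID of the implicitly created settings object by name, rather than copying it out of the list output by hand. It assumes `jq` is installed and that `switch-port-settings list` prints a JSON array shaped like the one above; `settings_id_by_name` is a hypothetical helper, not part of the CLI.

```shell
# Hypothetical helper: print the ID of the settings object with the given name.
# Reads the JSON array produced by `switch-port-settings list` on stdin.
settings_id_by_name() {
  jq -r --arg name "$1" '.[] | select(.name == $name) | .id'
}
```

Usage would look something like `oxide system networking switch-port-settings list | settings_id_by_name switch1-qsfp3`, with the result passed to `--port-settings`.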

elaine-oxide commented 2 months ago

I am running a4x2 with:

I tried the following:

$ export rack=`oxide system hardware rack list | jq -r .[0].id`

$ oxide system networking link add --rack $rack --switch switch1 --port qsfp1 --fec none --speed 100g

Then I tried this and got a suboptimal error message.

$ oxide system networking bgp auth --rack $rack --switch switch1 --port qsfp1 --peer 169.254.40.1 --authstring <password>
Port settings uninitialized. Initialize by creating a link.

I see that the last entry was automatically created during link add:

$ oxide system networking switch-port-settings list
[
  {
    "description": "initial uplink configuration",
    "id": "a0ec41a7-2ca4-43a7-9a8d-6cb106af4728",
    "name": "default-uplink0",
    "time_created": "2024-09-19T03:02:26.552769Z",
    "time_modified": "2024-09-19T03:02:26.552769Z"
  }, {
    "description": "switch port settings",
    "id": "837c5d48-1cb9-4bd2-be8d-97eb46289627",
    "name": "default-uplink1",
    "time_created": "2024-09-19T03:45:52.303770Z",
    "time_modified": "2024-09-19T03:45:52.303770Z"
  }, {
    "description": "",
    "id": "3abb3798-c552-48ca-b90b-5497df73c032",
    "name": "switch1-qsfp1",
    "time_created": "2024-09-19T03:49:08.085046Z",
    "time_modified": "2024-09-19T03:49:08.085046Z"
  }
]

Because the switch-port settings are not applied automatically to the newly created link, the bgp auth invocation above hits this line of code, which produces the suboptimal error message: https://github.com/oxidecomputer/oxide.rs/blob/5ffdf5a0325bb62f3235a2e3a2f0d73f8452525a/cli/src/cmd_net.rs#L1873

Then I run the following:

oxide system hardware switch-port apply-settings --port qsfp1 --rack-id $rack --switch-location switch1 --port-settings switch1-qsfp1

Now I get a better error message:

$ oxide system networking bgp auth --rack $rack --switch switch1 --port qsfp1 --peer 169.254.40.1 --authstring <password>
specified peer does not exist

If we choose NOT to automatically apply switch-port settings to newly created links, then we need better error handling for the cases in this category that hit the above line of code.
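Until one of those options lands, the two steps can be chained in a small wrapper. This is only a sketch of a workaround, not how the CLI itself works. It assumes the implicit settings object is always named `<switch>-<port>` (inferred from the listings in this thread) and that `--port-settings` accepts that name, as in the apply-settings invocation above. `settings_name`, `add_and_apply_link`, and the `OXIDE` variable are hypothetical.

```shell
# CLI binary to invoke; override OXIDE to point at a local build (assumption).
OXIDE="${OXIDE:-oxide}"

# Hypothetical helper: derive the name of the implicitly created settings object.
# The "<switch>-<port>" pattern is inferred from the list output in this thread.
settings_name() {
  printf '%s-%s\n' "$1" "$2"
}

# Create a link, then apply the settings that `link add` created for it.
# Usage: add_and_apply_link RACK SWITCH PORT FEC SPEED
add_and_apply_link() {
  local rack="$1" switch="$2" port="$3" fec="$4" speed="$5"

  # Step 1: create the link; this also creates the settings object.
  "$OXIDE" system networking link add \
    --rack "$rack" --switch "$switch" --port "$port" \
    --fec "$fec" --speed "$speed" || return 1

  # Step 2: apply those settings to the same switch port.
  "$OXIDE" system hardware switch-port apply-settings \
    --port "$port" --rack-id "$rack" --switch-location "$switch" \
    --port-settings "$(settings_name "$switch" "$port")"
}
```

For example, `add_and_apply_link "$rack" switch1 qsfp1 none 100g` would run the same two commands shown above back to back, stopping if the link creation fails.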

rcgoodfellow commented 2 months ago

I think we should fix this in oxide.rs by having system networking link add just make the association to a switch port when it creates the settings. The command takes rack/switch/port parameters, so it's clear that association is intended; we're just stopping short of making the association as a part of initializing switch port settings with a link.