I think this is not a bug, but rather a side effect of telling the planner that you want 4 Nexus instances on what is effectively a 3-node system. The ordering here is:
Inputs to the planner in step 6 include:
What should the planner do given this set of inputs? The current implementation (which I think is correct based on conversations we've had so far) is that when generating a plan, if a sled doesn't yet have an NTP zone, the planner should give it an NTP zone and nothing else. So the planner does that, but then sees that the policy says there should be 4 Nexus instances. Therefore, it allocates an additional Nexus instance to one of the eligible sleds, which is one of the three that already have a Nexus instance.
If this were a larger system with at least one "eligible for service zones" sled that did not already have a Nexus, the planner would've put the Nexus there instead of on a sled that already has a Nexus. In this system there are no such sleds, so it's forced to double up. I don't think it would be correct for the planner to attempt to give a new sled an NTP zone and a Nexus zone in the same plan generation, right?
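To make the ordering concrete, here's a minimal sketch of a single planning pass under the rules described above. The types and names (Sled, ZoneKind, plan_one_pass) are made up for illustration and are not the real planner code; the point is just that a sled without NTP only receives NTP in this pass, so the extra Nexus has to land on a sled that already runs one.

```rust
#[derive(Clone, Copy, PartialEq, Eq)]
enum ZoneKind {
    Ntp,
    Nexus,
}

struct Sled {
    name: &'static str,
    zones: Vec<ZoneKind>,
}

impl Sled {
    fn has(&self, kind: ZoneKind) -> bool {
        self.zones.contains(&kind)
    }
    fn count(&self, kind: ZoneKind) -> usize {
        self.zones.iter().filter(|z| **z == kind).count()
    }
}

/// One planning pass: a sled with no NTP zone gets an NTP zone and nothing
/// else; then Nexus is topped up to the policy target on the remaining
/// sleds, preferring sleds with the fewest Nexus zones.
fn plan_one_pass(sleds: &mut [Sled], nexus_target: usize) {
    // Sleds lacking NTP at the start of the pass get only NTP this pass,
    // and are excluded from service-zone placement below.
    let mut ntp_only = Vec::new();
    for (i, sled) in sleds.iter_mut().enumerate() {
        if !sled.has(ZoneKind::Ntp) {
            sled.zones.push(ZoneKind::Ntp);
            ntp_only.push(i);
        }
    }

    // Top up Nexus to the policy target on the eligible sleds.
    let mut nexus_count: usize =
        sleds.iter().map(|s| s.count(ZoneKind::Nexus)).sum();
    while nexus_count < nexus_target {
        let target = sleds
            .iter_mut()
            .enumerate()
            .filter(|(i, _)| !ntp_only.contains(i))
            .map(|(_, s)| s)
            .min_by_key(|s| s.count(ZoneKind::Nexus));
        match target {
            Some(sled) => {
                sled.zones.push(ZoneKind::Nexus);
                nexus_count += 1;
            }
            // No eligible sled left; the plan simply comes up short.
            None => break,
        }
    }
}

fn main() {
    // Three sleds that already run a Nexus, plus the newly added g2.
    let mut sleds = vec![
        Sled { name: "g0", zones: vec![ZoneKind::Ntp, ZoneKind::Nexus] },
        Sled { name: "g1", zones: vec![ZoneKind::Ntp, ZoneKind::Nexus] },
        Sled { name: "g3", zones: vec![ZoneKind::Ntp, ZoneKind::Nexus] },
        Sled { name: "g2", zones: vec![] },
    ];
    // Policy asks for 4 Nexus instances; g2 only gets NTP this pass, so
    // the 4th Nexus doubles up on one of the existing sleds.
    plan_one_pass(&mut sleds, 4);
    for s in &sleds {
        println!("{}: {} Nexus zone(s)", s.name, s.count(ZoneKind::Nexus));
    }
}
```

With three sleds that each already run a Nexus, a freshly added g2, and a target of 4, this sketch reproduces the doubling-up: g2 gets only NTP, and the fourth Nexus goes to a sled that already has one.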
Couple more thoughts:
I had the same exact thought last night with respect to why this happened @jgallagher. Thanks for confirming.
While it's not strictly a bug in the current logic, the behavior leaves something to be desired, at least on small clusters. I'm not sure how much of a special case this actually is for production systems or other large clusters. For NTP, DNS, CRDB, Nexus, Clickhouse, and possibly more, we always want nodes/replicas on separate sleds. The only reason that doesn't happen in testing is the small cluster size of our testbeds and rack test systems. On a larger cluster we'd just go ahead and put the Nexus on one of the existing sleds that doesn't already have one.
So I think I concur with you that there's no bug here and no real reason to change the behavior. Adding a special case could cause unanticipated consequences.
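For what it's worth, the property being described, at most one zone of a given replicated kind per sled, is easy to state as a check. The sketch below uses made-up types rather than omicron code, but it shows exactly the condition a 3-node cluster asked for 4 Nexus instances cannot satisfy.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
enum ZoneKind {
    Ntp,
    Dns,
    CockroachDb,
    Nexus,
    Clickhouse,
}

/// Every (sled, kind, count) where a sled runs more than one zone of the
/// same kind, i.e. every place the "separate sleds" preference is violated.
fn colocated_replicas<'a>(
    zones_by_sled: &HashMap<&'a str, Vec<ZoneKind>>,
) -> Vec<(&'a str, ZoneKind, usize)> {
    let mut violations = Vec::new();
    for (sled, zones) in zones_by_sled {
        let mut counts: HashMap<ZoneKind, usize> = HashMap::new();
        for kind in zones {
            *counts.entry(*kind).or_insert(0) += 1;
        }
        for (kind, count) in counts {
            if count > 1 {
                violations.push((*sled, kind, count));
            }
        }
    }
    violations
}

fn main() {
    // The state from this issue: three sleds with one Nexus each, then a
    // fourth Nexus doubled up on g3.
    let mut zones: HashMap<&str, Vec<ZoneKind>> = HashMap::new();
    zones.insert("g0", vec![ZoneKind::Ntp, ZoneKind::Nexus]);
    zones.insert("g1", vec![ZoneKind::Ntp, ZoneKind::Nexus]);
    zones.insert("g3", vec![ZoneKind::Ntp, ZoneKind::Nexus, ZoneKind::Nexus]);
    zones.insert("g2", vec![ZoneKind::Ntp]);
    println!("{:?}", colocated_replicas(&zones)); // [("g3", Nexus, 2)]
}
```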
Another thought I had is that maybe we no longer need to order zones from the blueprint since https://github.com/oxidecomputer/omicron/pull/5012. I think this means we can go ahead and put all the zones we want to run on an empty sled together in a single OmicronZonesRequest. I'm not sure if this would result in trying to use that Nexus before it's ready, though. In practice that can always happen anyway, due to issues with the network or with the sled where Nexus is running, so I don't think it's a big deal.
My understanding was that if the fourth node made it far enough to have its NTP zone deployed, the planner would put the Nexus instance on that one rather than the others. So the edge case here really is: the operator started with a 3-node cluster and added a 4th. If they had a smaller cluster, the overlap in services would be unavoidable. If they had a larger one, they'd get Nexus instances on separate nodes during RSS (well, assuming the RSS policy was also N=4). So I agree it's not worth doing anything special for this.
Closing as this is not a bug. However, I'm still curious about the following:
Another thought I had is that maybe we no longer need to order zones from the blueprint since https://github.com/oxidecomputer/omicron/pull/5012. I think this means we can go ahead and put all the zones we want to run on an empty sled together in a single OmicronZonesRequest. I'm not sure if this would result in trying to use that Nexus before it's ready, though. In practice that can always happen anyway, due to issues with the network or with the sled where Nexus is running, so I don't think it's a big deal.
Yeah, I'm curious about that too. I think the long-term behavior we talked about for sled agent is that we can give it all the zones, and it will start NTP first, wait for timesync, and then start the rest. It could also do this by starting everything in parallel, failing the zones that require timesync, and retrying. I'm not sure how #5012 changed all this, particularly the behavior of rejecting requests with non-NTP zones before time is sync'd.
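A rough sketch of that long-term behavior, using hypothetical functions rather than the actual sled-agent API: accept the full set of zones in one request, start NTP first, and hold back the zones that need timesync until it completes (a real implementation would likely do this asynchronously and retry rather than block).

```rust
use std::{thread, time::Duration};

#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum ZoneKind {
    Ntp,
    Crucible,
    Nexus,
}

// Placeholder for the real zone-launching logic.
fn start_zone(kind: ZoneKind) -> Result<(), String> {
    println!("starting {kind:?}");
    Ok(())
}

// Placeholder; the real system would query the NTP zone / clock state.
fn timesync_complete() -> bool {
    true
}

/// Start everything in `requested`, but gate non-NTP zones on timesync.
fn apply_zone_request(requested: &[ZoneKind]) -> Result<(), String> {
    // NTP first: nothing else should start until the clock is synced.
    for kind in requested.iter().filter(|k| **k == ZoneKind::Ntp) {
        start_zone(*kind)?;
    }

    // Wait for timesync before launching the rest.
    while !timesync_complete() {
        thread::sleep(Duration::from_secs(1));
    }

    for kind in requested.iter().filter(|k| **k != ZoneKind::Ntp) {
        start_zone(*kind)?;
    }
    Ok(())
}

fn main() {
    apply_zone_request(&[ZoneKind::Ntp, ZoneKind::Crucible, ZoneKind::Nexus])
        .expect("zone request failed");
}
```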
I created a 3-node testbed off the code on main, and then added node g2. I ran the planner enough times to deploy NTP and Crucible zones to sled g2, then ran the planner again to try to get it to place the Nexus zone on g2. Prior to testbed launch I had modified the code as instructed by @jgallagher so that there would be a 4th Nexus instance for the planner to place. When looking at blueprints I noticed the zone wasn't being placed on sled g2, and that the generation number for that sled was stuck at 3 with only Crucible and NTP zones. I then ran omdb nexus blueprints regenerate a few times to see if the Nexus zone would get placed on g2 after some time. I then realized that there was a second Nexus running on sled g3. I ran some commands backtracking through the blueprints to see if the second Nexus was actually placed after the first blueprint was constructed, and it looks like it was. So it appears we have a bug in the planner. The omdb commands I used to diagnose this after a few calls to regenerate are below.