oxidecomputer / omicron

Omicron: Oxide control plane
Mozilla Public License 2.0
239 stars 34 forks source link

Sled Agent startup logging is confusing #1215

Open davepacheco opened 2 years ago

davepacheco commented 2 years ago

While setting up the stack a few weeks ago, I found the Sled Agent logging pretty confusing about the current state of things.

The first thing to know is that Sled Agent startup makes calls out to Nexus, but the first time it starts up, Nexus isn't running yet. Eventually, it will deploy the first Nexus instance.

I'll include the full log here for reference:

``` dap@sock ~/omicron $ BUNYAN_NO_PAGER=1 bunyan $(svcs -L sled-agent) [ Jun 1 23:07:29 Enabled. ] [ Jun 1 23:07:29 Executing start method ("ctrun -l child -o noorphan,regent /opt/oxide/sled-agent/sled-agent run /opt/oxide/sled-agent/pkg/config.toml &"). ] [ Jun 1 23:07:29 Method "start" exited with status 0. ] [ Jun 1 23:07:29 Rereading configuration. ] [ Jun 1 23:07:29 No 'refresh' method defined. Treating as :true. ] note: configured to log to "/dev/stdout" [2022-06-01T23:07:29.498928583Z] INFO: SledAgent/2909 on sock: setting up bootstrap agent server [2022-06-01T23:07:29.49930742Z] INFO: SledAgent/2909 on sock: Ensuring config directory exists (path=/var/oxide) [2022-06-01T23:07:29.525905774Z] INFO: SledAgent/dropshot (Bootstrap)/2909 on sock: listening (local_addr=[fdb0:18c0:4d0d:a02a::1]:12346) [2022-06-01T23:07:29.526285081Z] INFO: SledAgent/BootstrapAgent/2909 on sock: bootstrap service initializing (server=fb0f7546-4d46-40ca-9d56-cbb810684ca7) [2022-06-01T23:07:29.527375441Z] INFO: SledAgent/RSS/2909 on sock: Injecting RSS configuration: SetupServiceConfig { rack_subnet: fd00:1122:3344:100::, requests: [ SledRequest { datasets: [ DatasetEnsureBody { zpool_uuid: d462a7f7b62840fe80ff4e4189e2d62b, dataset_kind: Crucible, address: [fd00:1122:3344:101::6]:32345, }, DatasetEnsureBody { zpool_uuid: e4b4dc87ab4649fba4b4d361ae214c03, dataset_kind: Crucible, address: [fd00:1122:3344:101::7]:32345, }, DatasetEnsureBody { zpool_uuid: f4b4dc87ab4649fba4b4d361ae214c03, dataset_kind: Crucible, address: [fd00:1122:3344:101::8]:32345, }, DatasetEnsureBody { zpool_uuid: d462a7f7b62840fe80ff4e4189e2d62b, dataset_kind: CockroachDb { all_addresses: [ [fd00:1122:3344:101::2]:32221, ], }, address: [fd00:1122:3344:101::2]:32221, }, DatasetEnsureBody { zpool_uuid: d462a7f7b62840fe80ff4e4189e2d62b, dataset_kind: Clickhouse, address: [fd00:1122:3344:101::5]:8123, }, ], services: [ ServiceRequest { name: "nexus", addresses: [ fd00:1122:3344:101::3, ], gz_addresses: [], }, ServiceRequest { name: "oximeter", addresses: [ fd00:1122:3344:101::4, ], gz_addresses: [], }, ], dns_services: [], }, ], } [2022-06-01T23:07:29.527725438Z] INFO: SledAgent/RSS/2909 on sock: RSS configuration has not been fully applied yet [2022-06-01T23:07:29.527999026Z] INFO: SledAgent/RSS/2909 on sock: Enough peers exist to enact RSS plan [2022-06-01T23:07:29.528261924Z] INFO: SledAgent/RSS/2909 on sock: Creating new allocation plan [2022-06-01T23:07:29.528615701Z] INFO: SledAgent/RSS/2909 on sock: dns_subnets: [ DnsSubnet { subnet: Ipv6Subnet { net: Ipv6Net( Ipv6Network { addr: fd00:1122:3344:1::, prefix: 64, }, ), }, }, ] [2022-06-01T23:07:29.528911088Z] INFO: SledAgent/RSS/2909 on sock: Serializing plan [2022-06-01T23:07:29.529183685Z] INFO: SledAgent/RSS/2909 on sock: Creating plan for the sled at fdb0:18c0:4d0d:a02a::1 [2022-06-01T23:07:29.529955478Z] INFO: SledAgent/RSS/2909 on sock: Plan serialized as: ["[fdb0:18c0:4d0d:a02a::1]:12346".initialization_request.subnet] net = "fd00:1122:3344:101::/64" [["[fdb0:18c0:4d0d:a02a::1]:12346".services_request.dataset]] address = "[fd00:1122:3344:101::6]:32345" zpool_uuid = "d462a7f7-b628-40fe-80ff-4e4189e2d62b" ["[fdb0:18c0:4d0d:a02a::1]:12346".services_request.dataset.dataset_kind] type = "crucible" [["[fdb0:18c0:4d0d:a02a::1]:12346".services_request.dataset]] address = "[fd00:1122:3344:101::7]:32345" zpool_uuid = "e4b4dc87-ab46-49fb-a4b4-d361ae214c03" ["[fdb0:18c0:4d0d:a02a::1]:12346".services_request.dataset.dataset_kind] type = "crucible" [["[fdb0:18c0:4d0d:a02a::1]:12346".services_request.dataset]] address = "[fd00:1122:3344:101::8]:32345" zpool_uuid = "f4b4dc87-ab46-49fb-a4b4-d361ae214c03" ["[fdb0:18c0:4d0d:a02a::1]:12346".services_request.dataset.dataset_kind] type = "crucible" [["[fdb0:18c0:4d0d:a02a::1]:12346".services_request.dataset]] address = "[fd00:1122:3344:101::2]:32221" zpool_uuid = "d462a7f7-b628-40fe-80ff-4e4189e2d62b" ["[fdb0:18c0:4d0d:a02a::1]:12346".services_request.dataset.dataset_kind] all_addresses = ["[fd00:1122:3344:101::2]:32221"] type = "cockroach_db" [["[fdb0:18c0:4d0d:a02a::1]:12346".services_request.dataset]] address = "[fd00:1122:3344:101::5]:8123" zpool_uuid = "d462a7f7-b628-40fe-80ff-4e4189e2d62b" ["[fdb0:18c0:4d0d:a02a::1]:12346".services_request.dataset.dataset_kind] type = "clickhouse" [["[fdb0:18c0:4d0d:a02a::1]:12346".services_request.dns_service]] addresses = ["fd00:1122:3344:1::1"] gz_addresses = ["fd00:1122:3344:1::2"] name = "internal-dns" [["[fdb0:18c0:4d0d:a02a::1]:12346".services_request.service]] addresses = ["fd00:1122:3344:101::3"] gz_addresses = [] name = "nexus" [["[fdb0:18c0:4d0d:a02a::1]:12346".services_request.service]] addresses = ["fd00:1122:3344:101::4"] gz_addresses = [] name = "oximeter" [2022-06-01T23:07:29.530100767Z] INFO: SledAgent/RSS/2909 on sock: Plan written to storage [2022-06-01T23:07:29.530252186Z] INFO: SledAgent/BootstrapAgentRssHandler/2909 on sock: Received initialization request from RSS (target_sled=[fdb0:18c0:4d0d:a02a::1]:12346) request: SledAgentRequest { subnet: Ipv6Subnet { net: Ipv6Net(Ipv6Network { addr: fd00:1122:3344:101::, prefix: 64 }) } } [2022-06-01T23:07:29.537780049Z] INFO: SledAgent/BootstrapAgentRssHandler/2909 on sock: Sending request to peer agent: http://[fdb0:18c0:4d0d:a02a::1]:12346 [2022-06-01T23:07:29.538535593Z] INFO: SledAgent/dropshot (Bootstrap)/2909 on sock: accepted connection (local_addr=[fdb0:18c0:4d0d:a02a::1]:12346, remote_addr=[fdb0:18c0:4d0d:a02a::1]:60324) [2022-06-01T23:07:29.539598743Z] INFO: SledAgent/BootstrapAgent/2909 on sock: Loading Sled Agent: SledAgentRequest { subnet: Ipv6Subnet { net: Ipv6Net(Ipv6Network { addr: fd00:1122:3344:101::, prefix: 64 }) } } (server=fb0f7546-4d46-40ca-9d56-cbb810684ca7) [2022-06-01T23:07:29.539764542Z] INFO: SledAgent/2909 on sock: setting up sled agent server [2022-06-01T23:07:29.545187864Z] INFO: SledAgent/SledAgent/2909 on sock: created sled agent (sled_id=fb0f7546-4d46-40ca-9d56-cbb810684ca7) [2022-06-01T23:07:29.584224588Z] INFO: SledAgent/SledAgent/2909 on sock: using '[PhysicalLink("net0"), PhysicalLink("net1")]' as data links for xde driver (sled_id=fb0f7546-4d46-40ca-9d56-cbb810684ca7) [2022-06-01T23:07:29.724829094Z] WARN: SledAgent/SledAgent/2909 on sock: Deleting existing VNIC (sled_id=fb0f7546-4d46-40ca-9d56-cbb810684ca7, vnic_kind=OxideControl, vnic_name=oxControlService0) [2022-06-01T23:07:29.738362333Z] WARN: SledAgent/SledAgent/2909 on sock: Deleting existing VNIC (sled_id=fb0f7546-4d46-40ca-9d56-cbb810684ca7, vnic_kind=OxideControl, vnic_name=oxControlStorage0) [2022-06-01T23:07:29.754063645Z] WARN: SledAgent/SledAgent/2909 on sock: Deleting existing VNIC (sled_id=fb0f7546-4d46-40ca-9d56-cbb810684ca7, vnic_kind=OxideControl, vnic_name=oxControlStorage1) [2022-06-01T23:07:29.766831082Z] WARN: SledAgent/SledAgent/2909 on sock: Deleting existing VNIC (sled_id=fb0f7546-4d46-40ca-9d56-cbb810684ca7, vnic_kind=OxideControl, vnic_name=oxControlStorage2) [2022-06-01T23:07:29.779140633Z] WARN: SledAgent/SledAgent/2909 on sock: Deleting existing VNIC (sled_id=fb0f7546-4d46-40ca-9d56-cbb810684ca7, vnic_kind=OxideControl, vnic_name=oxControlStorage3) [2022-06-01T23:07:29.790549182Z] WARN: SledAgent/SledAgent/2909 on sock: Deleting existing VNIC (sled_id=fb0f7546-4d46-40ca-9d56-cbb810684ca7, vnic_kind=OxideControl, vnic_name=oxControlStorage5) [2022-06-01T23:07:29.805585209Z] WARN: SledAgent/SledAgent/2909 on sock: Deleting existing VNIC (sled_id=fb0f7546-4d46-40ca-9d56-cbb810684ca7, vnic_kind=OxideControl, vnic_name=oxControlService1) [2022-06-01T23:07:29.819949122Z] WARN: SledAgent/SledAgent/2909 on sock: Deleting existing VNIC (sled_id=fb0f7546-4d46-40ca-9d56-cbb810684ca7, vnic_kind=OxideControl, vnic_name=oxControlService2) [2022-06-01T23:07:29.83595521Z] WARN: SledAgent/SledAgent/2909 on sock: Deleting existing VNIC (sled_id=fb0f7546-4d46-40ca-9d56-cbb810684ca7, vnic_kind=OxideControl, vnic_name=oxControlInstance1) [2022-06-01T23:07:29.847069752Z] INFO: SledAgent/SledAgent/2909 on sock: Sled Agent upserting zpool to Storage Manager: oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b (sled_id=fb0f7546-4d46-40ca-9d56-cbb810684ca7) [2022-06-01T23:07:29.85634795Z] INFO: SledAgent/SledAgent/2909 on sock: Sled Agent upserting zpool to Storage Manager: oxp_e4b4dc87-ab46-49fb-a4b4-d361ae214c03 (sled_id=fb0f7546-4d46-40ca-9d56-cbb810684ca7) [2022-06-01T23:07:29.866062884Z] INFO: SledAgent/SledAgent/2909 on sock: Sled Agent upserting zpool to Storage Manager: oxp_f4b4dc87-ab46-49fb-a4b4-d361ae214c03 (sled_id=fb0f7546-4d46-40ca-9d56-cbb810684ca7) [2022-06-01T23:07:29.878078798Z] INFO: SledAgent/ServiceManager/2909 on sock: No sled services found at /var/oxide/services.toml [2022-06-01T23:07:29.878531404Z] INFO: SledAgent/dropshot (SledAgent)/2909 on sock: listening (local_addr=[fd00:1122:3344:101::1]:12345) [2022-06-01T23:07:29.878791281Z] INFO: SledAgent/BootstrapAgent/2909 on sock: Sled Agent loaded; recording configuration (server=fb0f7546-4d46-40ca-9d56-cbb810684ca7) [2022-06-01T23:07:29.880930132Z] INFO: SledAgent/2909 on sock: contacting server nexus, registering sled: fb0f7546-4d46-40ca-9d56-cbb810684ca7 [2022-06-01T23:07:29.881547217Z] INFO: SledAgent/StorageManager/2909 on sock: Storage manager processing zpool: ZpoolInfo { name: "oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b", size: 9663676416, allocated: 105984, free: 9663570432, health: Online, } [2022-06-01T23:07:29.882482039Z] INFO: SledAgent/dropshot (Bootstrap)/2909 on sock: request completed (req_id=d7cc109d-1193-44c1-ba09-cbc38512f545, uri=/start_sled, method=PUT, remote_addr=[fdb0:18c0:4d0d:a02a::1]:60324, local_addr=[fdb0:18c0:4d0d:a02a::1]:12346, response_code=200) [2022-06-01T23:07:29.883345721Z] INFO: SledAgent/BootstrapAgentRssHandler/2909 on sock: Peer agent at http://[fdb0:18c0:4d0d:a02a::1]:12346 initialized [2022-06-01T23:07:29.88573007Z] INFO: SledAgent/BootstrapAgentRssHandler/2909 on sock: Initialized sled agent (target_sled=[fdb0:18c0:4d0d:a02a::1]:12346) [2022-06-01T23:07:29.890661886Z] INFO: SledAgent/RSS/2909 on sock: sending service requests... [2022-06-01T23:07:29.891013443Z] INFO: SledAgent/RSS/2909 on sock: initializing sled services: [ServiceRequest { name: "internal-dns", addresses: [fd00:1122:3344:1::1], gz_addresses: [fd00:1122:3344:1::2] }] [2022-06-01T23:07:29.893303833Z] INFO: SledAgent/dropshot (SledAgent)/2909 on sock: accepted connection (local_addr=[fd00:1122:3344:101::1]:12345, remote_addr=[fd00:1122:3344:101::1]:65156) [2022-06-01T23:07:29.893720379Z] INFO: SledAgent/StorageManager/2909 on sock: Storage manager processing zpool: ZpoolInfo { name: "oxp_e4b4dc87-ab46-49fb-a4b4-d361ae214c03", size: 9663676416, allocated: 104448, free: 9663571968, health: Online, } [2022-06-01T23:07:29.89470061Z] INFO: SledAgent/ServiceManager/2909 on sock: Ensuring services are initialized: [ServiceRequest { name: "internal-dns", addresses: [fd00:1122:3344:1::1], gz_addresses: [fd00:1122:3344:1::2] }] [2022-06-01T23:07:29.899493828Z] INFO: SledAgent/ServiceManager/2909 on sock: Service internal-dns does not yet exist [2022-06-01T23:07:29.904128317Z] INFO: SledAgent/StorageManager/2909 on sock: Storage manager processing zpool: ZpoolInfo { name: "oxp_f4b4dc87-ab46-49fb-a4b4-d361ae214c03", size: 9663676416, allocated: 540672, free: 9663135744, health: Online, } [2022-06-01T23:07:29.922388205Z] INFO: SledAgent/ServiceManager/2909 on sock: Configuring new Omicron zone: oxz_internal-dns [2022-06-01T23:07:29.942309849Z] INFO: SledAgent/ServiceManager/2909 on sock: Installing Omicron zone: oxz_internal-dns [2022-06-01T23:07:33.369817443Z] INFO: SledAgent/ServiceManager/2909 on sock: Zone booting (zone=oxz_internal-dns) [2022-06-01T23:07:37.563198118Z] INFO: SledAgent/ServiceManager/2909 on sock: Ensuring address fd00:1122:3344:1::1 exists [2022-06-01T23:07:37.563371876Z] INFO: SledAgent/ServiceManager/2909 on sock: Adding address: Static(V6(Ipv6Network { addr: fd00:1122:3344:1::1, prefix: 64 })) (zone=oxz_internal-dns) [2022-06-01T23:07:38.147611576Z] INFO: SledAgent/ServiceManager/2909 on sock: Ensuring address fd00:1122:3344:1::1 exists - OK [2022-06-01T23:07:38.14828473Z] INFO: SledAgent/ServiceManager/2909 on sock: GZ addresses: [ fd00:1122:3344:1::2, ] [2022-06-01T23:07:38.148801645Z] INFO: SledAgent/ServiceManager/2909 on sock: Ensuring GZ address fd00:1122:3344:1::2 exists [2022-06-01T23:07:40.491786898Z] INFO: SledAgent/ServiceManager/2909 on sock: halt_and_remove_logged: Previous zone state: Running (zone=oxz_internal-dns) [2022-06-01T23:07:40.495268997Z] INFO: SledAgent/ServiceManager/2909 on sock: Stopped and uninstalled zone (zone=oxz_internal-dns) [2022-06-01T23:07:40.504126388Z] INFO: SledAgent/dropshot (SledAgent)/2909 on sock: request completed (req_id=847eff8e-3986-4940-951a-1daedbc567db, uri=/services, method=PUT, remote_addr=[fd00:1122:3344:101::1]:65156, local_addr=[fd00:1122:3344:101::1]:12345, error_message_external="Internal Server Error", response_code=500) error_message_internal: Failed to do 'importing manifest' by running command in zone: Error running command in zone 'oxz_internal-dns': Command [/usr/sbin/zlogin oxz_internal-dns /usr/sbin/svccfg import /var/svc/manifest/site/internal-dns/manifest.xml] executed and failed with status: exit status: 1. Output: svccfg: Temporary service svc:/TEMP/system/illumos/internal-dns was deleted unexpectedly. svccfg: Import of /var/svc/manifest/site/internal-dns/manifest.xml failed. Progress: svccfg: Service "system/illumos/internal-dns": properties imported. svccfg: Instance "default": not reached. svccfg: Import of /var/svc/manifest/site/internal-dns/manifest.xml failed. [2022-06-01T23:07:40.50505965Z] WARN: SledAgent/RSS/2909 on sock: failed to initialize services (error="Error Response") [2022-06-01T23:07:40.856870037Z] INFO: SledAgent/RSS/2909 on sock: initializing sled services: [ServiceRequest { name: "internal-dns", addresses: [fd00:1122:3344:1::1], gz_addresses: [fd00:1122:3344:1::2] }] [2022-06-01T23:07:40.857262473Z] INFO: SledAgent/ServiceManager/2909 on sock: Ensuring services are initialized: [ServiceRequest { name: "internal-dns", addresses: [fd00:1122:3344:1::1], gz_addresses: [fd00:1122:3344:1::2] }] [2022-06-01T23:07:40.860653423Z] INFO: SledAgent/ServiceManager/2909 on sock: Service internal-dns does not yet exist [2022-06-01T23:07:40.879449847Z] INFO: SledAgent/ServiceManager/2909 on sock: Configuring new Omicron zone: oxz_internal-dns [2022-06-01T23:07:40.900456921Z] INFO: SledAgent/ServiceManager/2909 on sock: Installing Omicron zone: oxz_internal-dns [2022-06-01T23:07:43.579114912Z] INFO: SledAgent/ServiceManager/2909 on sock: Zone booting (zone=oxz_internal-dns) [2022-06-01T23:07:44.880626243Z] WARN: SledAgent/2909 on sock: failed to contact nexus, will retry in 248.569652ms [2022-06-01T23:07:44.892728996Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 251.890345ms [2022-06-01T23:07:44.9058831Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 236.424914ms [2022-06-01T23:07:44.920890217Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 363.68247ms [2022-06-01T23:07:45.131018307Z] INFO: SledAgent/2909 on sock: contacting server nexus, registering sled: fb0f7546-4d46-40ca-9d56-cbb810684ca7 [2022-06-01T23:07:48.056063908Z] INFO: SledAgent/ServiceManager/2909 on sock: Ensuring address fd00:1122:3344:1::1 exists [2022-06-01T23:07:48.057109509Z] INFO: SledAgent/ServiceManager/2909 on sock: Adding address: Static(V6(Ipv6Network { addr: fd00:1122:3344:1::1, prefix: 64 })) (zone=oxz_internal-dns) [2022-06-01T23:07:48.644162193Z] INFO: SledAgent/ServiceManager/2909 on sock: Ensuring address fd00:1122:3344:1::1 exists - OK [2022-06-01T23:07:48.647255215Z] INFO: SledAgent/ServiceManager/2909 on sock: GZ addresses: [ fd00:1122:3344:1::2, ] [2022-06-01T23:07:48.647533723Z] INFO: SledAgent/ServiceManager/2909 on sock: Ensuring GZ address fd00:1122:3344:1::2 exists [2022-06-01T23:07:48.953067619Z] INFO: SledAgent/ServiceManager/2909 on sock: Setting up internal-dns service [2022-06-01T23:07:49.134502953Z] INFO: SledAgent/dropshot (SledAgent)/2909 on sock: request completed (req_id=33d9884c-dbea-4236-8550-677a531ddfbe, uri=/services, method=PUT, remote_addr=[fd00:1122:3344:101::1]:65156, local_addr=[fd00:1122:3344:101::1]:12345, response_code=204) [2022-06-01T23:07:49.142928381Z] INFO: SledAgent/RSS/2909 on sock: sending dataset requests... [2022-06-01T23:07:49.14320227Z] INFO: SledAgent/RSS/2909 on sock: creating new filesystem: DatasetEnsureBody { zpool_uuid: d462a7f7-b628-40fe-80ff-4e4189e2d62b, dataset_kind: Crucible, address: [fd00:1122:3344:101::6]:32345 } [2022-06-01T23:07:49.143484764Z] INFO: SledAgent/dropshot (SledAgent)/2909 on sock: accepted connection (local_addr=[fd00:1122:3344:101::1]:12345, remote_addr=[fd00:1122:3344:101::1]:63166) [2022-06-01T23:07:49.146545356Z] INFO: SledAgent/StorageManager/2909 on sock: add_dataset: NewFilesystemRequest { zpool_id: d462a7f7-b628-40fe-80ff-4e4189e2d62b, dataset_kind: Crucible, address: [fd00:1122:3344:101::6]:32345, responder: Sender { inner: Some(Inner { state: State { is_complete: false, is_closed: false, is_rx_task_set: true, is_tx_task_set: false } }) } } [2022-06-01T23:07:49.146923333Z] INFO: SledAgent/StorageManager/2909 on sock: Ensuring dataset oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b/crucible exists [2022-06-01T23:07:49.230967819Z] INFO: SledAgent/StorageManager/2909 on sock: Ensuring zone for oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b/crucible is running [2022-06-01T23:07:49.248201686Z] INFO: SledAgent/StorageManager/2909 on sock: Zone for oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b/crucible was not found [2022-06-01T23:07:49.270982834Z] INFO: SledAgent/StorageManager/2909 on sock: Configuring new Omicron zone: oxz_crucible_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b [2022-06-01T23:07:49.298938367Z] INFO: SledAgent/StorageManager/2909 on sock: Installing Omicron zone: oxz_crucible_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b [2022-06-01T23:07:52.408185018Z] INFO: SledAgent/StorageManager/2909 on sock: Zone booting zone: oxz_crucible_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b [2022-06-01T23:07:54.58967506Z] INFO: SledAgent/StorageManager/2909 on sock: Adding address: Static(V6(Ipv6Network { addr: fd00:1122:3344:101::6, prefix: 64 })) zone: oxz_crucible_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b [2022-06-01T23:07:55.158758284Z] INFO: SledAgent/StorageManager/2909 on sock: Initializing Crucible [2022-06-01T23:07:55.945733258Z] INFO: SledAgent/StorageManager/2909 on sock: Zone oxz_crucible_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b with address [fd00:1122:3344:101::6]:32345 is running [2022-06-01T23:07:55.94666306Z] INFO: SledAgent/dropshot (SledAgent)/2909 on sock: request completed (req_id=d78b1247-3d12-4794-ac73-fbbf59d86e4f, uri=/filesystem, method=PUT, remote_addr=[fd00:1122:3344:101::1]:63166, local_addr=[fd00:1122:3344:101::1]:12345, response_code=204) [2022-06-01T23:07:55.950532206Z] INFO: SledAgent/RSS/2909 on sock: creating new filesystem: DatasetEnsureBody { zpool_uuid: e4b4dc87-ab46-49fb-a4b4-d361ae214c03, dataset_kind: Crucible, address: [fd00:1122:3344:101::7]:32345 } [2022-06-01T23:07:55.951088511Z] INFO: SledAgent/StorageManager/2909 on sock: add_dataset: NewFilesystemRequest { zpool_id: e4b4dc87-ab46-49fb-a4b4-d361ae214c03, dataset_kind: Crucible, address: [fd00:1122:3344:101::7]:32345, responder: Sender { inner: Some(Inner { state: State { is_complete: false, is_closed: false, is_rx_task_set: true, is_tx_task_set: false } }) } } [2022-06-01T23:07:55.951251949Z] INFO: SledAgent/StorageManager/2909 on sock: Ensuring dataset oxp_e4b4dc87-ab46-49fb-a4b4-d361ae214c03/crucible exists [2022-06-01T23:07:56.021671906Z] INFO: SledAgent/StorageManager/2909 on sock: Ensuring zone for oxp_e4b4dc87-ab46-49fb-a4b4-d361ae214c03/crucible is running [2022-06-01T23:07:56.031922155Z] INFO: SledAgent/StorageManager/2909 on sock: Zone for oxp_e4b4dc87-ab46-49fb-a4b4-d361ae214c03/crucible was not found [2022-06-01T23:07:56.056733536Z] INFO: SledAgent/StorageManager/2909 on sock: Configuring new Omicron zone: oxz_crucible_oxp_e4b4dc87-ab46-49fb-a4b4-d361ae214c03 [2022-06-01T23:07:56.094079095Z] INFO: SledAgent/StorageManager/2909 on sock: Installing Omicron zone: oxz_crucible_oxp_e4b4dc87-ab46-49fb-a4b4-d361ae214c03 [2022-06-01T23:07:59.173441751Z] INFO: SledAgent/StorageManager/2909 on sock: Zone booting zone: oxz_crucible_oxp_e4b4dc87-ab46-49fb-a4b4-d361ae214c03 [2022-06-01T23:08:00.131916877Z] WARN: SledAgent/2909 on sock: failed to contact nexus, will retry in 743.622125ms [2022-06-01T23:08:00.877123811Z] INFO: SledAgent/2909 on sock: contacting server nexus, registering sled: fb0f7546-4d46-40ca-9d56-cbb810684ca7 [2022-06-01T23:08:03.24196011Z] INFO: SledAgent/StorageManager/2909 on sock: Adding address: Static(V6(Ipv6Network { addr: fd00:1122:3344:101::7, prefix: 64 })) zone: oxz_crucible_oxp_e4b4dc87-ab46-49fb-a4b4-d361ae214c03 [2022-06-01T23:08:03.825864003Z] INFO: SledAgent/StorageManager/2909 on sock: Initializing Crucible [2022-06-01T23:08:04.350202462Z] INFO: SledAgent/StorageManager/2909 on sock: Zone oxz_crucible_oxp_e4b4dc87-ab46-49fb-a4b4-d361ae214c03 with address [fd00:1122:3344:101::7]:32345 is running [2022-06-01T23:08:04.350783057Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 276.531767ms [2022-06-01T23:08:04.351165643Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 423.249437ms [2022-06-01T23:08:04.351359371Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 390.258627ms [2022-06-01T23:08:04.353106606Z] INFO: SledAgent/dropshot (SledAgent)/2909 on sock: request completed (req_id=95b80cd2-aaea-481a-a6b5-395ceb097d42, uri=/filesystem, method=PUT, remote_addr=[fd00:1122:3344:101::1]:63166, local_addr=[fd00:1122:3344:101::1]:12345, response_code=204) [2022-06-01T23:08:04.353885469Z] INFO: SledAgent/RSS/2909 on sock: creating new filesystem: DatasetEnsureBody { zpool_uuid: f4b4dc87-ab46-49fb-a4b4-d361ae214c03, dataset_kind: Crucible, address: [fd00:1122:3344:101::8]:32345 } [2022-06-01T23:08:04.354124177Z] INFO: SledAgent/StorageManager/2909 on sock: add_dataset: NewFilesystemRequest { zpool_id: f4b4dc87-ab46-49fb-a4b4-d361ae214c03, dataset_kind: Crucible, address: [fd00:1122:3344:101::8]:32345, responder: Sender { inner: Some(Inner { state: State { is_complete: false, is_closed: false, is_rx_task_set: true, is_tx_task_set: false } }) } } [2022-06-01T23:08:04.354384785Z] INFO: SledAgent/StorageManager/2909 on sock: Ensuring dataset oxp_f4b4dc87-ab46-49fb-a4b4-d361ae214c03/crucible exists [2022-06-01T23:08:04.41698162Z] INFO: SledAgent/StorageManager/2909 on sock: Ensuring zone for oxp_f4b4dc87-ab46-49fb-a4b4-d361ae214c03/crucible is running [2022-06-01T23:08:04.425654484Z] INFO: SledAgent/StorageManager/2909 on sock: Zone for oxp_f4b4dc87-ab46-49fb-a4b4-d361ae214c03/crucible was not found [2022-06-01T23:08:04.443365947Z] INFO: SledAgent/StorageManager/2909 on sock: Configuring new Omicron zone: oxz_crucible_oxp_f4b4dc87-ab46-49fb-a4b4-d361ae214c03 [2022-06-01T23:08:04.466658871Z] INFO: SledAgent/StorageManager/2909 on sock: Installing Omicron zone: oxz_crucible_oxp_f4b4dc87-ab46-49fb-a4b4-d361ae214c03 [2022-06-01T23:08:07.655240749Z] INFO: SledAgent/StorageManager/2909 on sock: Zone booting zone: oxz_crucible_oxp_f4b4dc87-ab46-49fb-a4b4-d361ae214c03 [2022-06-01T23:08:11.880400614Z] INFO: SledAgent/StorageManager/2909 on sock: Adding address: Static(V6(Ipv6Network { addr: fd00:1122:3344:101::8, prefix: 64 })) zone: oxz_crucible_oxp_f4b4dc87-ab46-49fb-a4b4-d361ae214c03 [2022-06-01T23:08:12.460278541Z] INFO: SledAgent/StorageManager/2909 on sock: Initializing Crucible [2022-06-01T23:08:13.038084368Z] INFO: SledAgent/StorageManager/2909 on sock: Zone oxz_crucible_oxp_f4b4dc87-ab46-49fb-a4b4-d361ae214c03 with address [fd00:1122:3344:101::8]:32345 is running [2022-06-01T23:08:13.039354106Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 149.790028ms [2022-06-01T23:08:13.039546624Z] INFO: SledAgent/dropshot (SledAgent)/2909 on sock: request completed (req_id=f679f371-f3ca-48d2-9576-5ef57bf5b05d, uri=/filesystem, method=PUT, remote_addr=[fd00:1122:3344:101::1]:63166, local_addr=[fd00:1122:3344:101::1]:12345, response_code=204) [2022-06-01T23:08:13.039818662Z] INFO: SledAgent/RSS/2909 on sock: creating new filesystem: DatasetEnsureBody { zpool_uuid: d462a7f7-b628-40fe-80ff-4e4189e2d62b, dataset_kind: CockroachDb { all_addresses: [[fd00:1122:3344:101::2]:32221] }, address: [fd00:1122:3344:101::2]:32221 } [2022-06-01T23:08:13.040217589Z] INFO: SledAgent/StorageManager/2909 on sock: add_dataset: NewFilesystemRequest { zpool_id: d462a7f7-b628-40fe-80ff-4e4189e2d62b, dataset_kind: CockroachDb { all_addresses: [[fd00:1122:3344:101::2]:32221] }, address: [fd00:1122:3344:101::2]:32221, responder: Sender { inner: Some(Inner { state: State { is_complete: false, is_closed: false, is_rx_task_set: true, is_tx_task_set: false } }) } } [2022-06-01T23:08:13.040354397Z] INFO: SledAgent/StorageManager/2909 on sock: Ensuring dataset oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b/cockroachdb exists [2022-06-01T23:08:13.104639158Z] INFO: SledAgent/StorageManager/2909 on sock: Ensuring zone for oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b/cockroachdb is running [2022-06-01T23:08:13.121924266Z] INFO: SledAgent/StorageManager/2909 on sock: Zone for oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b/cockroachdb was not found [2022-06-01T23:08:13.137782965Z] INFO: SledAgent/StorageManager/2909 on sock: Configuring new Omicron zone: oxz_cockroachdb_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b [2022-06-01T23:08:13.160533254Z] INFO: SledAgent/StorageManager/2909 on sock: Installing Omicron zone: oxz_cockroachdb_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b [2022-06-01T23:08:16.931955193Z] INFO: SledAgent/StorageManager/2909 on sock: Zone booting zone: oxz_cockroachdb_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b [2022-06-01T23:08:17.161805959Z] WARN: SledAgent/2909 on sock: failed to contact nexus, will retry in 770.765063ms [2022-06-01T23:08:17.933208632Z] INFO: SledAgent/2909 on sock: contacting server nexus, registering sled: fb0f7546-4d46-40ca-9d56-cbb810684ca7 [2022-06-01T23:08:19.277337814Z] INFO: SledAgent/StorageManager/2909 on sock: Adding address: Static(V6(Ipv6Network { addr: fd00:1122:3344:101::2, prefix: 64 })) zone: oxz_cockroachdb_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b [2022-06-01T23:08:19.847946155Z] INFO: SledAgent/StorageManager/2909 on sock: start_zone: Loading CRDB manifest [2022-06-01T23:08:20.185842614Z] INFO: SledAgent/StorageManager/2909 on sock: start_zone: setting CRDB's config/listen_addr: [fd00:1122:3344:101::2]:32221 [2022-06-01T23:08:20.231946275Z] INFO: SledAgent/StorageManager/2909 on sock: start_zone: setting CRDB's config/store [2022-06-01T23:08:20.269034457Z] INFO: SledAgent/StorageManager/2909 on sock: start_zone: setting CRDB's config/join_addrs [2022-06-01T23:08:20.305499214Z] INFO: SledAgent/StorageManager/2909 on sock: start_zone: refreshing manifest [2022-06-01T23:08:20.337933087Z] INFO: SledAgent/StorageManager/2909 on sock: start_zone: enabling CRDB service [2022-06-01T23:08:20.374348425Z] INFO: SledAgent/StorageManager/2909 on sock: start_zone: awaiting liveness of CRDB [2022-06-01T23:08:20.380744728Z] WARN: SledAgent/StorageManager/2909 on sock: cockroachdb not yet alive [2022-06-01T23:08:20.648632498Z] WARN: SledAgent/StorageManager/2909 on sock: cockroachdb not yet alive [2022-06-01T23:08:21.277440112Z] WARN: SledAgent/StorageManager/2909 on sock: cockroachdb not yet alive [2022-06-01T23:08:22.516087249Z] WARN: SledAgent/StorageManager/2909 on sock: cockroachdb not yet alive [2022-06-01T23:08:25.273275485Z] INFO: SledAgent/StorageManager/2909 on sock: CRDB is online [2022-06-01T23:08:25.273634562Z] INFO: SledAgent/StorageManager/2909 on sock: Formatting CRDB [2022-06-01T23:08:32.935085953Z] WARN: SledAgent/2909 on sock: failed to contact nexus, will retry in 2.561089402s [2022-06-01T23:08:33.241793708Z] INFO: SledAgent/StorageManager/2909 on sock: Formatting CRDB - Completed [2022-06-01T23:08:33.242270573Z] INFO: SledAgent/StorageManager/2909 on sock: Zone oxz_cockroachdb_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b with address [fd00:1122:3344:101::2]:32221 is running [2022-06-01T23:08:33.242896058Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 142.776439ms [2022-06-01T23:08:33.243337664Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 507.632943ms [2022-06-01T23:08:33.243496923Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 1.182980646s [2022-06-01T23:08:33.243644801Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 830.830619ms [2022-06-01T23:08:33.24379535Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 182.536478ms [2022-06-01T23:08:33.243937299Z] INFO: SledAgent/dropshot (SledAgent)/2909 on sock: request completed (req_id=bfcf110c-d574-4862-8d48-85d241efcc5d, uri=/filesystem, method=PUT, remote_addr=[fd00:1122:3344:101::1]:63166, local_addr=[fd00:1122:3344:101::1]:12345, response_code=204) [2022-06-01T23:08:33.248757376Z] INFO: SledAgent/RSS/2909 on sock: creating new filesystem: DatasetEnsureBody { zpool_uuid: d462a7f7-b628-40fe-80ff-4e4189e2d62b, dataset_kind: Clickhouse, address: [fd00:1122:3344:101::5]:8123 } [2022-06-01T23:08:33.248972164Z] INFO: SledAgent/StorageManager/2909 on sock: add_dataset: NewFilesystemRequest { zpool_id: d462a7f7-b628-40fe-80ff-4e4189e2d62b, dataset_kind: Clickhouse, address: [fd00:1122:3344:101::5]:8123, responder: Sender { inner: Some(Inner { state: State { is_complete: false, is_closed: false, is_rx_task_set: true, is_tx_task_set: false } }) } } [2022-06-01T23:08:33.249200072Z] INFO: SledAgent/StorageManager/2909 on sock: Ensuring dataset oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b/clickhouse exists [2022-06-01T23:08:33.342696875Z] INFO: SledAgent/StorageManager/2909 on sock: Ensuring zone for oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b/clickhouse is running [2022-06-01T23:08:33.359413867Z] INFO: SledAgent/StorageManager/2909 on sock: Zone for oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b/clickhouse was not found [2022-06-01T23:08:33.392184627Z] INFO: SledAgent/StorageManager/2909 on sock: Configuring new Omicron zone: oxz_clickhouse_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b [2022-06-01T23:08:33.433344742Z] INFO: SledAgent/StorageManager/2909 on sock: Installing Omicron zone: oxz_clickhouse_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b [2022-06-01T23:08:37.485661716Z] INFO: SledAgent/StorageManager/2909 on sock: Zone booting zone: oxz_clickhouse_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b [2022-06-01T23:08:37.726780562Z] INFO: SledAgent/2909 on sock: contacting server nexus, registering sled: fb0f7546-4d46-40ca-9d56-cbb810684ca7 [2022-06-01T23:08:41.725179224Z] INFO: SledAgent/StorageManager/2909 on sock: Adding address: Static(V6(Ipv6Network { addr: fd00:1122:3344:101::5, prefix: 64 })) zone: oxz_clickhouse_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b [2022-06-01T23:08:42.336431363Z] INFO: SledAgent/StorageManager/2909 on sock: Initialiting Clickhouse [2022-06-01T23:08:42.668654183Z] INFO: SledAgent/StorageManager/2909 on sock: Zone oxz_clickhouse_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b with address [fd00:1122:3344:101::5]:8123 is running [2022-06-01T23:08:42.670020961Z] INFO: SledAgent/dropshot (SledAgent)/2909 on sock: request completed (req_id=cd542455-7c9b-40bb-9093-82ba3401a41f, uri=/filesystem, method=PUT, remote_addr=[fd00:1122:3344:101::1]:63166, local_addr=[fd00:1122:3344:101::1]:12345, response_code=204) [2022-06-01T23:08:42.671626107Z] INFO: SledAgent/RSS/2909 on sock: Finished setting up agents and datasets [2022-06-01T23:08:42.677206657Z] INFO: SledAgent/RSS/2909 on sock: sending service requests... [2022-06-01T23:08:42.677447995Z] INFO: SledAgent/RSS/2909 on sock: initializing sled services: [ServiceRequest { name: "nexus", addresses: [fd00:1122:3344:101::3], gz_addresses: [] }, ServiceRequest { name: "oximeter", addresses: [fd00:1122:3344:101::4], gz_addresses: [] }, ServiceRequest { name: "internal-dns", addresses: [fd00:1122:3344:1::1], gz_addresses: [fd00:1122:3344:1::2] }] [2022-06-01T23:08:42.677640383Z] INFO: SledAgent/dropshot (SledAgent)/2909 on sock: accepted connection (local_addr=[fd00:1122:3344:101::1]:12345, remote_addr=[fd00:1122:3344:101::1]:34451) [2022-06-01T23:08:42.677940671Z] INFO: SledAgent/ServiceManager/2909 on sock: Ensuring services are initialized: [ServiceRequest { name: "oximeter", addresses: [fd00:1122:3344:101::4], gz_addresses: [] }, ServiceRequest { name: "nexus", addresses: [fd00:1122:3344:101::3], gz_addresses: [] }] [2022-06-01T23:08:42.68136024Z] INFO: SledAgent/ServiceManager/2909 on sock: Service oximeter does not yet exist [2022-06-01T23:08:42.700278763Z] INFO: SledAgent/ServiceManager/2909 on sock: Configuring new Omicron zone: oxz_oximeter [2022-06-01T23:08:42.720659942Z] INFO: SledAgent/ServiceManager/2909 on sock: Installing Omicron zone: oxz_oximeter [2022-06-01T23:08:45.578458419Z] INFO: SledAgent/ServiceManager/2909 on sock: Zone booting (zone=oxz_oximeter) [2022-06-01T23:08:48.056022681Z] INFO: SledAgent/ServiceManager/2909 on sock: Ensuring address fd00:1122:3344:101::4 exists [2022-06-01T23:08:48.058984814Z] INFO: SledAgent/ServiceManager/2909 on sock: Adding address: Static(V6(Ipv6Network { addr: fd00:1122:3344:101::4, prefix: 64 })) (zone=oxz_oximeter) [2022-06-01T23:08:48.654487674Z] INFO: SledAgent/ServiceManager/2909 on sock: Ensuring address fd00:1122:3344:101::4 exists - OK [2022-06-01T23:08:48.655124928Z] INFO: SledAgent/ServiceManager/2909 on sock: GZ addresses: [] [2022-06-01T23:08:49.003460771Z] INFO: SledAgent/ServiceManager/2909 on sock: Service name oximeter did not match [2022-06-01T23:08:49.043794428Z] INFO: SledAgent/ServiceManager/2909 on sock: Service nexus does not yet exist [2022-06-01T23:08:49.069256893Z] INFO: SledAgent/ServiceManager/2909 on sock: Configuring new Omicron zone: oxz_nexus [2022-06-01T23:08:49.094927246Z] INFO: SledAgent/ServiceManager/2909 on sock: Installing Omicron zone: oxz_nexus [2022-06-01T23:08:52.041879133Z] INFO: SledAgent/ServiceManager/2909 on sock: Zone booting (zone=oxz_nexus) [2022-06-01T23:08:52.23645132Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 652.807777ms [2022-06-01T23:08:52.236619528Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 139.282019ms [2022-06-01T23:08:52.728457986Z] WARN: SledAgent/2909 on sock: failed to contact nexus, will retry in 3.71501894s [2022-06-01T23:08:55.955553263Z] INFO: SledAgent/ServiceManager/2909 on sock: Ensuring address fd00:1122:3344:101::3 exists [2022-06-01T23:08:55.9582053Z] INFO: SledAgent/ServiceManager/2909 on sock: Adding address: Static(V6(Ipv6Network { addr: fd00:1122:3344:101::3, prefix: 64 })) (zone=oxz_nexus) [2022-06-01T23:08:56.541255089Z] INFO: SledAgent/ServiceManager/2909 on sock: Ensuring address fd00:1122:3344:101::3 exists - OK [2022-06-01T23:08:56.541894304Z] INFO: SledAgent/ServiceManager/2909 on sock: GZ addresses: [] [2022-06-01T23:08:58.90935488Z] INFO: SledAgent/ServiceManager/2909 on sock: halt_and_remove_logged: Previous zone state: Running (zone=oxz_nexus) [2022-06-01T23:08:58.910103653Z] INFO: SledAgent/ServiceManager/2909 on sock: Stopped and uninstalled zone (zone=oxz_nexus) [2022-06-01T23:08:58.919284312Z] INFO: SledAgent/dropshot (SledAgent)/2909 on sock: request completed (req_id=71c8cbc1-5c48-4d47-817e-d002d8dcb229, uri=/services, method=PUT, remote_addr=[fd00:1122:3344:101::1]:34451, local_addr=[fd00:1122:3344:101::1]:12345, error_message_external="Internal Server Error", response_code=500) error_message_internal: Failed to do 'importing manifest' by running command in zone: Error running command in zone 'oxz_nexus': Command [/usr/sbin/zlogin oxz_nexus /usr/sbin/svccfg import /var/svc/manifest/site/nexus/manifest.xml] executed and failed with status: exit status: 1. Output: svccfg: Temporary service svc:/TEMP/system/illumos/nexus was deleted unexpectedly. svccfg: Import of /var/svc/manifest/site/nexus/manifest.xml failed. Progress: svccfg: Service "system/illumos/nexus": properties imported. svccfg: Instance "default": not reached. svccfg: Import of /var/svc/manifest/site/nexus/manifest.xml failed. [2022-06-01T23:08:58.919892947Z] INFO: SledAgent/2909 on sock: contacting server nexus, registering sled: fb0f7546-4d46-40ca-9d56-cbb810684ca7 [2022-06-01T23:08:58.920041305Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 560.822476ms [2022-06-01T23:08:58.920193784Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 548.123862ms [2022-06-01T23:08:58.920328583Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 1.148798026s [2022-06-01T23:08:58.920460962Z] WARN: SledAgent/RSS/2909 on sock: failed to initialize services (error="Error Response") [2022-06-01T23:08:58.92061501Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 1.836863505s [2022-06-01T23:08:58.920749389Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 132.370862ms [2022-06-01T23:08:58.920882558Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 2.602310766s [2022-06-01T23:08:59.241694148Z] INFO: SledAgent/RSS/2909 on sock: initializing sled services: [ServiceRequest { name: "nexus", addresses: [fd00:1122:3344:101::3], gz_addresses: [] }, ServiceRequest { name: "oximeter", addresses: [fd00:1122:3344:101::4], gz_addresses: [] }, ServiceRequest { name: "internal-dns", addresses: [fd00:1122:3344:1::1], gz_addresses: [fd00:1122:3344:1::2] }] [2022-06-01T23:08:59.242233643Z] INFO: SledAgent/ServiceManager/2909 on sock: Ensuring services are initialized: [ServiceRequest { name: "oximeter", addresses: [fd00:1122:3344:101::4], gz_addresses: [] }, ServiceRequest { name: "nexus", addresses: [fd00:1122:3344:101::3], gz_addresses: [] }] [2022-06-01T23:08:59.245621623Z] INFO: SledAgent/ServiceManager/2909 on sock: Service oximeter already exists [2022-06-01T23:08:59.246092369Z] INFO: SledAgent/ServiceManager/2909 on sock: Service nexus does not yet exist [2022-06-01T23:08:59.27534333Z] INFO: SledAgent/ServiceManager/2909 on sock: Configuring new Omicron zone: oxz_nexus [2022-06-01T23:08:59.302681598Z] INFO: SledAgent/ServiceManager/2909 on sock: Installing Omicron zone: oxz_nexus [2022-06-01T23:09:02.106709081Z] INFO: SledAgent/ServiceManager/2909 on sock: Zone booting (zone=oxz_nexus) [2022-06-01T23:09:05.291114596Z] INFO: SledAgent/ServiceManager/2909 on sock: Ensuring address fd00:1122:3344:101::3 exists [2022-06-01T23:09:05.291311424Z] INFO: SledAgent/ServiceManager/2909 on sock: Adding address: Static(V6(Ipv6Network { addr: fd00:1122:3344:101::3, prefix: 64 })) (zone=oxz_nexus) [2022-06-01T23:09:05.885177668Z] INFO: SledAgent/ServiceManager/2909 on sock: Ensuring address fd00:1122:3344:101::3 exists - OK [2022-06-01T23:09:05.885370776Z] INFO: SledAgent/ServiceManager/2909 on sock: GZ addresses: [] [2022-06-01T23:09:06.234552896Z] INFO: SledAgent/ServiceManager/2909 on sock: Service name nexus did not match [2022-06-01T23:09:06.268590555Z] INFO: SledAgent/dropshot (SledAgent)/2909 on sock: request completed (req_id=c6019410-9266-47a4-9137-d529b8d52262, uri=/services, method=PUT, remote_addr=[fd00:1122:3344:101::1]:34451, local_addr=[fd00:1122:3344:101::1]:12345, response_code=204) [2022-06-01T23:09:06.269025041Z] WARN: SledAgent/2909 on sock: failed to contact nexus, will retry in 7.858261115s [2022-06-01T23:09:06.269248009Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 528.435574ms [2022-06-01T23:09:06.269491647Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 262.229898ms [2022-06-01T23:09:06.269713825Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 1.443517337s [2022-06-01T23:09:06.269926603Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 1.126481259s [2022-06-01T23:09:06.270805385Z] INFO: SledAgent/RSS/2909 on sock: Finished setting up services [2022-06-01T23:09:06.5323965Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 1.434758814s [2022-06-01T23:09:06.684688782Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 1.086800326s [2022-06-01T23:09:06.744591962Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 5.570363997s [2022-06-01T23:09:06.798946621Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 780.561714ms [2022-06-01T23:09:07.397460443Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 1.908596339s [2022-06-01T23:09:07.434737783Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 2.628760681s [2022-06-01T23:09:07.58082102Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 2.415066186s [2022-06-01T23:09:07.71414967Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 2.12780999s [2022-06-01T23:09:07.772264856Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 1.046736717s [2022-06-01T23:09:07.998236746Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 1.076939718s [2022-06-01T23:09:08.008894782Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 3.75894106s [2022-06-01T23:09:08.85081895Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 3.7597818s [2022-06-01T23:09:09.098758456Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 2.197342419s [2022-06-01T23:09:09.312828691Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 2.627305181s [2022-06-01T23:09:09.84729118Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 2.506179068s [2022-06-01T23:09:10.001307847Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 4.059281855s [2022-06-01T23:09:10.068903139Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 10.554439345s [2022-06-01T23:09:11.303407502Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 11.595760471s [2022-06-01T23:09:11.77377509Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 10.420023943s [2022-06-01T23:09:11.947141075Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 11.456405535s [2022-06-01T23:09:12.320869017Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 4.439781817s [2022-06-01T23:09:12.358692392Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 8.392026948s [2022-06-01T23:09:12.616784788Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 11.102997133s [2022-06-01T23:09:14.066622206Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 7.582159334s [2022-06-01T23:09:14.128674377Z] INFO: SledAgent/2909 on sock: contacting server nexus, registering sled: fb0f7546-4d46-40ca-9d56-cbb810684ca7 [2022-06-01T23:09:21.65416898Z] WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 15.71140455s ```

What I found most confusing are those messages at the end:

[2022-06-01T23:08:59.302681598Z]  INFO: SledAgent/ServiceManager/2909 on sock: Installing Omicron zone: oxz_nexus
[2022-06-01T23:09:02.106709081Z]  INFO: SledAgent/ServiceManager/2909 on sock: Zone booting (zone=oxz_nexus)
[2022-06-01T23:09:05.291114596Z]  INFO: SledAgent/ServiceManager/2909 on sock: Ensuring address fd00:1122:3344:101::3 exists
[2022-06-01T23:09:05.291311424Z]  INFO: SledAgent/ServiceManager/2909 on sock: Adding address: Static(V6(Ipv6Network { addr: fd00:1122:3344:101::3, prefix: 64 })) (zone=oxz_nexus)
[2022-06-01T23:09:05.885177668Z]  INFO: SledAgent/ServiceManager/2909 on sock: Ensuring address fd00:1122:3344:101::3 exists - OK
[2022-06-01T23:09:05.885370776Z]  INFO: SledAgent/ServiceManager/2909 on sock: GZ addresses: []
[2022-06-01T23:09:06.234552896Z]  INFO: SledAgent/ServiceManager/2909 on sock: Service name nexus did not match
[2022-06-01T23:09:06.268590555Z]  INFO: SledAgent/dropshot (SledAgent)/2909 on sock: request completed (req_id=c6019410-9266-47a4-9137-d529b8d52262, uri=/services, method=PUT, remote_addr=[fd00:1122:3344:101::1]:34451, local_addr=[fd00:1122:3344:101::1]:12345, response_code=204)
[2022-06-01T23:09:06.269025041Z]  WARN: SledAgent/2909 on sock: failed to contact nexus, will retry in 7.858261115s
[2022-06-01T23:09:06.269248009Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 528.435574ms
[2022-06-01T23:09:06.269491647Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 262.229898ms
[2022-06-01T23:09:06.269713825Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 1.443517337s
[2022-06-01T23:09:06.269926603Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 1.126481259s
[2022-06-01T23:09:06.270805385Z]  INFO: SledAgent/RSS/2909 on sock: Finished setting up services
[2022-06-01T23:09:06.5323965Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 1.434758814s
[2022-06-01T23:09:06.684688782Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 1.086800326s
[2022-06-01T23:09:06.744591962Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 5.570363997s
[2022-06-01T23:09:06.798946621Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 780.561714ms
[2022-06-01T23:09:07.397460443Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 1.908596339s
[2022-06-01T23:09:07.434737783Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 2.628760681s
[2022-06-01T23:09:07.58082102Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 2.415066186s
[2022-06-01T23:09:07.71414967Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 2.12780999s
[2022-06-01T23:09:07.772264856Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 1.046736717s
[2022-06-01T23:09:07.998236746Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 1.076939718s
[2022-06-01T23:09:08.008894782Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 3.75894106s
[2022-06-01T23:09:08.85081895Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 3.7597818s
[2022-06-01T23:09:09.098758456Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 2.197342419s
[2022-06-01T23:09:09.312828691Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 2.627305181s
[2022-06-01T23:09:09.84729118Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 2.506179068s
[2022-06-01T23:09:10.001307847Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 4.059281855s
[2022-06-01T23:09:10.068903139Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 10.554439345s
[2022-06-01T23:09:11.303407502Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 11.595760471s
[2022-06-01T23:09:11.77377509Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 10.420023943s
[2022-06-01T23:09:11.947141075Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 11.456405535s
[2022-06-01T23:09:12.320869017Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 4.439781817s
[2022-06-01T23:09:12.358692392Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 8.392026948s
[2022-06-01T23:09:12.616784788Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 11.102997133s
[2022-06-01T23:09:14.066622206Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 7.582159334s
[2022-06-01T23:09:14.128674377Z]  INFO: SledAgent/2909 on sock: contacting server nexus, registering sled: fb0f7546-4d46-40ca-9d56-cbb810684ca7
[2022-06-01T23:09:21.65416898Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 15.71140455s

As I looked at this, it was five minutes later:

$ date
June  1, 2022 at 11:14:23 PM UTC

I think there are a few problems here:

jgallagher commented 2 years ago

I've only been in code adjacent to this recently, but I suspect logs like this:

[2022-06-01T23:09:06.269248009Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 528.435574ms
[2022-06-01T23:09:06.269491647Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 262.229898ms

are two independent dataset retry loops running, not one with a weirdly-decreasing interval. Big :+1: on including more context here to clarify, though!