Closed by andrewjstone 11 months ago
My non-gimlet config looks like the following. I just turned off the VMM reservoir since I'm not planning to launch VMs. I'm only testing early boot here.
# Sled Agent Configuration
# Identifies whether sled agent treats itself as a scrimlet or a gimlet.
#
# If this is set to "scrimlet", the sled agent treats itself as a scrimlet.
# If this is set to "gimlet", the sled agent treats itself as a gimlet.
# If this is set to "auto":
# - On illumos, the sled automatically detects whether or not it is a scrimlet.
# - On all other platforms, the sled assumes it is a gimlet.
sled_mode = "scrimlet"
# Identifies the revision of the sidecar that is attached, if one is attached.
# TODO: This field should be removed once Gimlets have the ability to auto-detect
# this information.
sidecar_revision.soft = { front_port_count = 1, rear_port_count = 1 }
# Setting this to true causes sled-agent to always report that its time is
# in-sync, rather than querying its NTP zone.
skip_timesync = false
# For testing purposes, a file-backed zpool can be manually created with the
# following:
#
# # truncate -s 10GB testpool.vdev
# # zpool create oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b "$PWD/testpool.vdev"
#
# Note that you'll need to create one such zpool for each below, with a
# different vdev for each. The `create_virtual_hardware.sh` script does this
# for you.
zpools = [
"oxi_a462a7f7-b628-40fe-80ff-4e4189e2d62b",
"oxi_b462a7f7-b628-40fe-80ff-4e4189e2d62b",
"oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b",
"oxp_e4b4dc87-ab46-49fb-a4b4-d361ae214c03",
"oxp_f4b4dc87-ab46-49fb-a4b4-d361ae214c03",
"oxp_14b4dc87-ab46-49fb-a4b4-d361ae214c03",
"oxp_24b4dc87-ab46-49fb-a4b4-d361ae214c03",
"oxp_cd70d7f6-2354-4bf2-8012-55bf9eaf7930",
"oxp_ceb4461c-cf56-4719-ad3c-14430bfdfb60",
"oxp_31bd71cd-4736-4a12-a387-9b74b050396f",
"oxp_616b26df-e62a-4c68-b506-f4a923d8aaf7",
]
# Percentage of usable physical DRAM to use for the VMM reservoir, which
# guest memory is pulled from.
vmm_reservoir_percentage = 0
# Swap device size for the system. The device is a sparsely allocated zvol on
# the internal zpool of the M.2 that we booted from.
#
# If use of the VMM reservoir is configured, it is likely the system will not
# work without a swap device configured.
swap_device_size_gb = 10
# An optional data link from which we extract a MAC address.
# This is used as a unique identifier for the bootstrap address.
#
# If empty, this will be equivalent to the first result from:
# $ dladm show-phys -p -o LINK
# data_link = "igb0"
# On a multi-sled system, transit-mode Maghemite runs in the `oxz_switch` zone
# to configure routes between sleds. This runs over the Sidecar's rear ports
# (whether simulated with SoftNPU or not). On a Gimlet deployed in a rack,
# tfportd will create the necessary links and Maghemite will be configured to
# use those. But on non-Gimlet systems, you need to specify physical links to
# be passed into the `oxz_switch` zone for this purpose. You can skip this if
# you're deploying a single-sled system and just leave the single tfportrear
# as-is.
switch_zone_maghemite_links = ["tfportrear0_0"]
data_links = ["net0", "net1"]
[log]
level = "info"
mode = "file"
path = "/dev/stdout"
if_exists = "append"
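The pool names in the config above use two prefixes. As a rough illustration, the sketch below sorts names by prefix, assuming that `oxi_` marks an internal (M.2) pool and `oxp_` marks an external one. That mapping is inferred from the comments in the config, and `PoolKind` and `pool_kind` are hypothetical names for this example, not omicron's API.

```rust
/// Hypothetical classification of a zpool by its name prefix.
#[derive(Debug, PartialEq)]
enum PoolKind {
    Internal,
    External,
}

/// Assumption: "oxi_" means internal and "oxp_" means external.
/// Returns None for any other name.
fn pool_kind(name: &str) -> Option<PoolKind> {
    if name.starts_with("oxi_") {
        Some(PoolKind::Internal)
    } else if name.starts_with("oxp_") {
        Some(PoolKind::External)
    } else {
        None
    }
}

fn main() {
    // A few of the names from the config above.
    let zpools = [
        "oxi_a462a7f7-b628-40fe-80ff-4e4189e2d62b",
        "oxi_b462a7f7-b628-40fe-80ff-4e4189e2d62b",
        "oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b",
    ];
    for name in zpools.iter() {
        println!("{}: {:?}", name, pool_kind(name));
    }
}
```

Under that assumption, the first two pools in the list are the internal ones.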
The error
cannot create 'oxi_a462a7f7-b628-40fe-80ff-4e4189e2d62b/backing/fmd': parent does not exist
implies that oxi_a462a7f7-b628-40fe-80ff-4e4189e2d62b/backing does not exist. That dataset should be created along with the other datasets on the internal pools by https://github.com/oxidecomputer/omicron/blob/main/sled-hardware/src/disk.rs#L297 as part of a call to sled_hardware::Disk::ensure_zpool_ready.
Perhaps there's a race or something out of order here: the sled agent waits for storage.resources().boot_disk().await before attempting to configure the swap and backing filesystems.
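To make the "parent does not exist" reasoning concrete, here is a minimal sketch that lists which ancestors of a dataset are absent from a set of known datasets. It is not how sled-agent checks this: `missing_ancestors` is a hypothetical helper, and the set of existing datasets is made up for the example.

```rust
use std::collections::BTreeSet;

/// Returns the proper ancestors of `dataset` (closest to the pool first)
/// that are not present in `existing`. Hypothetical helper for illustration.
fn missing_ancestors(dataset: &str, existing: &BTreeSet<String>) -> Vec<String> {
    let parts: Vec<&str> = dataset.split('/').collect();
    let mut missing = Vec::new();
    // Walk every proper prefix ("pool", "pool/a", ...) but not `dataset` itself.
    for end in 1..parts.len() {
        let ancestor = parts[..end].join("/");
        if !existing.contains(&ancestor) {
            missing.push(ancestor);
        }
    }
    missing
}

fn main() {
    let pool = "oxi_a462a7f7-b628-40fe-80ff-4e4189e2d62b";
    // Assumed state: the pool exists, but `backing` was never created.
    let existing = BTreeSet::from([pool.to_string()]);
    let target = format!("{}/backing/fmd", pool);
    for ancestor in missing_ancestors(&target, &existing) {
        println!("missing parent: {}", ancestor);
    }
}
```

With that assumed state, the only missing ancestor is the `backing` dataset, which matches the error. `zfs create` without `-p` fails in this situation instead of creating the intermediate dataset.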
Looks like the backing dataset is missing from https://github.com/oxidecomputer/omicron/pull/4332/files#diff-11afed7dd7496540ac0f031b4f5f035281f52000b7c482d167f8fc4475688570R67-R87
@citrus-it Thank you so much. This was indeed a poor merge on my part!
I am running omicron as a single node on my helios box and am seeing that the backing fs is failing to get created. I'm on a branch with some significant changes, but none that should affect this AFAICT. I've opened a PR where I'm running into this: https://github.com/oxidecomputer/omicron/pull/4332