Open jordanhendricks opened 1 year ago
I believe I've come across a related issue. I was trying to build a helios-2 iso image as follows.
VARIANT=base ./strap.sh -f
VARIANT=ramdisk ./strap.sh -f
./ufs.sh
SERIAL=vga ./iso.sh
This resulted in the following log output.
Jul 20 17:31:23.216 INFO temporary dataset: rpool/images/tmp/helios/eltorito-efi
Jul 20 17:31:23.225 INFO DESTROY DATASET: rpool/images/tmp/helios/eltorito-efi
Jul 20 17:31:23.288 INFO CREATE DATASET: rpool/images/tmp/helios/eltorito-efi
Jul 20 17:31:23.342 INFO SET DATASET PROPERTY ON rpool/images/tmp/helios/eltorito-efi: sync = disabled
Jul 20 17:31:23.398 INFO temporary directory: /rpool/images/tmp/helios/eltorito-efi
Jul 20 17:31:23.398 INFO work dataset: rpool/images/work/helios/eltorito-efi
Jul 20 17:31:23.411 INFO SET DATASET PROPERTY ON rpool/images/work/helios/eltorito-efi: sync = disabled
Jul 20 17:31:23.470 INFO nothing mounted at "/rpool/images/work/helios/eltorito-efi/a"
Jul 20 17:31:23.483 INFO image file: /rpool/images/work/helios/eltorito-efi/lofi.pcfs
Jul 20 17:31:23.483 INFO lofi exists LofiDevice { id: 1, readonly: false, labeled: false, encrypted: false, filename: "/rpool/images/work/helios/eltorito-efi/lofi.pcfs", devpath: Some("/dev/lofi/1"), rdevpath: Some("/dev/rlofi/1") } -- removing...
Jul 20 17:31:23.483 CRIT fatal error: could not stat path: "/dev/lofi/1": No such file or directory (os error 2)
I also noticed at this point that lofiadm
would hang for around 2 minutes before returning a result. This result did show what appear to be stale entries hanging around.
ry@masaka:~/src/helios-engvm/image$ pfexec lofiadm
Block Device File Options
/dev/lofi/1 /rpool/images/work/helios/eltorito-efi/lofi.pcfs -
/dev/dsk/c4t2d0p0 /rpool/images/work/helios/generic-ttya-full/lofi.raw Labeled
At this point, there was not actually a file /dev/lofi/1
. I could delete the entries with lofiadm -d <device>
and that would allow things to get a bit further, but then image-builder
would hang on another lofi
call and ultimately crash.
This led to getting in a state where lofiadm was returning
unknown` entries.
ry@masaka:~$ pfexec lofiadm
Block Device File Options
unknown /rpool/images/work/helios/generic-vga-iso/output.iso Labeled
/dev/lofi/2 /rpool/images/work/helios/eltorito-efi/lofi.pcfs -
@jclulow had me look at the following.
fmdump -IVp -c 'resource.sysevent.EC_dev_*' -n 'driver_name=lofi'
which showed some storage behavior
Dec 12 2022 16:22:32.702689435 (absent)
nvlist version: 0
phys_path = /pseudo/lofi@1
dev_name = /dev/rdsk/c4t1d0
driver_name = lofi
instance = 1
class = resource.sysevent.EC_dev_remove.disk
version = 0x0
__ttl = 0x1
__tod = 0x6397c5c8 0x29e2309b
Dec 12 2022 16:22:32.703001387 (absent)
nvlist version: 0
phys_path = /pseudo/lofi@1
dev_name = /dev/rdsk/c4t1d0
driver_name = lofi
instance = 1
class = resource.sysevent.EC_dev_remove.disk
version = 0x0
__ttl = 0x1
__tod = 0x6397c5c8 0x29e6f32b
Dec 12 2022 16:22:32.703276047 (absent)
nvlist version: 0
phys_path = /pseudo/lofi@1
dev_name = /dev/rdsk/c4t1d0
driver_name = lofi
instance = 1
class = resource.sysevent.EC_dev_remove.disk
version = 0x0
__ttl = 0x1
__tod = 0x6397c5c8 0x29eb240f
Dec 12 2022 16:22:32.703534911 (absent)
nvlist version: 0
phys_path = /pseudo/lofi@1
dev_name = /dev/rdsk/c4t1d0
driver_name = lofi
instance = 1
class = resource.sysevent.EC_dev_remove.disk
version = 0x0
__ttl = 0x1
__tod = 0x6397c5c8 0x29ef173f
Dec 12 2022 16:22:32.703785103 (absent)
nvlist version: 0
phys_path = /pseudo/lofi@1
dev_name = /dev/rdsk/c4t1d0
driver_name = lofi
instance = 1
class = resource.sysevent.EC_dev_remove.disk
version = 0x0
__ttl = 0x1
__tod = 0x6397c5c8 0x29f2e88f
Dec 12 2022 16:22:32.704027682 (absent)
nvlist version: 0
phys_path = /pseudo/lofi@1
dev_name = /dev/rdsk/c4t1d0
driver_name = lofi
instance = 1
class = resource.sysevent.EC_dev_remove.disk
version = 0x0
__ttl = 0x1
__tod = 0x6397c5c8 0x29f69c22
Dec 12 2022 16:22:32.704252953 (absent)
nvlist version: 0
phys_path = /pseudo/lofi@1
dev_name = /dev/rdsk/c4t1d0
driver_name = lofi
instance = 1
class = resource.sysevent.EC_dev_remove.disk
version = 0x0
__ttl = 0x1
__tod = 0x6397c5c8 0x29fa0c19
Dec 12 2022 16:22:32.704467186 (absent)
nvlist version: 0
phys_path = /pseudo/lofi@1
dev_name = /dev/rdsk/c4t1d0
driver_name = lofi
instance = 1
class = resource.sysevent.EC_dev_remove.disk
version = 0x0
__ttl = 0x1
__tod = 0x6397c5c8 0x29fd50f2
Dec 12 2022 16:22:32.704667303 (absent)
nvlist version: 0
phys_path = /pseudo/lofi@1
dev_name = /dev/rdsk/c4t1d0
driver_name = lofi
instance = 1
class = resource.sysevent.EC_dev_remove.disk
version = 0x0
__ttl = 0x1
__tod = 0x6397c5c8 0x2a005ea7
Dec 12 2022 16:22:32.704857210 (absent)
nvlist version: 0
phys_path = /pseudo/lofi@1
dev_name = /dev/rdsk/c4t1d0
driver_name = lofi
instance = 1
class = resource.sysevent.EC_dev_remove.disk
version = 0x0
__ttl = 0x1
__tod = 0x6397c5c8 0x2a03447a
Dec 12 2022 16:22:32.705041754 (absent)
nvlist version: 0
phys_path = /pseudo/lofi@1
dev_name = /dev/rdsk/c4t1d0
driver_name = lofi
instance = 1
class = resource.sysevent.EC_dev_remove.disk
version = 0x0
__ttl = 0x1
__tod = 0x6397c5c8 0x2a06155a
Dec 12 2022 16:22:32.705209568 (absent)
nvlist version: 0
phys_path = /pseudo/lofi@1
dev_name = /dev/rdsk/c4t1d0
driver_name = lofi
instance = 1
class = resource.sysevent.EC_dev_remove.disk
version = 0x0
__ttl = 0x1
__tod = 0x6397c5c8 0x2a08a4e0
Dec 12 2022 16:22:32.705371408 (absent)
nvlist version: 0
phys_path = /pseudo/lofi@1
dev_name = /dev/rdsk/c4t1d0
driver_name = lofi
instance = 1
class = resource.sysevent.EC_dev_remove.disk
version = 0x0
__ttl = 0x1
__tod = 0x6397c5c8 0x2a0b1d10
Dec 12 2022 16:22:32.705521513 (absent)
nvlist version: 0
phys_path = /pseudo/lofi@1
dev_name = /dev/rdsk/c4t1d0
driver_name = lofi
instance = 1
class = resource.sysevent.EC_dev_remove.disk
version = 0x0
__ttl = 0x1
__tod = 0x6397c5c8 0x2a0d6769
Dec 12 2022 16:22:32.705652682 (absent)
nvlist version: 0
phys_path = /pseudo/lofi@1
dev_name = /dev/rdsk/c4t1d0
driver_name = lofi
instance = 1
class = resource.sysevent.EC_dev_remove.disk
version = 0x0
__ttl = 0x1
__tod = 0x6397c5c8 0x2a0f67ca
Dec 12 2022 16:22:32.705775749 (absent)
nvlist version: 0
phys_path = /pseudo/lofi@1
dev_name = /dev/rdsk/c4t1d0
driver_name = lofi
instance = 1
class = resource.sysevent.EC_dev_remove.disk
version = 0x0
__ttl = 0x1
__tod = 0x6397c5c8 0x2a114885
Dec 12 2022 16:22:32.705888134 (absent)
nvlist version: 0
phys_path = /pseudo/lofi@1
dev_name = /dev/rdsk/c4t1d0
driver_name = lofi
instance = 1
class = resource.sysevent.EC_dev_remove.disk
version = 0x0
__ttl = 0x1
__tod = 0x6397c5c8 0x2a12ff86
Jul 20 2023 15:49:37.005335312 (absent)
nvlist version: 0
phys_path = /pseudo/lofi@1
dev_name = /dev/rlofi/1
driver_name = lofi
instance = 1
prop-ddi-kernel-ioctl = 1
prop-zone = global
prop-instance = 1
class = resource.sysevent.EC_dev_add.lofi
version = 0x0
__ttl = 0x1
__tod = 0x64b9ba01 0x516910
Jul 20 2023 16:18:58.388316169 (absent)
nvlist version: 0
phys_path = /pseudo/lofi@1
dev_name = /dev/rlofi/1
driver_name = lofi
instance = 1
class = resource.sysevent.EC_dev_remove.lofi
version = 0x0
__ttl = 0x1
__tod = 0x64b9c0e2 0x17253c09
Jul 20 2023 16:19:00.005279457 (absent)
nvlist version: 0
phys_path = /pseudo/lofi@1
dev_name = /dev/rlofi/1
driver_name = lofi
instance = 1
prop-ddi-kernel-ioctl = 1
prop-zone = global
prop-instance = 1
class = resource.sysevent.EC_dev_add.lofi
version = 0x0
__ttl = 0x1
__tod = 0x64b9c0e4 0x508ee1
Jul 20 2023 16:19:28.004701940 (absent)
nvlist version: 0
phys_path = /pseudo/lofi@2
dev_name = /dev/rdsk/c4t2d0
driver_name = lofi
instance = 2
prop-ddi-kernel-ioctl = 1
prop-zone = global
prop-labeled = 1
prop-instance = 2
prop-device-nblocks = 9830400
prop-device-blksize = 512
prop-device-solid-state = 0
prop-device-rotational = 0
class = resource.sysevent.EC_dev_add.disk
version = 0x0
__ttl = 0x1
__tod = 0x64b9c100 0x47bef4
Jul 20 2023 16:19:28.325764145 (absent)
nvlist version: 0
phys_path = /pseudo/lofi@2
dev_name = /dev/rdsk/c4t2d0
driver_name = lofi
instance = 2
class = resource.sysevent.EC_dev_remove.disk
version = 0x0
__ttl = 0x1
__tod = 0x64b9c100 0x136ac431
Jul 20 2023 16:19:28.326938797 (absent)
nvlist version: 0
phys_path = /pseudo/lofi@2
dev_name = /dev/rdsk/c4t2d0
driver_name = lofi
instance = 2
class = resource.sysevent.EC_dev_remove.disk
version = 0x0
__ttl = 0x1
__tod = 0x64b9c100 0x137cb0ad
Jul 20 2023 16:19:30.008251078 (absent)
nvlist version: 0
phys_path = /pseudo/lofi@2
dev_name = /dev/rdsk/c4t2d0
driver_name = lofi
instance = 2
prop-ddi-kernel-ioctl = 1
prop-zone = global
prop-labeled = 1
prop-instance = 2
prop-device-nblocks = 9830400
prop-device-blksize = 512
prop-device-solid-state = 0
prop-device-rotational = 0
class = resource.sysevent.EC_dev_add.disk
version = 0x0
__ttl = 0x1
__tod = 0x64b9c102 0x7de6c6
Jul 20 2023 16:19:30.008296120 (absent)
nvlist version: 0
phys_path = /pseudo/lofi@2
dev_name = /dev/rdsk/c4t2d0
driver_name = lofi
instance = 2
prop-ddi-kernel-ioctl = 1
prop-zone = global
prop-labeled = 1
prop-instance = 2
prop-device-nblocks = 9830400
prop-device-blksize = 512
prop-device-solid-state = 0
prop-device-rotational = 0
class = resource.sysevent.EC_dev_add.disk
version = 0x0
__ttl = 0x1
__tod = 0x64b9c102 0x7e96b8
Jul 20 2023 16:19:30.008328969 (absent)
nvlist version: 0
phys_path = /pseudo/lofi@2
dev_name = /dev/rdsk/c4t2d0
driver_name = lofi
instance = 2
prop-ddi-kernel-ioctl = 1
prop-zone = global
prop-labeled = 1
prop-instance = 2
prop-device-nblocks = 9830400
prop-device-blksize = 512
prop-device-solid-state = 0
prop-device-rotational = 0
class = resource.sysevent.EC_dev_add.disk
version = 0x0
__ttl = 0x1
__tod = 0x64b9c102 0x7f1709
Jul 20 2023 16:19:30.008349781 (absent)
nvlist version: 0
phys_path = /pseudo/lofi@2
dev_name = /dev/rdsk/c4t2d0
driver_name = lofi
instance = 2
prop-ddi-kernel-ioctl = 1
prop-zone = global
prop-labeled = 1
prop-instance = 2
prop-device-nblocks = 9830400
prop-device-blksize = 512
prop-device-solid-state = 0
prop-device-rotational = 0
class = resource.sysevent.EC_dev_add.disk
version = 0x0
__ttl = 0x1
__tod = 0x64b9c102 0x7f6855
Jul 20 2023 16:51:06.287388297 (absent)
nvlist version: 0
phys_path = /pseudo/lofi@1
dev_name = /dev/rlofi/1
driver_name = lofi
instance = 1
class = resource.sysevent.EC_dev_remove.lofi
version = 0x0
__ttl = 0x1
__tod = 0x64b9c86a 0x11213289
Jul 20 2023 16:51:07.146319384 (absent)
nvlist version: 0
phys_path = /pseudo/lofi@1
dev_name = /dev/rlofi/1
driver_name = lofi
instance = 1
class = resource.sysevent.EC_dev_remove.lofi
version = 0x0
__ttl = 0x1
__tod = 0x64b9c86b 0x8b8a818
Jul 20 2023 16:51:09.005261588 (absent)
nvlist version: 0
phys_path = /pseudo/lofi@1
dev_name = /dev/rlofi/1
driver_name = lofi
instance = 1
prop-ddi-kernel-ioctl = 1
prop-zone = global
prop-instance = 1
class = resource.sysevent.EC_dev_add.lofi
version = 0x0
__ttl = 0x1
__tod = 0x64b9c86d 0x504914
Jul 20 2023 16:51:55.934344624 (absent)
nvlist version: 0
phys_path = /pseudo/lofi@1
dev_name = /dev/rlofi/1
driver_name = lofi
instance = 1
class = resource.sysevent.EC_dev_remove.lofi
version = 0x0
__ttl = 0x1
__tod = 0x64b9c89b 0x37b0f7b0
Jul 20 2023 16:51:57.005890715 (absent)
nvlist version: 0
phys_path = /pseudo/lofi@1
dev_name = /dev/rlofi/1
driver_name = lofi
instance = 1
prop-ddi-kernel-ioctl = 1
prop-zone = global
prop-instance = 1
class = resource.sysevent.EC_dev_add.lofi
version = 0x0
__ttl = 0x1
__tod = 0x64b9c89d 0x59e29b
On a call to lofiadm
that was hanging, we would see this in mdb
> 0t16130::pid2proc | ::walk thread | ::findstack -v
stack pointer for thread fffffe2789251020 (lofiadm/1): fffffcc28a1113c0
[ fffffcc28a1113c0 _resume_from_idle+0x12b() ]
fffffcc28a1113f0 swtch+0x139()
fffffcc28a111470 cv_timedwait_hires+0xd7(fffffffffbcc15d8, fffffffffbcc15d0, 1bf08eb000, f4240, 0)
fffffcc28a1114c0 cv_timedwait+0x52(fffffffffbcc15d8, fffffffffbcc15d0, f63d817)
fffffcc28a111620 lofi_copy_devpath+0xfb(fffffe27b0f5b500)
fffffcc28a1116b0 lofi_get_info+0x218(7a00000000, 804155c, 4c4605, fffffe29d6682cb0, 100401)
fffffcc28a111c80 lofi_ioctl+0x7de(7a00000000, 4c4605, 804155c, 100401, fffffe29d6682cb0, fffffcc28a111dd8)
fffffcc28a111cc0 cdev_ioctl+0x3f(7a00000000, 4c4605, 804155c, 100401, fffffe29d6682cb0, fffffcc28a111dd8)
fffffcc28a111d10 spec_ioctl+0x55(fffffe2634e2ce40, 4c4605, 804155c, 100401, fffffe29d6682cb0, fffffcc28a111dd8, 0)
fffffcc28a111da0 fop_ioctl+0x40(fffffe2634e2ce40, 4c4605, 804155c, 100401, fffffe29d6682cb0, fffffcc28a111dd8, 0)
fffffcc28a111ec0 ioctl+0x144(3, 4c4605, 804155c)
fffffcc28a111f10 sys_syscall32+0x105()
After building and installing an ONU with the following commit reverted, the image build process ran to completion without issue.
4b2d0200492d3957fe8ea5057f7fac75b4aa21d0
5688 blkdev and lofi need to properly init cmlb minors
In case it's relevant at all, I saw this issue on heliosv1 (as dunkin had not yet been updated yet).
I have made a small adjustment to illumos/image-builder to more closely align what it does with what the lofiadm
command does: https://github.com/illumos/image-builder/commit/3ff1e02ac8cb8de0d0fc07ed51f78e914916f39e
If you gmake setup
in the image/
subdirectory it should pull this update in and rebuild.
If you do those things, and pain persists, let me know! I haven't seen the issue again, so far, since doing that...
This may well still be happening. I believe @jordanhendricks saw it again on her machine yesterday.
Indeed I did. My machine has version helios-2.0.22094
, and helios-engvm is at dfa3f8b6
.
I saw this on helios-engvm at cfa1eeba940d85efdaf881a53e56288c5df42b4b
I've seen this on two machines. When running
image.sh
, the script will sometimes hang here:The temporary zpool is in a sad state:
I was able to work around this as follows. First with a
zpool export -f <temp pool name>
(which will hang) and azpool clear
of the pool until the export finishes, then deleting the device inlofiadm
.After that, I was able to run the script again and successfully create an image.