Closed: EdwardMoyse closed this issue 2 months ago.
In case it is relevant, I was compiling in a separate APFS (Case-sensitive) Volume as described here. This volume seems absolutely fine - so the corruption seems limited to the VM itself. I can't see how this could have happened with 100 GB, but I wonder if it's possible that it ran out of space? I could try increasing the disk size, but the whole point of using an external volume was that this would not be necessary.
> --vm-type=vz ?

> dmesg ?

Hmm. I just tried again, but compiling in /tmp rather than the case-sensitive volume, and this worked fine. A colleague has confirmed a similar experience - problems with /Volumes/Lima, but it works fine in /tmp. So my best guess right now is that it is some interaction between an APFS Volume and Lima (which might also explain the following "stuck VM" discussion: https://github.com/lima-vm/lima/discussions/1666)
Answering your other questions:

> qemu ?

I will try, but this is so slow that it is very hard to do. I will also try again with dmesg running.

You would get much better performance with a local filesystem, as well. If you want to keep it separate from the OS image, you could add a native Lima disk image using the `limactl disk` command, and copy the results out when the build is done.
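A minimal sketch of that workflow (the disk name and size are illustrative, and the `additionalDisks` schema may vary between Lima versions):

```sh
# Create a named Lima data disk (stored under ~/.lima/_disks by default).
limactl disk create builddata --size 200GiB
```

Then attach it in the VM's template:

```yaml
# Attach the named disk to the VM; older Lima versions take a bare string list.
additionalDisks:
  - "builddata"
```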
EDIT: One potential feature could be to be able to create disk images on an attached disk, instead of under LIMA_HOME.
You can probably use symlinks from _disks as a workaround, but it would be better with some optional flag support...
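A hedged sketch of that symlink workaround (the paths are illustrative; stop all VMs first):

```sh
# Relocate the Lima disk store onto the attached volume, then link it back.
mv ~/.lima/_disks /Volumes/External/lima-disks
ln -s /Volumes/External/lima-disks ~/.lima/_disks
```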
If you really need to access 100 GiB from the host, then we might have to add some more "boring" mount type like NFS... It seems like sftpd and virtiofsd, and also their Linux clients, have some stability issues when put under pressure?
> EDIT: One potential feature could be to be able to create disk images on an attached disk, instead of under LIMA_HOME. You can probably use symlinks from _disks as a workaround, but it would be better with some optional flag support...
This would, I think, really help us.
Our use case is this: we want to be able to edit files from within macOS, but then compile inside AlmaLinux 9. The codebase we are compiling is relatively large (>4 million lines of C++) and can take up to 400 GB of temporary compilation space. I was reluctant to make separate VMs with this much local storage, especially since a lot of us will be working on laptops. Ideally we would have a large build area (possibly on an external drive), accessible from several VMs, and with very fast disk I/O from the VM (since otherwise the build time can become unusably slow). We do NOT, in general, need to be able to access this build area from the host (at least, not with fast I/O - it would mainly be to examine compilation failures).
(I will get back to the other tests shortly - but I'm currently travelling with limited work time, and it seems very likely that the issue is related to compiling outside the VM)
> (I will get back to the other tests shortly - but I'm currently travelling with limited work time, and it seems very likely that the issue is related to compiling outside the VM)
I'm not sure how virtiofs affects the XFS disk, but maybe this issue should be reported to Apple?
I was under the impression that the problem was with the /Volumes/Lima mount, but the logs say vda2...
- location: /Volumes/Lima
writable: true
So the remote filesystem is a separate topic* from this ARM64 disk corruption. Sorry for the added noise.
Though I don't see how switching from remote /Volumes/Lima to local /tmp could have helped, then...
* should continue in a different discussion
Note that disk images cannot be shared... (they can be unplugged and remounted)
Is this relevant?
(UTM uses vz too)
Looks like people began to hit this issue in September, so I wonder if Apple introduced a regression around that time?
I still can't repro the issue locally though. (macOS 14.1 on Intel MacBookPro 2020, macOS 13.5.2 on EC2 mac2-m2pro)
Can anybody confirm this rumor?
https://github.com/utmapp/UTM/issues/4840#issuecomment-1764436352
> Is it just me, or does deactivating ballooning solve the problem? I deactivated it two weeks ago, and no problem since on my side.
Removing these lines will disable ballooning: https://github.com/lima-vm/lima/blob/7cb2b2e66215dd5f0aac280375645eec67550db4/pkg/vz/vm_darwin.go#L598-L604
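For context, a rough sketch of what the linked lines set up, assuming the Code-Hex/vz bindings that Lima uses (this approximates, not reproduces, the vm_darwin.go source):

```go
package main

import (
	"github.com/Code-Hex/vz/v3"
)

// attachBalloon approximates the balloon-device setup in the linked section;
// skipping this call is what "removing these lines" amounts to.
func attachBalloon(vmConfig *vz.VirtualMachineConfiguration) error {
	// Create a virtio traditional memory balloon device configuration.
	balloon, err := vz.NewVirtioTraditionalMemoryBalloonDeviceConfiguration()
	if err != nil {
		return err
	}
	// Attach it to the VM configuration.
	vmConfig.SetMemoryBalloonDevicesVirtualMachineConfiguration(
		[]vz.MemoryBalloonDeviceConfiguration{balloon},
	)
	return nil
}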
For what it's worth, I believe I've narrowed down the problem that I've noticed in https://github.com/utmapp/UTM/issues/4840 to having used an external SSD drive. I've not reproduced the corruption if the VM lives on my Mac's internal storage.
@EdwardMoyse Your separate APFS volume... is it on the same storage device that your Mac runs on, or is it a separate external device?
@AkihiroSuda I've not seen disabling the Balloon device help with preventing corruption. At least, if I'm working with a QEMU-based VM that lives on my external SSD storage, it has Balloon Device un-checked by default, and the VM's filesystem will eventually corrupt under heavy disk load. So I believe this is a red herring.
> I'm working with a QEMU-based VM

Probably you are hitting a different issue with a similar symptom?
@wdormann my APFS Volume is on the same device (SSD) as macOS. It's not an external device in my case.
Thanks for the input. I've been testing the disk itself, and it has yet to report errors. Given your successful test in /tmp, these both seem to point to a problem using a non-OS volume for the underlying VM OS storage?
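(For anyone doing the same host-side check, one non-destructive option is macOS's built-in verifier; the volume path here is an example:)

```sh
# Read-only check of the APFS volume's structures on the host.
diskutil verifyVolume /Volumes/Lima
```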
I think I reproduced the issue with the default Ubuntu template:
[ 299.527200] EXT4-fs error (device vda1): ext4_lookup:1851: inode #3793: comm apport: iget: checksum invalid
[ 299.527255] Aborting journal on device vda1-8.
[ 299.527293] EXT4-fs error (device vda1): ext4_journal_check_start:83: comm cp: Detected aborted journal
[ 299.528985] EXT4-fs error (device vda1): ext4_journal_check_start:83: comm rs:main Q:Reg: Detected aborted journal
[ 299.530464] EXT4-fs (vda1): Remounting filesystem read-only
[ 299.530515] EXT4-fs error (device vda1): ext4_lookup:1851: inode #3794: comm apport: iget: checksum invalid
[ 299.535137] EXT4-fs error (device vda1): ext4_lookup:1851: inode #3795: comm apport: iget: checksum invalid
[ 299.538878] EXT4-fs error (device vda1): ext4_lookup:1851: inode #3796: comm apport: iget: checksum invalid
[ 299.543827] EXT4-fs error (device vda1): ext4_lookup:1851: inode #3797: comm apport: iget: checksum invalid
[ 299.550614] EXT4-fs error (device vda1): ext4_lookup:1851: inode #3798: comm apport: iget: checksum invalid
[ 299.551947] EXT4-fs error (device vda1): ext4_lookup:1851: inode #3799: comm apport: iget: checksum invalid
[ 299.553651] EXT4-fs error (device vda1): ext4_lookup:1851: inode #3800: comm apport: iget: checksum invalid
[ 299.821872] audit: type=1131 audit(1698675832.913:271): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=unconfined msg='unit=systemd-journald comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
[ 299.821967] BUG: Bad rss-counter state mm:0000000013fa5858 type:MM_FILEPAGES val:43
[ 299.821980] BUG: Bad rss-counter state mm:0000000013fa5858 type:MM_ANONPAGES val:3
[ 299.821982] BUG: non-zero pgtables_bytes on freeing mm: 4096
[ 299.822551] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000070
[ 299.822566] Mem abort info:
[ 299.822566] ESR = 0x0000000096000004
[ 299.822568] EC = 0x25: DABT (current EL), IL = 32 bits
[ 299.822569] SET = 0, FnV = 0
[ 299.822570] EA = 0, S1PTW = 0
[ 299.822570] FSC = 0x04: level 0 translation fault
[ 299.822571] Data abort info:
[ 299.822572] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[ 299.822573] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 299.822574] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 299.822575] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000100970000
[ 299.822576] [0000000000000070] pgd=0000000000000000, p4d=0000000000000000
[ 299.822604] Internal error: Oops: 0000000096000004 [#1] SMP
[ 299.822615] Modules linked in: tls nft_chain_nat overlay xt_tcpudp xt_nat xt_multiport xt_mark xt_conntrack xt_comment xt_addrtype xt_MASQUERADE nf_tables nfnetlink ip6table_filter iptable_filter ip6table_nat iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6_tables veth bridge stp llc tap isofs binfmt_misc nls_iso8859_1 vmw_vsock_virtio_transport vmw_vsock_virtio_transport_common vsock virtiofs joydev input_leds drm
[ 299.822800] Unable to handle kernel paging request at virtual address fffffffffffffff8
[ 299.822805] Mem abort info:
[ 299.822805] ESR = 0x0000000096000004
[ 299.822806] EC = 0x25: DABT (current EL), IL = 32 bits
[ 299.822807] SET = 0, FnV = 0
[ 299.822808] EA = 0, S1PTW = 0
[ 299.822809] FSC = 0x04: level 0 translation fault
[ 299.822810] Data abort info:
[ 299.822810] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[ 299.822811] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 299.822812] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 299.822813] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000864e50000
[ 299.822814] [fffffffffffffff8] pgd=0000000000000000, p4d=0000000000000000
[ 361.102020] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 361.102094] rcu: 1-...0: (1 GPs behind) idle=e0b4/1/0x4000000000000000 softirq=23608/23609 fqs=6997
[ 361.102102] rcu: hardirqs softirqs csw/system
[ 361.102103] rcu: number: 0 0 0
[ 361.102104] rcu: cputime: 0 0 0 ==> 30000(ms)
[ 361.102105] rcu: (detected by 3, t=15002 jiffies, g=38213, q=860 ncpus=4)
[ 361.102107] Task dump for CPU 1:
[ 361.102108] task:systemd state:S stack:0 pid:1 ppid:0 flags:0x00000002
[ 361.102111] Call trace:
[ 361.102118] __switch_to+0xc0/0x108
[ 361.102180] seccomp_filter_release+0x40/0x78
[ 361.102203] release_task+0xf0/0x238
[ 361.102216] wait_task_zombie+0x124/0x5c8
[ 361.102218] wait_consider_task+0x244/0x3c0
[ 361.102220] do_wait+0x178/0x338
[ 361.102222] kernel_waitid+0x100/0x1e8
[ 361.102224] __do_sys_waitid+0x2bc/0x378
[ 361.102226] __arm64_sys_waitid+0x34/0x60
[ 361.102228] invoke_syscall+0x7c/0x128
[ 361.102230] el0_svc_common.constprop.0+0x5c/0x168
[ 361.102231] do_el0_svc+0x38/0x68
[ 361.102232] el0_svc+0x30/0xe0
[ 361.102234] el0t_64_sync_handler+0x148/0x158
[ 361.102236] el0t_64_sync+0x1b0/0x1b8
[ 541.118359] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 541.118368] rcu: 1-...0: (1 GPs behind) idle=e0b4/1/0x4000000000000000 softirq=23608/23609 fqs=27191
[ 541.118371] rcu: hardirqs softirqs csw/system
[ 541.118372] rcu: number: 0 0 0
[ 541.118373] rcu: cputime: 0 0 0 ==> 210020(ms)
[ 541.118375] rcu: (detected by 3, t=60007 jiffies, g=38213, q=1790 ncpus=4)
[ 541.118377] Task dump for CPU 1:
[ 541.118379] task:systemd state:S stack:0 pid:1 ppid:0 flags:0x00000002
[ 541.118382] Call trace:
[ 541.118383] __switch_to+0xc0/0x108
[ 541.118390] seccomp_filter_release+0x40/0x78
[ 541.118393] release_task+0xf0/0x238
[ 541.118396] wait_task_zombie+0x124/0x5c8
[ 541.118399] wait_consider_task+0x244/0x3c0
[ 541.118401] do_wait+0x178/0x338
[ 541.118403] kernel_waitid+0x100/0x1e8
[ 541.118405] __do_sys_waitid+0x2bc/0x378
[ 541.118407] __arm64_sys_waitid+0x34/0x60
[ 541.118409] invoke_syscall+0x7c/0x128
[ 541.118411] el0_svc_common.constprop.0+0x5c/0x168
[ 541.118412] do_el0_svc+0x38/0x68
[ 541.118413] el0_svc+0x30/0xe0
[ 541.118415] el0t_64_sync_handler+0x148/0x158
[ 541.118417] el0t_64_sync+0x1b0/0x1b8
(Non-minimum, non-deterministic) repro steps:

1. Create a `mac2-m2pro` (32GB RAM) instance on EC2, with a macOS 13.5.2 AMI and a `gp2` EBS volume
2. Install Lima v0.18.0
3. Run `limactl start --vm-type=vz --cpus=4 --memory=32 --disk=100 --name=vm1`
4. Run `limactl start --vm-type=vz --cpus=4 --memory=32 --disk=100 --name=vm2`
5. For each of the VMs, run `cp -a /Users/ec2-user/some-large-directory ~`

Some of them may fail with `cp: ...: Read-only filesystem`
% mount
/dev/disk5s2s1 on / (apfs, sealed, local, read-only, journaled)
devfs on /dev (devfs, local, nobrowse)
/dev/disk5s5 on /System/Volumes/VM (apfs, local, noexec, journaled, noatime, nobrowse)
/dev/disk5s3 on /System/Volumes/Preboot (apfs, local, journaled, nobrowse)
/dev/disk1s2 on /System/Volumes/xarts (apfs, local, noexec, journaled, noatime, nobrowse)
/dev/disk1s1 on /System/Volumes/iSCPreboot (apfs, local, journaled, nobrowse)
/dev/disk1s3 on /System/Volumes/Hardware (apfs, local, journaled, nobrowse)
/dev/disk5s1 on /System/Volumes/Data (apfs, local, journaled, nobrowse)
map auto_home on /System/Volumes/Data/home (autofs, automounted, nobrowse)
/dev/disk3s4 on /private/tmp/tmp-mount-mDoJ7V (apfs, local, journaled, nobrowse)
% stat -f %Sd /
disk5s1
% stat -f %Sd /Users/ec2-user/.lima
disk5s1
The VM disk is located in the default path `~/.lima`.
I tried removing the balloon device, but the filesystem still seems to break intermittently:
[ 1674.027587] EXT4-fs error (device vda1): ext4_lookup:1851: inode #35601: comm apport: iget: checksum invalid
[ 1674.030317] Aborting journal on device vda1-8.
[ 1674.031818] EXT4-fs error (device vda1): ext4_journal_check_start:83: comm rs:main Q:Reg: Detected aborted journal
[ 1674.031896] EXT4-fs error (device vda1): ext4_journal_check_start:83: comm systemd-journal: Detected aborted journal
[ 1674.033116] EXT4-fs (vda1): Remounting filesystem read-only
[ 1674.033147] EXT4-fs error (device vda1): ext4_lookup:1851: inode #35602: comm apport: iget: checksum invalid
[ 1674.036501] EXT4-fs error (device vda1): ext4_lookup:1851: inode #35603: comm apport: iget: checksum invalid
[ 1674.037738] EXT4-fs error (device vda1): ext4_lookup:1851: inode #35604: comm apport: iget: checksum invalid
[ 1674.038828] EXT4-fs error (device vda1): ext4_lookup:1851: inode #35605: comm apport: iget: checksum invalid
[ 1674.040034] EXT4-fs error (device vda1): ext4_lookup:1851: inode #35606: comm apport: iget: checksum invalid
[ 1674.041091] EXT4-fs error (device vda1): ext4_lookup:1851: inode #35606: comm apport: iget: checksum invalid
[ 1674.042199] EXT4-fs error (device vda1): ext4_lookup:1851: inode #35606: comm apport: iget: checksum invalid
> Thanks for the input. I've been testing the disk itself, and it has yet to report errors. Given your successful test in /tmp, these both seem to point to a problem using a non-OS volume for the underlying VM OS storage?
Perhaps I'm misunderstanding you, but I don't think I am using a "non-OS volume for the underlying VM OS storage".
For clarity, here is my setup: I start the VM with `limactl start almalinux9.yaml --name=alma9`, and the VM exists on the main macOS volume. If I compile in `/Volumes/Lima` I get disk corruption; if I use `/tmp` for the same operation, it works fine. So I would characterise this rather as a problem using a non-OS volume for the intensive disk operations from within the VM.
I'll admit I'm not familiar with Lima. When you say "make it mountable from within the VM", what does that mean?
Perhaps Lima does this all for you under the hood, but I suppose that I'd need to know exactly what it's doing to have any hope of understanding what's going on.
> I'll admit I'm not familiar with Lima. When you say "make it mountable from within the VM", what does that mean?
- You have a virtual hard disk file that lives on that separate APFS volume, and your VM is configured to have that as a second disk drive?
- You boot the VM, and somehow from Linux user/kernel land mount your /Volumes/Lima directory? (How?)
It's the latter (but I cannot tell you any technicalities of how it works). From within both the host and the VM I can access /Volumes/Lima. See https://lima-vm.io/docs/config/mount/
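For reference, the relevant part of a Lima template looks roughly like this (a sketch based on the mount docs; the `mountType` value is an assumption about this particular setup):

```yaml
# Share a host directory into the guest at the same path.
mounts:
  - location: "/Volumes/Lima"
    writable: true
# Driver used for shared mounts: reverse-sshfs (default), 9p, or virtiofs.
mountType: "virtiofs"
```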
Do you specify a mount type in your limactl command line and/or config file? Or, from the VM, what does the mount command report for the filesystem in question?
It is still a mystery how problems on a remote filesystem can "spread" to cause I/O errors on a local filesystem... This points to a bug in the hypervisor, or even the host OS and CPU arch? Unless it turns out to be an EL9 guest issue, not seen on x86_64 but only on aarch64.
The documentation says that all filesystem types other than reverse-sshfs are "experimental".
@afbjorklund Your earlier comment suggested that /dev/vda (virtiofs) was how it was being mounted.
Perhaps those looking for a temporary workaround could try using reverse-sshfs instead?
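A hedged example of that workaround, assuming a Lima version with the --mount-type flag (the template name is illustrative):

```sh
# Use the default reverse-sshfs driver for shared mounts instead of virtiofs.
limactl start --vm-type=vz --mount-type=reverse-sshfs template://default
```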
virtiofs doesn't seem relevant. UTM users seem hitting the same issue without using virtiofs: https://github.com/utmapp/UTM/issues/4840
The issue seems also reproducible with Apple's example: https://github.com/utmapp/UTM/issues/4840#issuecomment-1786762407
Moreover: I have tried this example Xcode project (https://developer.apple.com/documentation/virtualization/running_gui_linux_in_a_virtual_machine_on_a_mac) and it has the same issues. It's pretty clear to me that the issue is not UTM-related but Apple + Linux related, but I haven't found any other discussion forum. Moreover, the UTM community may be more successful should they raise this issue with the Linux kernel team or Apple.
> @afbjorklund Your earlier comment suggested that /dev/vda (virtiofs) was how it was being mounted.
The screenshot of a log above showed "XFS (vda2)" as the device in question - so not the virtiofs mount? It was showing I/O errors in both /usr/include and /bin/ls; those are not mounted and not on /Volumes/Lima. It is using virtio (otherwise it would be called sda2), but it is not using virtiofs, the remote filesystem (https://virtio-fs.gitlab.io/). (The names here are somewhat confusing; there is also virtfs, which is called 9p.)
So this issue is about a recent problem on Apple platforms.
Then we can have a different discussion about building on network filesystems, instead of on local filesystems.
I was just curious about the comment that moving the build to /tmp seems to have "cured" the corruption...
https://github.com/lima-vm/lima/issues/1957#issuecomment-1784120488
Ha, 6.5.0? That one in particular is completely broken. Needs this patch. If your package doesn't have it backported, there's your problem.
6.4 should be fine, as should 6.5.6 according to the changelog.
(Our Asahi tree is currently on 6.5.0 with that patch cherry-picked. And yes, that is the second time ARM64 atomics got broken!)
Originally posted by [@]marcan in https://github.com/utmapp/UTM/issues/4840#issuecomment-1790843588
Can anybody try kernel 6.6? (Just released 3 days ago).
As far as I know, AlmaLinux 9.2 is running kernel 5.14 : https://wiki.almalinux.org/release-notes/9.2.html
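For anyone wanting to test a newer kernel in the AlmaLinux guest, a hedged sketch using the ELRepo mainline kernel (this assumes ELRepo ships kernel-ml for your guest architecture; the URL and package names are as published by ELRepo):

```sh
# Inside the AlmaLinux 9 guest: install the ELRepo mainline kernel and reboot.
sudo dnf install -y https://www.elrepo.org/elrepo-release-9.el9.elrepo.noarch.rpm
sudo dnf --enablerepo=elrepo-kernel install -y kernel-ml
sudo reboot
# After reboot, confirm the running kernel version:
uname -r
```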
ARM64 atomics have been broken until last year, when I found the issue and got it fixed (it was breaking workqueues which was causing problems with TTYs for me, but who knows what else). 5.14 (released 2021) is definitely broken unless it's a branch with all the required backports.
Try 6.4, that should work. 6.5.0 was a very recent regression. I would not put much faith in older kernels, especially anything older than 5.18 which is where we started. All bets are off if you're running kernels that old on bleeding edge hardware like this. Lots of bugfixes don't get properly backported into stable branches either. Apple CPUs are excellent at triggering all kinds of nasty memory ordering bugs that no other CPUs do, because they speculate/reorder across ridiculous numbers of instructions and even things like IRQs (yes really).
So that means qemu only, unless running Fedora*? It seems like Virtualization.framework exposes more of the CPU.

* or Ubuntu 23.10 <-- needs backport
Probably should get automatic updates in place, since otherwise Fedora 38 will run 6.2.9 until the user remembers*...

* to upgrade to 6.5.8
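A hedged sketch of enabling that on a Fedora guest, using the stock dnf-automatic tooling:

```sh
# Install unattended updates and turn on automatic application of them.
sudo dnf install -y dnf-automatic
sudo sed -i 's/^apply_updates = no/apply_updates = yes/' /etc/dnf/automatic.conf
sudo systemctl enable --now dnf-automatic.timer
```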
> I was just curious about the comment that moving the build to /tmp seems to have "cured" the corruption...
Hey @afbjorklund, I've been running some more tests, and I just had corruption from /tmp, so it doesn't cure it (but perhaps it makes corruption slightly less likely). Updating the original post.
My apologies for the delay in replying, but I have been looking into this. The workflow is the same - compile https://gitlab.cern.ch/atlas/atlasexternals using the attached template with various configurations of host, qemu/vz, cores and memory.

TL;DR: updating to 6.5.10-1 was more stable on M2 (even on the 'shared' volume /tmp/lima), but apparently worse on M1 Pro (though the M1 Pro has more cores and we pushed it a lot harder). Updating to 6.6.1 was better on M1 Pro (have not tested M2 yet), but I got xfs corruption at the very end.

With 6.6.1 I also disabled sleeping on the guest:

sudo systemctl mask sleep.target suspend.target hibernate.target hybrid-sleep.target

(from a hint here)
| VM Type | Kernel | Cores | RAM (GB) | Where | Attempt 1 | Attempt 2 | Attempt 3 | Host Processor |
|---|---|---|---|---|---|---|---|---|
| qemu | 5.14 | 6 | 24 | /tmp | Crash + xfs | Crash + xfs | Crash + xfs | M1 Pro |
| vz | 5.14 | 6 | 24 | /Volumes/Lima | Crash + xfs | | | M1 Pro |
| vz | 5.14 | 6 | 24 | /tmp | OK | | | M1 Pro |
| qemu | 6.5.10-1 | 6 | 24 | /tmp | OK (but slow) | | | M1 Pro |
| vz | 6.5.10-1 | 6 | 24 | /Volumes/Lima | Crash + xfs | | | M1 Pro |
| vz | 6.5.10-1 | 6 | 24 | /tmp | Crash a | Crash b | | M1 Pro |
| vz | 6.6.1 | 6 | 24 | /tmp | xfs | | | M1 Pro |
| vz | 6.6.2-1 | 4 | 12 | /home/emoyse.linux | xfs | | | M1 Pro |
Notes:
- `xfs` means xfs corruption was reported.
- In the guest console and `/var/log/messages` I see:
[  978.306216] BUG: Bad rss-counter state mm:0000000076c5940f type:MM_FILEPAGES val:402
[  978.306776] BUG: Bad rss-counter state mm:0000000076c5940f type:MM_ANONPAGES val:206
[  978.307142] BUG: non-zero pgtables_bytes on freeing mm: 69632
[  +0.011695] BUG: Bad rss-counter state mm:0000000076c5940f type:MM_FILEPAGES val:402
Nov 7 16:44:19 lima-myalma92 kernel: BUG: workqueue lockup - pool cpus=5 node=0 flags=0x0 nice=0 stuck for 2164s!
Nov 7 16:44:19 lima-myalma92 kernel: Showing busy workqueues and worker pools:
Nov 7 16:44:19 lima-myalma92 kernel: workqueue events: flags=0x0
Nov 7 16:44:19 lima-myalma92 kernel: pwq 4: cpus=2 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
Nov 7 16:44:19 lima-myalma92 kernel: pending: drm_fb_helper_damage_work [drm_kms_helper]
Nov 7 16:44:19 lima-myalma92 kernel: workqueue mm_percpu_wq: flags=0x8
Nov 7 16:44:19 lima-myalma92 kernel: pwq 10: cpus=5 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
Nov 7 16:44:19 lima-myalma92 kernel: pending: vmstat_update
[emoyse@lima-alma9661c6 tmp]$ ls
bash: /usr/bin/ls: Input/output error
And in the display I see:
FWIW, I've added some test results and comments here: https://github.com/utmapp/UTM/issues/4840#issuecomment-1816886227
I've not ruled out that there is some issue with the macOS filesystem/hypervisor layer, but I've only seen corruption with a Linux VM, and not macOS or Windows doing the exact same thing, from the exact same VM disk backing. What is interesting to me is that if I take the exact same disk and reformat it as APFS instead of ExFAT, Linux 6.5.6 or 6.4.15 will not experience disk corruption. My theory is that given an unfortunate combination of speed/latency/something-else for disk backing, a Linux VM might experience disk corruption.
> My theory is that given an unfortunate combination of speed/latency/something-else for disk backing, a Linux VM might experience disk corruption.
Could you submit your insight to Apple? Probably via https://www.apple.com/feedback/macos.html
I have, just to hedge my bets. However, if Windows, macOS, and (as I just recently tested) FreeBSD all work flawlessly under the exact same workload, using the same host disk backing, and only Linux has a problem, I'd say that this is a Linux problem, not an Apple one.
> I can trigger filesystem corruption if my external disk is formatted with ExFAT

Oh, so that might be why it is mostly affecting external disks? Did people forget to (re-)format them before using?

EDIT: no, not so simple: "I create a separate APFS (Case-sensitive) Volume,"
> > I can trigger filesystem corruption if my external disk is formatted with ExFAT
>
> Oh, so that might be why it is mostly affecting external disks? Did people forget to (re-)format them before using?
>
> EDIT: no, not so simple: "I create a separate APFS (Case-sensitive) Volume,"
And for me, I'm not using external (to the VM) disks any more - if you look at the table I posted here, you will see that in the `Where` column I'm mostly using `/tmp`, i.e. working completely inside the VM. Using an external disk might provoke the corruption earlier, but it's certainly not the only route to it (though later kernels seem quite a bit more stable).
In my case it occurs with an internal disk, and very frequently on Fedora images. Just create a Fedora VM and do `dnf update`; corruption happens immediately (I detect it with `btrfs scrub start /`).
EDIT: vz in my case
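For reference, a sketch of the Btrfs scrub check mentioned above (standard btrfs-progs commands):

```sh
# Start a checksum scrub of the root filesystem, then inspect the results.
sudo btrfs scrub start /
sudo btrfs scrub status /   # non-zero csum/read error counts indicate corruption
```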
> Using an external disk might provoke the corruption earlier, but it's certainly not the only route to it (though later kernels seem quite a bit more stable).
I don't recall if I mentioned it here, but through eliminating variables I was able to pinpoint a configuration for a likely-to-corrupt-older-Linux-kernels situation, and that is having the VM hosted on an ExFAT-formatted partition (which just happens to be on an external disk for me). Based on how macOS/APFS works, I don't think it's even possible for me to test how ExFAT might perform on my internal disk. At least not without major reconfiguration of my system drive.
If others are able to reproduce the disk corruption without relying on ExFAT at the host level, that at least helps eliminate the ExFAT-layer possibility of where the problem lies. At least for me, I've been able to avoid the problem by reformatting my external disk to APFS, as that seems to tweak at least one of the required variables to see this bug happen. At least if the Linux kernel version is new enough.
At a conceptual level, it is indeed possible that Linux is doing nothing wrong at all. In other words, it could be that Linux just happens to be unlucky enough to express the disk usage patterns that can trigger a bug presenting as a corrupted (Btrfs, in my case) filesystem. But I suspect that being able to positively distinguish between a somewhat-unlikely-to-trigger Linux data corruption bug and a bug at the macOS hypervisor/storage level is probably beyond my skill set.
Ok, just to throw a wrench into the works, I did notice my FreeBSD VM eventually experiencing disk corruption, but only after about a day or so of running the stress test. As opposed to the minute or two that it takes for Linux to corrupt itself.
The same VM clone but running from an APFS filesystem seems fine:
So it seems like there are a lot of references to people mentioning issues related to external disks and non-APFS filesystems. I am using the internal disk on my M2 mini with the default APFS filesystem, and I've experienced disk corruption once. I haven't specifically been able to force it to reproduce, though to be honest I haven't tried very hard. I did want to point out that external disks and other filesystems may not be the specific cause; they may just make the bug easier to trigger than internal APFS does.
I run Debian Bookworm, and after repairing the filesystem with fsck I also upgraded my kernel from linux-image-cloud-arm64 6.1.55-1 to 6.5.3-1~bpo12+1 from backports.
The above table also lists corruption when running with qemu/hvf, so it might not even be unique to vz...
It is not unique to vz, and it is not unique to external disks.
With AlmaLinux 9.2 + kernel 6.6.2-1 I just got corruption with sudo yum update -y :-(
Okay, I updated the title and the original comment to hopefully clarify that this is a problem with every conceivable permutation of lima.
Unfortunately, Lima is completely unusable for me at the moment, so for now I'm giving up.
I can reproduce this with two methods: `stress-ng --iomix 4` (for filesystems with data checksums), and parallel `cp` of big files followed by `sha256sum *`. Details: https://github.com/utmapp/UTM/issues/4840#issuecomment-1821561359
Are you able to reproduce this as well?
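A hedged sketch of the second method (the file names and degree of parallelism are illustrative):

```sh
# Copy a big file several times in parallel, then checksum the copies;
# any digest that differs from the original's indicates silent corruption.
for i in $(seq 1 8); do
  cp bigfile "copy-$i" &
done
wait
sha256sum bigfile copy-*
```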
> Okay, I updated the title and the original comment to hopefully clarify that this is a problem with every conceivable permutation of lima.
It still seems to be unique to one operating system and one hardware architecture, though? Maybe even Apple's issue.
> > Okay, I updated the title and the original comment to hopefully clarify that this is a problem with every conceivable permutation of lima.
>
> It still seems to be unique to one operating system and one hardware architecture, though? Maybe even Apple's issue.
Sorry, yes. I was being very single-minded in my statement above! I will rephrase the title.
> The above table also lists corruption when running with qemu/hvf, so it might not even be unique to vz...
This issue might be worth reporting to https://gitlab.com/qemu-project/qemu/-/issues too, if the issue is reproducible with bare QEMU (without using Lima)
At the risk of further fragmentation of the discussion of this issue, but at the potential benefit of getting the right eyeballs, I've filed: https://gitlab.com/qemu-project/qemu/-/issues/1997
(i.e., yes this can be reproduced with QEMU, as opposed to the Apple Hypervisor Framework)
This may fix the issue for vz:
(Thanks to @wpiekutowski https://github.com/utmapp/UTM/issues/4840#issuecomment-1824340975 and @wdormann https://github.com/utmapp/UTM/issues/4840#issuecomment-1824542732)
Oh wow - I've run my test twice with the patched version of lima and no corruption or crashes! From reading the ticket, it's more a workaround than a complete fix, but I'll happily take it! Thanks @AkihiroSuda
Description
Lima version: 0.18.0
macOS: 14.0 (23A344)
VM: AlmaLinux 9
I was trying to do a big compile, using a VM with the attached configuration (vz)
The build aborted with:
And afterwards, even in a different terminal, I see:
I was also logged into a display, and there I saw e.g.
If I try to log in again with:
each time I see something like the following appear in the display window:
Edit: there has been a lot of discussion below, and the corruption can happen with both vz and qemu, and on external (to the VM) and internal disks. Some permutations seem more likely to provoke a corruption than others. I have reproduced my experiments in the table in a comment below.