Open spamwax opened 5 years ago
Can you please turn on keepsyms so we can see what is going on? If you want to boot without ZFS, just use https://openzfsonosx.org/wiki/Boot_loop
1.8.2 will announce as 1.8.1.
I'm not familiar with how to do keepsyms! Do I need to build zfs from source, or is this something related to macOS?
If there is a guide on how to do it, can you please link it?
I boot into safe/single user mode and use scripts provided by installer to fully remove zfs so I can boot into my Mac.
Ah sorry, run sudo nvram boot-args, then add keepsyms=1 to it, like sudo nvram boot-args="keepsyms=1".
It is specified more clearly at https://openzfsonosx.org/wiki/Install#Initial_installation_from_source but you don't need "-v" unless you like it printing text while booting (like real hackers!). Just check what boot-args is set to first, so you don't lose any setting you may already have (although clean Macs will have no boot-args set).
You do not need to compile, nor disable SIP. keepsyms just means it puts the function names in the panic report, rather than just the addresses like your report has now.
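A minimal sketch of the "check what it is set to first" advice: a helper that appends keepsyms=1 to whatever boot-args already holds instead of overwriting it. The function name merge_boot_args is made up for illustration, and the commented usage assumes that nvram prints the variable name, a tab, then the value.

```shell
# Return a boot-args string that contains keepsyms=1, keeping any
# arguments that were already set. (Hypothetical helper, not part of O3X.)
merge_boot_args() {
    existing=$1
    case " $existing " in
        *" keepsyms=1 "*) printf '%s\n' "$existing" ;;            # already present
        "  ")             printf 'keepsyms=1\n' ;;                # nothing set yet
        *)                printf '%s keepsyms=1\n' "$existing" ;; # append
    esac
}

# Usage on the Mac (assumes tab-separated `nvram boot-args` output):
#   current=$(nvram boot-args 2>/dev/null | cut -f2-)
#   sudo nvram boot-args="$(merge_boot_args "$current")"
```

On a clean Mac with no boot-args this yields just keepsyms=1, which matches the command given above.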
ok, I'll try this and report. I am using Clover for multibooting, so I am guessing I can add the boot arguments in Clover's starting screen.
side question:
Since I am running a hackintosh, do you know if running sudo nvram boot-args="keepsyms=1" will have a side effect on the system?
Ah hmm, I have no idea about hackintosh. But I think you can add keepsyms=1 to the clover boot arguments
@lundman Ok, I finally managed to get the symbols in crash reports: reboot 1 & reboot 2
Again, these happened after the system successfully boots and I can log in, but after a few seconds the reboot happens. At the time of these panics only 1 pool was connected.
Do you want me to try and run the system without any pool connected?
Looks like rottegift had a similar problem at one point: https://github.com/openzfsonosx/zfs/issues/521 although I do not think you are trying to remove the log device.
hmmm... given the timeline of that issue I am guessing a fix is not available yet :)
If there is anything I can do to help with replication/logs please let me know. Otherwise I will go ahead and do a fresh install of entire macOS & see if that fixes my issue as I really need to have access to that pool on my Mac
The 6922 commit was reverted, so it is most likely not what the issue is here. It is not happy with the pool though. Have you tried the usual troubleshooting?
"import -N" to stop it from mounting, then mount datasets one by one.
"import -o readonly=on" to see if you can use it readonly.
"import -T txg-1": use zdb to find the last txg, then try to import the pool one txg earlier.
Fresh macOS will not help with the issue of the pool.
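The three import attempts listed above could be written out as follows. This is only a sketch: import_attempts is a made-up helper that prints the commands for review rather than running them, the pool name is whatever your pool is called, and adding readonly=on to the -T attempt is a precaution chosen here, not something the list above requires.

```shell
# Print, in order, the import attempts to try by hand.
# Nothing here touches a pool.
# $1 = pool name, $2 = last txg as reported by zdb
import_attempts() {
    pool=$1
    last_txg=$2
    # 1. Import without mounting any datasets (mount them one by one later
    #    with `zfs mount <dataset>`).
    echo "zpool import -N $pool"
    # 2. Import read-only.
    echo "zpool import -o readonly=on $pool"
    # 3. Import one txg earlier than the last one.
    echo "zpool import -o readonly=on -T $((last_txg - 1)) $pool"
}
```

Running import_attempts with a pool name and a txg prints three lines that can be pasted one at a time, each prefixed with sudo.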
ok, let me try these.
The reason I thought fresh install will help is because everything was just working and I hadn't run a scrub or anything like that on pools. Just installed that security update from Apple and everything went south!
Do I need to run import -N after installing o3x, or should I try to run it as soon as I log in, before the reboot happens?
I did have a problem with that security update myself, and possibly it crashed and your pool is now in a bad state.
After installing ZFS, you should disable the automatic import on boot, which is a launchctl script that runs /usr/local/libexec/zfs/launchd.d/zpool-import-all.sh
So either unload the launchctl plist, or rename the zpool-import-all.sh script out of the way, that stops it from automatically importing it.
Then you can try various things - and yes, before reboot is ok after install.
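A rough sketch of the "rename the script out of the way" option. The default path is the one quoted above; the two function names are invented for this example, and both need to run as root on the Mac. The launchctl route is not shown because the plist name is not given in this thread.

```shell
# Move the auto-import script aside so launchd has nothing to run at boot.
# $1 = path to the script (defaults to the path quoted in this thread)
disable_auto_import() {
    script=${1:-/usr/local/libexec/zfs/launchd.d/zpool-import-all.sh}
    if [ -f "$script" ]; then
        mv "$script" "$script.disabled" && echo "disabled: $script"
    else
        echo "not found: $script" >&2
        return 1
    fi
}

# Undo the rename once the pool imports cleanly again.
enable_auto_import() {
    script=${1:-/usr/local/libexec/zfs/launchd.d/zpool-import-all.sh}
    [ -f "$script.disabled" ] && mv "$script.disabled" "$script"
}
```

Renaming rather than deleting keeps the original script around, so automatic import can be restored later without reinstalling.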
I disabled the launchctl script and tried all of the suggested troubleshooting steps.
Both import -N and import -o readonly=on caused a panic after about 30 seconds, while the command (sudo zpool import -N pool) hadn't yet finished and returned.
I then got the txg number by running sudo zdb -l /dev/rdisk0s1 and used it in import -T, and the same thing happened.
Should I try to import with lower txg numbers, say, txg-2?
How did you manage the issue with that security update?
Yes, you can try TXG-2 and maybe go as far back as -10. But use it with readonly so you don't make changes to the pool while trying to roll back.
With the security update, I had to use an APFS snapshot to roll back to before the update, then do the update again. The pool was just a test pool, so I had no issues destroying it.
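The txg rollback described above (txg-1 down to about txg-10, always read-only) can be sketched like this. Both helper names are made up, and label_txg assumes the zdb -l output contains a line of the form "txg: NUMBER".

```shell
# Read `zdb -l` output on stdin and print the first "txg:" value found.
# Lines such as "create_txg:" are ignored because the field must match exactly.
label_txg() {
    awk '$1 == "txg:" { print $2; exit }'
}

# Print the txgs to try, newest first: txg-1 down to txg-depth.
# $1 = last txg, $2 = how far back to go (default 10)
rollback_candidates() {
    txg=$1
    depth=${2:-10}
    i=1
    while [ "$i" -le "$depth" ]; do
        echo $((txg - i))
        i=$((i + 1))
    done
}

# Usage on the Mac ("pool" stands for your pool name):
#   txg=$(sudo zdb -l /dev/rdisk0s1 | label_txg)
#   for t in $(rollback_candidates "$txg" 10); do
#       sudo zpool import -o readonly=on -T "$t" pool && break
#   done
```

The loop stops at the first txg that imports, so the pool is rolled back no further than necessary.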
going down to -10 didn't work.
I can attach my pool to a Linux machine; is there anything I should know in order to try & fix the pool on the new machine?
No? Don't run "zpool upgrade" on Linux or you can't import on OSX again.
@lundman So I could successfully mount the pool on Linux in readonly mode.
However, when I do zdb -l /dev/sdb, I get failed to unpack label 0 through label 3.
Do I need to do anything in Linux before I try to attach the pool back to macOS?
Linux is Manjaro with zfs package version 0.7.13-1.
zpool status shows no error and I can browse the pool in readonly mode.
that is good news - so at the very least you can get data off it in an emergency. It could be that if you import it fully and export, it will write the labels properly, and work again on osx?
ok, so I just did the export/import in non-readonly mode in Linux.
zdb -c didn't return any glaring issue.
However, the txg numbers I am getting in Linux are different from the ones on the Mac: 5987721 vs 5987652.
I am guessing it has to do with importing/exporting the pool in Linux, maybe?
Should I apply Apple's security patch before trying to use the pool on the Mac?
The txg will tick upwards when it's imported. A txg sync happens when enough data or changes have accumulated, or enough seconds have passed, since the last txg; then it rolls over to the next one.
Security fix before or not is up to you, but I would export all pools before starting it.
ok, thanks for all your help. I'll try the mac & report back.
Just curious why macOS was unable to import the pool! Is it OS related, or a zfs difference between those platforms?
We are struggling to catch up with ZOL's high rate of commits; perhaps there is something in vdev coming up.
hmmm, I wish I was versed enough in magics of zfs to help :)
@lundman Unfortunately the same panic happens after I moved the pool from Linux.
While the problematic pool is attached to macOS, quickly creating a pool on a USB & trying to import/export that specific pool takes a really long time!
Since I can attach this to Linux, what's the best way of cloning the pool before I destroy it?
Should I just do the typical zfs send -> zpool destroy -> zpool create -> zfs receive?
Making a recursive snapshot (-r) and then using zfs send with the -R option (I would suggest putting -ceL in the mix as well to speed up the process) would be a good way, yes.
"Should I just do the typical zfs send -> zpool destroy -> zpool create -> zfs receive?"
This one I didn't understand. I'd create the new pool, zfs send | zfs recv from the old pool into the new pool, and then only after it successfully sent the pool's data would I destroy the old pool.
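The order of operations suggested above might look like the sketch below. replication_plan is a made-up helper that only prints the commands so they can be checked before anything is run; the pool and snapshot names are placeholders, and the -F on the receiving side is an assumption, not something stated in this thread.

```shell
# Print the replication commands for review; nothing is executed.
# $1 = source pool, $2 = destination pool, $3 = snapshot name
replication_plan() {
    src=$1
    dst=$2
    snap=$3
    # Recursive snapshot of every dataset in the source pool.
    echo "zfs snapshot -r $src@$snap"
    # Replication stream (-R) with the suggested -ceL flags, piped into the new pool.
    echo "zfs send -R -ceL $src@$snap | zfs receive -F $dst"
    # Last step, and only once the receive has finished without error.
    echo "zpool destroy $src"
}
```

Keeping the destroy as the final line reflects the point made above: the old pool goes away only after the new one holds the data.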
Thanks for the hints, I didn't know about -ceL.
Since I didn't have access to a large enough pool, I had to dump the zfs send stream to a file, destroy/re-create the original pool and then zfs receive from the dumped file.
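For reference, the dump-to-file variant described above could be sketched as follows. dump_plan is an invented helper that prints the steps instead of running them; the file path and snapshot name are placeholders, the zpool create arguments are left out because the vdev layout is not given here, and the -F on receive is an assumption.

```shell
# Print the dump-and-restore steps for review; nothing is executed.
# $1 = pool name, $2 = snapshot name, $3 = file that will hold the stream
dump_plan() {
    pool=$1
    snap=$2
    file=$3
    echo "zfs snapshot -r $pool@$snap"
    # Write the stream to a file and confirm the file is not empty.
    echo "zfs send -R -ceL $pool@$snap > $file && test -s $file"
    # The stream file is now the only copy, so keep it safe.
    echo "zpool destroy $pool"
    echo "# zpool create $pool <your vdev layout>"
    echo "zfs receive -F $pool < $file"
}
```

Unlike the pool-to-pool route, the original pool is destroyed before the data has been received anywhere, so it is worth checking the stream file carefully first.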
I can now import the pool in macOS but only in readonly mode:
This pool uses the following feature(s) not supported by this system:
org.zfsonlinux:userobj_accounting
All unsupported features are only required for writing to the pool.
The pool can be imported using '-o readonly=on'.
cannot import 'Virtuals': unsupported version or feature
I am guessing this happened since the pool was created on Linux. I am gonna try creating the pool on macOS and then doing a zfs recv over the network from the dumped file.
Yeah, you can also create it on Linux with "-g" ("zpool create -g"), which disables all features, then add back the features that work on both platforms (so everything except userobj_accounting).
I couldn't find a -g option in the man pages for zpool, did you mean -d?
How do I get a comprehensive list of features? I don't want to copy/paste names from man zpool-features :)
Hah yes, I did mean "-d". You can create a pool, then run "zpool upgrade -v" to list which features are supported.
ok, thanks. So I only need to re-enable the new features from the output of the upgrade -v command & not the (longer) list of "legacy versions"?
The numbers? No, the version is 5000 now and there is no reason to create an ancient pool.
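One way to avoid copying feature names by hand, as a sketch only: create the pool with -d, then turn the feature list printed by zpool upgrade -v into zpool set commands, skipping userobj_accounting. Both helper names are invented, and the awk filter assumes the listing puts each feature name at the start of a line between the "FEAT DESCRIPTION" header and the legacy versions section.

```shell
# Read `zpool upgrade -v` output on stdin and print one feature name per line.
feature_names() {
    awk '
        /^FEAT /          { in_features = 1; next }  # header of the feature table
        /legacy versions/ { in_features = 0 }        # legacy list starts, stop here
        in_features && /^[a-z]/ { print $1 }         # names start in column one
    '
}

# Read feature names on stdin and print the commands that enable them.
# $1 = pool name, $2 = feature to leave disabled
enable_feature_cmds() {
    pool=$1
    skip=$2
    while read -r name; do
        [ "$name" = "$skip" ] && continue
        echo "zpool set feature@$name=enabled $pool"
    done
}

# Usage on Linux ("newpool" and the vdev are placeholders):
#   sudo zpool create -d newpool <vdev>
#   zpool upgrade -v | feature_names | enable_feature_cmds newpool userobj_accounting
```

The printed commands can be reviewed and then run with sudo, which leaves the pool with every feature enabled except the one macOS could not handle for writing.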
Closing the issue since I could import the Linux-created pool back in macOS using above suggestions. Thanks again.
@lundman Unfortunately, enabling automatic import of zpool at startup caused the same corruption & kernel panic. I should mention that before enabling automatic import, I could reliably import/export my pools.
Can I have a panic dump with symbols, in case there are some new hints?
That is a little different, but most likely related. So you created a new pool with ZOL, and it was working, but will now panic on import?
Can you share "diskutil list" output? I wonder if it is related to another issue where USB stick is identified as FAT and Apple pukes on it.
This is the output of diskutil
After the original panics, I created a zpool in Linux and used the backup I had to restore the data. Then I moved the pool to macOS where auto import of pools on startup was disabled.
I could use the pool by manually importing/exporting it.
Then I enabled the auto import & restarted the machine, the panic happened.
I was using 1.7.2 and everything was working just fine. Then I used App Store to apply Apple's security update (2019-01) and the panic started to happen.
On first restart, I got the black screen message saying that computer had to be rebooted due to an error, after which I can get to login greeting. However after a short period of time (15-45s) the reboot happens and I get the black screen message again.
So I decided to upgrade to 1.8.1 and still the same issue! The issue persisted even when I turned off all HDDs in the pools.
I used my TimeMachine backup to restore the system to the state before Apple's security update hoping it will fix the issue, which it didn't :( So at this point I'm stuck!
The kernel panic log as a gist is here.
Not sure if this is relevant, but I installed 1.8.2; however, the above log shows some 1.8.1 versions for net.lundman.spl and net.lundman.zfs
System info:
Model Name: iMac
Model Identifier: iMac14,2
Processor Speed: 3.70 GHz
Number of Processors: 1
Total Number of Cores: 6
L2 Cache (per Core): 256 KB
L3 Cache: 12 MB
Memory: 32 GB
System Version: macOS 10.13.6 (17G65)
Kernel Version: Darwin 17.7.0
Boot Volume: macOS
Boot Mode: Normal
Secure Virtual Memory: Enabled
System Integrity Protection: Enabled
Time since boot: 19 minutes