storaged-project / blivet

A python module for configuration of block devices
GNU Lesser General Public License v2.1

ERROR:blivet:failed to find parent for subvol #1201

Closed. keestux closed this issue 6 months ago

keestux commented 6 months ago

The problem I'm describing here is triggered by the Fedora 39 installer. It is also reported in Red Hat Bugzilla 2210933, where @vojtechtrefny suggested:

Moving to blivet for further investigation. Anaconda might not support working with btrfs snapshots in the UI, but it definitely shouldn't crash. The crash itself comes from blivet, so we need to fix it there. We have some long standing issues to make blivet more stable in situation where we encounter unknown/unsupported setups and this is a nice example of situation where we should ignore the "missing" parent device instead of crashing like this.

I couldn't find it reported as an issue here, so I've created this one.

I wanted to try a Fedora installation on a system that already has two BTRFS file systems. The error I'm seeing is:

Feb 16 14:39:45 localhost-live org.fedoraproject.Anaconda.Modules.Storage[3413]: ERROR:blivet:failed to find parent (660) for subvol @backup-cds/backup-tardis/.snapshots
Feb 16 14:39:45 localhost-live org.fedoraproject.Anaconda.Modules.Storage[3413]: INFO:anaconda.core.threads:Thread Failed: AnaTaskThread-ScanDevicesTask-1 (140710692480704)
Feb 16 14:39:45 localhost-live org.fedoraproject.Anaconda.Modules.Storage[3413]: ERROR:anaconda.modules.common.task.task:Thread AnaTaskThread-ScanDevicesTask-1 has failed: Traceback (most recent call last):
Feb 16 14:39:45 localhost-live org.fedoraproject.Anaconda.Modules.Storage[3413]:   File "/usr/lib64/python3.12/site-packages/pyanaconda/core/threads.py", line 280, in run
Feb 16 14:39:45 localhost-live org.fedoraproject.Anaconda.Modules.Storage[3413]:     threading.Thread.run(self)
...
Feb 16 14:39:45 localhost-live org.fedoraproject.Anaconda.Modules.Storage[3413]:   File "/usr/lib/python3.12/site-packages/blivet/populator/populator.py", line 337, in handle_format
Feb 16 14:39:45 localhost-live org.fedoraproject.Anaconda.Modules.Storage[3413]:     helper_class(self, info, device).run()
Feb 16 14:39:45 localhost-live org.fedoraproject.Anaconda.Modules.Storage[3413]:   File "/usr/lib/python3.12/site-packages/blivet/populator/helpers/btrfs.py", line 86, in run
Feb 16 14:39:45 localhost-live org.fedoraproject.Anaconda.Modules.Storage[3413]:     raise DeviceTreeError("could not find parent for subvol")
Feb 16 14:39:45 localhost-live org.fedoraproject.Anaconda.Modules.Storage[3413]: blivet.errors.DeviceTreeError: could not find parent for subvol
Feb 16 14:39:45 localhost-live org.fedoraproject.Anaconda.Modules.Storage[3413]: INFO:anaconda.core.threads:Thread Done: AnaTaskThread-ScanDevicesTask-1 (140710692480704)
Feb 16 14:39:45 localhost-live org.fedoraproject.Anaconda.Modules.Storage[3413]: WARNING:dasbus.server.handler:The call org.fedoraproject.Anaconda.Task.Finish has failed with an exception:
Feb 16 14:39:45 localhost-live org.fedoraproject.Anaconda.Modules.Storage[3413]: Traceback (most recent call last):
Feb 16 14:39:45 localhost-live org.fedoraproject.Anaconda.Modules.Storage[3413]:   File "/usr/lib/python3.12/site-packages/dasbus/server/handler.py", line 455, in _method_callback
Feb 16 14:39:45 localhost-live org.fedoraproject.Anaconda.Modules.Storage[3413]:     result = self._handle_call(
Feb 16 14:39:45 localhost-live org.fedoraproject.Anaconda.Modules.Storage[3413]:              ^^^^^^^^^^^^^^^^^^
...
Feb 16 14:39:45 localhost-live org.fedoraproject.Anaconda.Modules.Storage[3413]: blivet.errors.DeviceTreeError: could not find parent for subvol
Feb 16 14:39:45 localhost-live anaconda[3370]: anaconda: core.threads: Thread Failed: AnaStorageThread (139699100317376)
Feb 16 14:39:45 localhost-live anaconda[3370]: anaconda: exception: running handleException
Feb 16 14:39:45 localhost-live anaconda[3370]: anaconda: exception: Traceback (most recent call last):

                                                 File "/usr/lib64/python3.12/site-packages/pyanaconda/core/threads.py", line 280, in run
                                                   threading.Thread.run(self)

                                                 File "/usr/lib64/python3.12/threading.py", line 989, in run
                                                   self._target(*self._args, **self._kwargs)

                                                 File "/usr/lib64/python3.12/site-packages/pyanaconda/ui/lib/storage.py", line 97, in reset_storage
                                                   sync_run_task(task_proxy)

                                                 File "/usr/lib64/python3.12/site-packages/pyanaconda/modules/common/task/__init__.py", line 46, in sync_run_task
                                                   task_proxy.Finish()

                                                 File "/usr/lib/python3.12/site-packages/dasbus/client/handler.py", line 450, in _call_method
                                                   return self._get_method_reply(
                                                          ^^^^^^^^^^^^^^^^^^^^^^^

                                                 File "/usr/lib/python3.12/site-packages/dasbus/client/handler.py", line 483, in _get_method_reply
                                                   return self._handle_method_error(error)
                                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

                                                 File "/usr/lib/python3.12/site-packages/dasbus/client/handler.py", line 509, in _handle_method_error
                                                   raise exception from None

                                               pyanaconda.modules.common.errors.general.AnacondaError: could not find parent for subvol

The thing is that the parent isn't missing at all. The BTRFS file system does not have any errors, and the parent ID (660) is present in the subvolume list:

root@racer:~# btrfs sub list /btrfsroot2
ID 258 gen 7219 top level 5 path @pool2
ID 375 gen 7219 top level 5 path @virt
...
ID 658 gen 6940 top level 654 path @backup-cds/tasking-mail
ID 659 gen 6957 top level 654 path @backup-cds/backup-flappy
ID 660 gen 7321 top level 654 path @backup-cds/backup-tardis
ID 661 gen 7320 top level 660 path @backup-cds/backup-tardis/.snapshots
...
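
A check like this can also be scripted. The following is a minimal sketch, not anything blivet does itself: it assumes the `btrfs sub list` line format shown above, and that "top level" is the parent subvolume ID. The helper names `parse_subvolumes` and `missing_parents` are made up for illustration.

```python
import re

# Matches one line of `btrfs subvolume list` output, for example:
# "ID 661 gen 7320 top level 660 path @backup-cds/backup-tardis/.snapshots"
LINE_RE = re.compile(r"^ID (\d+) gen (\d+) top level (\d+) path (.+)$")

# ID of the top-level subvolume; it never appears as a row in the listing.
FS_TREE_ID = 5


def parse_subvolumes(output):
    """Return {subvol_id: (parent_id, path)} for every line that parses."""
    subvols = {}
    for line in output.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            subvol_id, _gen, parent_id, path = match.groups()
            subvols[int(subvol_id)] = (int(parent_id), path)
    return subvols


def missing_parents(subvols):
    """Return [(subvol_id, parent_id, path)] whose parent is not listed."""
    return [
        (subvol_id, parent_id, path)
        for subvol_id, (parent_id, path) in sorted(subvols.items())
        if parent_id != FS_TREE_ID and parent_id not in subvols
    ]
```

Feeding it the complete output of `btrfs sub list /btrfsroot2` should give an empty list from `missing_parents` if every parent is present. On a truncated excerpt it will flag any subvolume whose parent row was cut off.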

keestux commented 6 months ago

@vojtechtrefny would it be possible to write up a few blivet steps that simulate what anaconda is doing? So far I haven't been able to figure this out from reading the blivet and anaconda code. One simple thing I tried:

import blivet
dt = blivet.devicetree.DeviceTree()
dt.populate()
sda2 = dt.get_device_by_name("sda2")
BTRFS = blivet.formats.get_format('btrfs', exists=True)
btrfsroot = blivet.devices.btrfs.BTRFSDevice(parents=sda2, exists=True, fmt=BTRFS, uuid='17988bd4-4ad9-49c9-9cc7-73110966d8ab')
vol = blivet.devices.btrfs.BTRFSVolumeDevice(parents=btrfsroot, exists=True, fmt=BTRFS, uuid='17988bd4-4ad9-49c9-9cc7-73110966d8ab')

But this doesn't do exactly what anaconda is doing. Anaconda looks at all subvolumes. How does it do that? I think that is where it goes wrong: anaconda seems to create the BTRFSSubVolumeDevice instances in the wrong order, hitting a child subvol before it has seen the parent.
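
To make the ordering idea concrete, here is a small sketch of processing subvolumes parents-first. It is purely illustrative and is not taken from blivet or anaconda: the function name `parents_first` and the sample IDs are hypothetical, and the input is assumed to be a plain mapping of subvolume ID to parent ID.

```python
from collections import deque


def parents_first(subvols, root_id=5):
    """Order subvolume IDs so that each parent precedes its children.

    `subvols` maps subvol_id -> parent_id. Subvolumes that cannot be
    reached from `root_id` are returned separately as orphans instead
    of raising an error.
    """
    # Group children under their parent ID.
    children = {}
    for subvol_id, parent_id in subvols.items():
        children.setdefault(parent_id, []).append(subvol_id)

    ordered = []
    seen = {root_id}
    queue = deque([root_id])
    # Breadth-first walk from the top-level subvolume.
    while queue:
        current = queue.popleft()
        for child in sorted(children.get(current, [])):
            if child not in seen:
                seen.add(child)
                ordered.append(child)
                queue.append(child)

    orphans = sorted(set(subvols) - set(ordered))
    return ordered, orphans


# Hypothetical IDs: 10 -> 20 -> 30 form a chain, 40 has an unknown parent.
ordered, orphans = parents_first({30: 20, 20: 10, 10: 5, 40: 99})
```

With an ordering like this, a child is never visited before its parent, and a subvolume with a genuinely unknown parent ends up in `orphans` where it could be skipped rather than aborting the whole scan.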

vojtechtrefny commented 6 months ago

import blivet
blivet.flags.flags.auto_dev_updates = True
b = blivet.Blivet()
b.reset()

Setting auto_dev_updates to True is the key here. By default blivet doesn't mount btrfs volumes to get the list of subvolumes (we don't want to mount random devices when running outside the installer environment). With the flag set you should see all the subvolumes in the devicetree (or, in your case, reset will probably crash). You can then use print(b.devicetree) for a simple visualization of the device tree.

keestux commented 6 months ago

Thanks for the tip. Unfortunately (if you can call it that) the problem has gone away. I'll close this issue for now and reopen it when it happens again.