The case in brief: a process called xfs_db (the XFS debugging tool) obtained the original block device (dm-4) right before we switched the bdev fops (file operations). During this switch, we also change the owner of the fops to our driver. As a result, when xfs_db later releases the device, it executes blkdev_put() (with module_put() underneath) against elastio_snap (instead of the xfs driver), causing elastio_snap's refcnt to become -1. Here you may see the behavior (pay attention to the last line):
You can see the line where the refcnt becomes 0. This is the root cause of the issue: if you count the module_get() and module_put() calls for the elastio_snap driver, you will find 3 and 4 respectively, so there is one put without a matching get.
====== SOLUTION =======
Due to the specifics of the product and the XFS filesystem architecture, we cannot guarantee that no entity will be holding the block device when we change the fops. Therefore we should not change the owner of the block device; it stays with the parent driver, so every put goes to the same module that received the matching get.
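
To make the imbalance easier to follow, here is a minimal userspace sketch of the race. It is a toy model and not the driver code: the `toy_*` types and functions are hypothetical stand-ins for the kernel's module refcount and for blkdev_get()/blkdev_put(), and the counters start at 0 with a single holder instead of the 3 and 4 calls seen in the real trace. It only assumes what is described above: the get and the put each go to whichever module owns the fops at that moment.

```c
#include <stddef.h>

/* Toy stand-in for a kernel module: only the user count matters here. */
struct toy_module {
    const char *name;
    int refcnt;
};

/* Toy stand-in for the bdev fops: only the owner pointer is modelled. */
struct toy_fops {
    struct toy_module *owner;
};

/* Toy stand-in for the block device (dm-4 in the description above). */
struct toy_bdev {
    const struct toy_fops *fops;
};

/* Model of opening the device: pin the module that owns the fops right now. */
static void toy_blkdev_get(struct toy_bdev *bdev)
{
    bdev->fops->owner->refcnt++;
}

/* Model of blkdev_put(): unpin the module that owns the fops right now. */
static void toy_blkdev_put(struct toy_bdev *bdev)
{
    bdev->fops->owner->refcnt--;
}

/*
 * Replays the sequence: open, fops switch, close.
 * change_owner != 0 models the old behavior (new fops owned by elastio_snap),
 * change_owner == 0 models the fix (owner stays with the parent driver).
 * Returns the final refcnt of the snapshot module and stores the parent's
 * final refcnt in *parent_refcnt_out when that pointer is not NULL.
 */
int toy_race(int change_owner, int *parent_refcnt_out)
{
    struct toy_module parent = { "parent", 0 };
    struct toy_module snap = { "elastio_snap", 0 };
    struct toy_fops orig_fops = { &parent };
    struct toy_fops new_fops = { change_owner ? &snap : &parent };
    struct toy_bdev bdev = { &orig_fops };

    toy_blkdev_get(&bdev);  /* xfs_db opens the device before the switch */
    bdev.fops = &new_fops;  /* the driver swaps the fops */
    toy_blkdev_put(&bdev);  /* xfs_db closes the device after the switch */

    if (parent_refcnt_out != NULL)
        *parent_refcnt_out = parent.refcnt;
    return snap.refcnt;
}
```

In this model, toy_race(1, &p) leaves the snapshot module at -1 and the parent with a leaked reference (p is 1), while toy_race(0, &p) leaves both counters at 0 because the get and the put hit the same owner.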
Resolves #149