elastio / elastio-snap

kernel module for taking block-level snapshots and incremental backups of Linux block devices
GNU General Public License v2.0
19 stars 7 forks source link

refcnt becomes -1 at the snapshot creation on XFS #163

Closed skypodolsky closed 2 years ago

skypodolsky commented 2 years ago

Briefly the case: a process called xfs_db (XFS debugging tool) obtained the original block device (dm-4) right before we switched the bdev fops (file operations). During this switch, we also change the ownership of our driver. This, in its turn, makes xfs_db execute blkdev_put() (with module_put() underneath) for elastio_snap (instead of the xfs driver), causing its refcnt to become -1. Here you may see the behavior (pay attention to the last line):

# tracer: function
#
# entries-in-buffer/entries-written: 32/32   #P:8
#
#                                _-----=> irqs-off
#                               / _----=> need-resched
#                              | / _---=> hardirq/softirq
#                              || / _--=> preempt-depth
#                              ||| / _-=> migrate-disable
#                              |||| /     delay
#           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
#              | |         |   |||||     |         |
         elioctl-1281    [007] .....   107.640097: blkdev_get_by_dev <-blkdev_get_by_path
         elioctl-1281    [007] .....   107.640110: blkdev_put <-__ioctl_setup
         elioctl-1281    [007] .....   107.640119: blkdev_get_by_dev <-blkdev_get_by_path
         elioctl-1281    [007] .....   107.641435: __tracer_transition_tracing <-__tracer_setup_tracing
         elioctl-1281    [007] .....   107.641444: freeze_bdev <-__tracer_transition_tracing
   systemd-udevd-1256    [002] .....   107.641554: blkdev_get_by_dev <-blkdev_open
         losetup-1279    [003] .....   107.646149: blkdev_get_by_dev <-blkdev_open
         losetup-1279    [003] .....   107.646177: blkdev_put <-blkdev_close
         losetup-1279    [003] .....   107.646227: blkdev_get_by_dev <-blkdev_open
         losetup-1279    [003] .....   107.646248: blkdev_put <-blkdev_close
         losetup-1279    [003] .....   107.646297: blkdev_get_by_dev <-blkdev_open
         losetup-1279    [003] .....   107.646317: blkdev_put <-blkdev_close
    probe-bcache-1284    [007] .....   107.646348: blkdev_get_by_dev <-blkdev_open
         losetup-1279    [003] .....   107.646365: blkdev_get_by_dev <-blkdev_open
         losetup-1279    [003] .....   107.646386: blkdev_put <-blkdev_close
         losetup-1279    [003] .....   107.646434: blkdev_get_by_dev <-blkdev_open
         losetup-1279    [003] .....   107.646454: blkdev_put <-blkdev_close
    probe-bcache-1284    [007] .....   107.649390: blkdev_put <-blkdev_close
         losetup-1279    [003] .....   107.678137: blkdev_get_by_dev <-blkdev_open
         losetup-1279    [003] .....   107.678165: blkdev_put <-blkdev_close
         losetup-1279    [003] .....   107.678216: blkdev_get_by_dev <-blkdev_open
         losetup-1279    [003] .....   107.678236: blkdev_put <-blkdev_close
         losetup-1279    [003] .....   107.678287: blkdev_get_by_dev <-blkdev_open
         losetup-1279    [003] .....   107.678315: blkdev_put <-blkdev_close
         losetup-1279    [003] .....   107.678369: blkdev_get_by_dev <-blkdev_open
         losetup-1279    [003] .....   107.678384: blkdev_put <-blkdev_close
          xfs_db-1296    [002] .....   107.695945: blkdev_get_by_dev <-blkdev_open
          xfs_db-1296    [002] .....   107.695961: blkdev_get_by_dev <-blkdev_open
          xfs_db-1296    [002] .....   107.696084: blkdev_put <-blkdev_close
   systemd-udevd-1256    [003] .....   107.719393: blkdev_put <-blkdev_close
         elioctl-1281    [000] .....   107.750171: thaw_bdev <-__tracer_transition_tracing
          xfs_db-1296    [002] .....   107.750307: blkdev_put <-blkdev_close

We can also look here real quick:

          insmod-11822   [004] ....   954.508031: module_load: elastio_snap OE
          insmod-11822   [004] ....   954.508183: module_put: elastio_snap call_site=do_init_module refcnt=1
           mount-11834   [007] ....   958.138380: module_get: xfs call_site=__get_fs_type refcnt=2
           mount-11834   [007] ....   958.138402: module_get: xfs call_site=get_filesystem refcnt=3
           mount-11834   [007] ....   958.138406: module_put: xfs call_site=put_filesystem refcnt=2
           mount-11834   [007] ....   958.138439: module_get: xfs call_site=get_filesystem refcnt=3
           mount-11834   [004] ....   958.141088: module_put: xfs call_site=put_filesystem refcnt=2
         elioctl-11846   [006] ....   958.148885: module_get: elastio_snap call_site=misc_open refcnt=2
   systemd-udevd-11824   [006] ....   958.154247: module_get: elastio_snap call_site=blkdev_get_no_open refcnt=3
    probe-bcache-11850   [002] ....   958.156304: module_get: elastio_snap call_site=blkdev_get_no_open refcnt=4
    probe-bcache-11850   [002] ....   958.157496: module_put: elastio_snap call_site=blkdev_put_no_open refcnt=3
   systemd-udevd-11824   [006] ....   958.189825: module_put: elastio_snap call_site=blkdev_put_no_open refcnt=2
         elioctl-11846   [000] ....   958.287771: module_put: elastio_snap call_site=__fput refcnt=1
           <...>-11872   [001] ....   958.287802: module_put: elastio_snap call_site=blkdev_put_no_open refcnt=0
          umount-11881   [000] ....   958.327233: module_put: xfs call_site=put_filesystem refcnt=1

You can see the line where the refcnt becomes 0. This is the root cause of the issue: if you count module_get() and module_put() functions for the elastio_snap driver, you will find 3 and 4 functions respectively.

====== SOLUTION =======

Due to the specifics of the product and the XFS filesystem architecture, we cannot guarantee that no entity will be holding the block device when we change the fops. This, in its turn, implies that we should not change the owner of the block device, leaving it to the parent driver.

Resolves #149