ublk-org / ublksrv

ublk: userspace block device driver
MIT License
147 stars 50 forks source link

tgt_loop: hardcoded 512 logical block size causes zram unaligned I/O failures #14

Closed ddiss closed 2 years ago

ddiss commented 2 years ago

Linux zram devices advertise a 4k logical block size by default:

# modprobe zram num_devices="1"
# echo "1G" > /sys/block/zram0/disksize
# cat /sys/devices/virtual/block/zram0/queue/logical_block_size 
4096

When using the tgt_loop backend to expose a 4k zram device via ublk, the logical block size is advertised as 512 bytes:

# ublk add -t loop -f /dev/zram0
# cat /sys/devices/virtual/block/ublkb0/queue/logical_block_size 
512

This results in 512 aligned I/O attempts from the block layer, instead of the 4k aligned request required by zram, causing I/O failures:

rapido1:/# mkdir /mnt && mkfs.xfs /dev/ublkb0 
meta-data=/dev/ublkb0            isize=512    agcount=4, agsize=65536 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=0    bigtime=0 inobtcount=0
data     =                       bsize=4096   blocks=262144, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
Discarding blocks...Done.
rapido1:/# mount /dev/ublkb0 /mnt/
[   62.907137] I/O error, dev ublkb0, sector 0 op 0x0:(READ) flags 0x1000 phys_seg 1 prio class 2
[   62.908494] XFS (ublkb0): SB validate failed with error -5.
mount: /mnt: can't read superblock on /dev/ublkb0.

Changing the hardcoded struct ublk_params.basic.logical_bs_shift values to advertise 4k alignment avoids this error, e.g.:

--- a/tgt_loop.cpp
+++ b/tgt_loop.cpp
@@ -55,15 +55,15 @@ static int loop_init_tgt(struct ublksrv_dev *dev, int type, int argc, char
        struct ublk_params p = {
                .types = UBLK_PARAM_TYPE_BASIC | UBLK_PARAM_TYPE_DISCARD,
                .basic = {
-                       .logical_bs_shift       = 9,
+                       .logical_bs_shift       = 12,
                        .physical_bs_shift      = 12,
                        .io_opt_shift   = 12,
-                       .io_min_shift   = 9,
-                       .max_sectors            = info->max_io_buf_bytes >> 9,
+                       .io_min_shift   = 12,
+                       .max_sectors            = info->max_io_buf_bytes >> 12,
                },

It'd be nice if ublk tgt_loop obtained the I/O alignment requirements from the underlying loopback device and advertised the same values, or alternatively supported a blocksize parameter with ublk add.

BTW, thanks for working on ublk - it's an impressive feature :-)

ming1 commented 2 years ago

t'd be nice if ublk tgt_loop obtained the I/O alignment requirements from the underlying >loopback device and advertised the same values, or alternatively supported a blocksize >parameter with ublk add.

Yeah, it should be easy to figure out these underlying parameters via ioctl(blk device), and just copy the kernel loop's logic.

Feel free to cook a patch!

ming1 commented 2 years ago

Fixed by cfc561a ("tgt_loop: set correct block size if backing file is block device")

ddiss commented 2 years ago

Thanks for the fix! This covers the tgt_loop(zram) case, but there's still an issue with file-backed direct-io for tgt_loop(zram->XFS->file). It could be similarly handled with something like:

--- a/tgt_loop.cpp
+++ b/tgt_loop.cpp
@@ -110,6 +110,8 @@ static int loop_init_tgt(struct ublksrv_dev *dev, int type, int argc, char
                can_discard = backing_supports_discard(file);
        } else if (S_ISREG(st.st_mode)) {
                bytes = st.st_size;
+               p.basic.logical_bs_shift = ilog2(st.st_blksize);
+               p.basic.physical_bs_shift = ilog2(st.st_blksize);
                can_discard = true;
        } else {
                bytes = 0;

Looking at kernel loopback, it appears to disable directio for cases like this.

ming1 commented 2 years ago

Yeah, you are right.

I think buffered io option should be provided as one command line too for ublk-loop.

And if the underlying file/device can't support dio, buffered io needs to be switched to.

However, it is much easier to do this kind of thing in userspace than kernel.

I will submit one patch after related details are figured out.

Thanks,

ming1 commented 2 years ago

Now the issue has been addressed by:

bf54b6b tgt_loop: improve for handling loop direct io

ddiss commented 2 years ago

Now the issue has been addressed by:

bf54b6b tgt_loop: improve for handling loop direct io

Works for me - thanks. FWIW, one helpful statx change which may be relevant in the future is: https://lore.kernel.org/lkml/20220827065851.135710-6-ebiggers@kernel.org/T/