Open hfp-ak opened 4 years ago
Hi Andreas, can you check the partition alignment? On Linux, run fdisk -l -u /dev/sdc7. On Windows, run msinfo32.exe, go to Components > Storage > Disks, and look for the Partition Starting Offset field in the right-hand pane. By the way, what are the Windows OS version and architecture?
Best, Vadim.
I have attached the output below. However, I cannot see why alignment should make any difference on the physical (not logical) Windows drive.
I did more tests. The sector count (122865120) remains exactly the same even if you use qemu-sata (userland emu in qemu), qemu-virtio-scsi (userland emu in qemu) or LIO/vhost. So the reduced sector count has nothing to do with LIO/vhost or virtio-scsi.
Windows is 1809 on amd64.
# fdisk -l -u /dev/sdc7
Disk /dev/sdc7: 58.6 GiB, 62914560000 bytes, 122880000 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xdcc36dd5
Device Boot Start End Sectors Size Id Type
/dev/sdc7p1 * 2048 1126399 1124352 549M 7 HPFS/NTFS/exFAT
/dev/sdc7p2 1126400 122877951 121751552 58.1G 7 HPFS/NTFS/exFAT
Description Disk drive
Manufacturer (Standard disk drives)
Model LIO-ORG win SCSI Disk Device
Bytes/Sector 512
Media Loaded Yes
Media Type Fixed hard disk
Partitions 2
SCSI Bus 0
SCSI Logical Unit 0
SCSI Port 1
SCSI Target ID 1
Sectors/Track 63
Size 58,59 GB (62.906.941.440 bytes)
Total Cylinders 7.648
Total Sectors 122.865.120
Total Tracks 1.950.240
Tracks/Cylinder 255
Partition Disk #0, Partition #0
Partition Size 549,00 MB (575.668.224 bytes)
Partition Starting Offset 1.048.576 bytes
Partition Disk #0, Partition #1
Partition Size 58,06 GB (62.336.794.624 bytes)
Partition Starting Offset 576.716.800 bytes
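The two listings above can be cross-checked with a little arithmetic. The sketch below is only one reading of the figures, not something confirmed in this thread: it assumes that msinfo32 derives its totals from the 255 x 63 geometry it prints and rounds the disk down to whole cylinders. The variable names are illustrative.

```python
SECTOR_SIZE = 512        # "Bytes/Sector" in msinfo32, sector size in fdisk
HEADS = 255              # "Tracks/Cylinder" in msinfo32
SECTORS_PER_TRACK = 63   # "Sectors/Track" in msinfo32

linux_sectors = 122_880_000  # sector count fdisk prints for /dev/sdc7

# Assumption: the Windows totals are whole cylinders of the reported geometry.
sectors_per_cylinder = HEADS * SECTORS_PER_TRACK
whole_cylinders = linux_sectors // sectors_per_cylinder
chs_sectors = whole_cylinders * sectors_per_cylinder
chs_bytes = chs_sectors * SECTOR_SIZE

print("cylinders:", whole_cylinders)
print("sectors:", chs_sectors)
print("bytes:", chs_bytes)
print("sectors dropped by rounding:", linux_sectors - chs_sectors)

# Partition offsets from msinfo32 (bytes) against fdisk start sectors.
offsets = {1_048_576: 2048, 576_716_800: 1_126_400}
for byte_offset, start_sector in offsets.items():
    print(byte_offset, byte_offset == start_sector * SECTOR_SIZE)
```

Under that assumption the numbers reproduce the "Total Cylinders", "Total Sectors" and "Size" fields exactly, and both partition offsets match the fdisk start sectors, so the partitions themselves sit where Linux says they do.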
So the bug that really blocks me is with LUKS. Why is it not working over LIO/vhost?
Can you check the Windows Event Log and see if there is any error reported by the disk subsystem? I can only guess that LUKS or LIO/vhost might affect some of the disk-related parameters reported to Windows. I will ask QE to take a look at the problem and check whether we officially support LUKS over LIO/vhost.
Vadim.
Using some help from the target-devel mailing list, I was able to track down the root cause. It is the following code in dm-crypt: https://elixir.bootlin.com/linux/v5.3.18/source/drivers/md/dm-crypt.c#L1251
It says that BIOs have to be aligned, which is not true for some SCSI commands coming from the virtio-scsi Windows driver. I clearly proved that by adding some debug code to the kernel.
So I think the first question is: do BIOs have to be aligned? Is that part of their specification? And if so, what can virtio-scsi do to be aligned in all cases?
Hi, I tried to reproduce this issue with the following steps, using virtio-scsi build 171. I can see that the disk size in the Windows guest differs from the size shown on the host, but I cannot reproduce the "disk cannot be formatted" issue. I'm not sure whether the steps I tested really are LUKS over LIO/vhost, so please correct me if they are not. Thanks~
Reproduce steps:
root@dell-per440-05 /home/kar # fdisk -l -u /dev/sdf
Disk /dev/sdf: 2 GiB, 2147483648 bytes, 4194304 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 524288 bytes
Create the LUKS image. Because LUKS needs room for its header, the image size must be set smaller than the disk size.
qemu-img create --object secret,id=sec0,data=backing -f luks -o key-secret=sec0 /dev/sdb 9G
qemu-img create --object secret,id=sec0,data=backing -f luks -o key-secret=sec0 /dev/sdf 1.99G
Boot the win10-64 VM with the LUKS image:
-device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,num_queues=24,bus=pcie-root-port-2,addr=0x0 \
-blockdev node-name=file_image1,driver=file,aio=threads,filename=/home/kvm_autotest_root/images/win10-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
-blockdev node-name=drive_image1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_image1 \
-device scsi-hd,id=image1,drive=drive_image1,bootindex=0,write-cache=on \
-object secret,id=sec0,data=backing \
-blockdev driver=file,cache.direct=off,cache.no-flush=on,filename=/dev/sdf,node-name=my_file \
-blockdev driver=luks,node-name=my,file=my_file,key-secret=sec0 \
-device scsi-hd,drive=my \
Format the disks in the guest's Disk Management. Both disks are reported smaller than their original size, and both can be formatted successfully.
wmic diskdrive get size --/dev/sdb
10737418240
wmic diskdrive get size --/dev/sdf
213857280
I also tested passing the device /dev/sdf through directly (without LUKS). The disk size is again smaller than the original disk size.
wmic diskdrive get size --/dev/sdf
2146798080
The qemu command line was:
-blockdev driver=host_device,cache.direct=off,cache.no-flush=on,filename=/dev/sdf,node-name=my_file \
-blockdev driver=raw,node-name=my,file=my_file \
-device scsi-block,drive=my \
Best Regards~ Peixiu
Hi,
I ran into the same issue as the one reported here. Below are more detailed steps to reproduce it, followed by the root cause of the IO failures.
1) In the qemu host, format a disk with LUKS. In this test, I have chosen /dev/sdc1 which is 1G in size.
# cryptsetup -v luksFormat /dev/sdc1
2) Open the LUKS disk and assign a device mapper name to it.
# cryptsetup open /dev/sdc1 testcrypt
A new block device, /dev/mapper/testcrypt, is now available for applications to use as the encrypted disk.
3) Using targetcli, create a LIO backstore device using /dev/mapper/testcrypt as its backend.
# targetcli /backstores/block create name=cryptdev dev=/dev/mapper/testcrypt
4) Create a vhost wwn using targetcli
# targetcli /vhost create
5) Map the block device created in step 3) to the vhost wwn created in step 4.
# targetcli /vhost/naa.50014055443ed648/tpg1/luns create /backstores/block/cryptdev
6) targetcli configuration now looks like this:
[root@ca-nfsdev4 debug]# targetcli ls
o- / ................................................................................ [...]
  o- backstores ..................................................................... [...]
  | o- block ......................................................... [Storage Objects: 1]
  | | o- cryptdev ................ [/dev/mapper/testcrypt (1022.0MiB) write-thru activated]
  | |   o- alua .......................................................... [ALUA Groups: 1]
  | |     o- default_tg_pt_gp .............................. [ALUA state: Active/optimized]
  | o- fileio ........................................................ [Storage Objects: 0]
  | o- pscsi ......................................................... [Storage Objects: 0]
  | o- ramdisk ....................................................... [Storage Objects: 0]
  o- iscsi ................................................................... [Targets: 0]
  o- loopback ................................................................ [Targets: 0]
  o- srpt .................................................................... [Targets: 0]
  o- vhost ................................................................... [Targets: 1]
    o- naa.50014055443ed648 ..................................................... [TPGs: 1]
      o- tpg1 ............................................. [naa.50014052a4c40b91, no-gen-acls]
        o- acls ................................................................. [ACLs: 0]
        o- luns ................................................................. [LUNs: 1]
          o- lun0 ............. [block/cryptdev (/dev/mapper/testcrypt) (default_tg_pt_gp)]
[root@ca-nfsdev4 debug]#
7) Now, boot a windows 2010 guest by attaching the vhost device that is created in step 6. Ensure the windows image has virtio drivers installed.
# qemu-system-x86_64 --enable-kvm -hda win2k10.img -m 16384 -rtc base=localtime,clock=host -smp cores=8,threads=16 -netdev user,id=user.0 -device e1000,netdev=user.0 -device vhost-scsi-pci,wwpn=naa.50014055443ed648
8) Format operation from windows would fail on the vhost-scsi device. At the same time, the qemu host would log the following bio errors to the console.
Jun 11 09:51:59 localhost kernel: bio error: ffff990b443e7100, err: 10
Jun 11 09:51:59 localhost kernel: bio error: ffff991d426cad00, err: 10
Jun 11 09:51:59 localhost kernel: bio error: ffff991d7c6ccf00, err: 10
Jun 11 09:51:59 localhost kernel: bio error: ffff991b6399b200, err: 10
Jun 11 09:51:59 localhost kernel: bio error: ffff990b443e7600, err: 10
Jun 11 09:51:59 localhost kernel: bio error: ffff991c03266800, err: 10
Jun 11 09:51:59 localhost kernel: bio error: ffff991d7c6ccf00, err: 10
Jun 11 09:51:59 localhost kernel: bio error: ffff990b5b469b00, err: 10
Jun 11 09:51:59 localhost kernel: bio error: ffff991b6399a800, err: 10
Jun 11 09:51:59 localhost kernel: bio error: ffff991b632fda00, err: 10
Jun 11 09:51:59 localhost kernel: bio error: ffff991b6399a800, err: 10
Jun 11 09:51:59 localhost kernel: bio error: ffff991b632fda00, err: 10
Jun 11 09:51:59 localhost kernel: bio error: ffff991b636f2f00, err: 10
Jun 11 09:51:59 localhost kernel: bio error: ffff991b632fda00, err: 10
Jun 11 09:51:59 localhost kernel: bio error: ffff991d3c0ddf00, err: 10
9) These bio errors are happening in dm-crypt because the buffer length is not a multiple of sector size which is 512 bytes.
static int crypt_convert_block_skcipher(struct crypt_config *cc,
                                        struct convert_context *ctx,
                                        struct skcipher_request *req,
                                        unsigned int tag_offset)
{
        ....
        ....
        /* Reject unexpected unaligned bio. */
        if (unlikely(bv_in.bv_len & (cc->sector_size - 1)))
                return -EIO;
        ....
        ....
}
static int crypt_convert_block_aead(struct crypt_config *cc,
                                    struct convert_context *ctx,
                                    struct aead_request *req,
                                    unsigned int tag_offset)
{
        ....
        ....
        /* Reject unexpected unaligned bio. */
        if (unlikely(bv_in.bv_len & (cc->sector_size - 1)))
                return -EIO;
        ....
        ....
}
10) Debugging this further in dm-crypt, I found that vhost receives a SCSI Write request for 6656 bytes. 6656 is a multiple of 512 bytes, so the overall length is fine. However, vhost receives the data in 3 scatter-gather buffers: the first has a length of 1584 bytes, the second 4096 bytes and the third 976 bytes, giving a total of 6656 bytes in the data-out buffer. When this bio request is handed over to dm-crypt, the validation shown in step 9) fails because the first buffer, 1584 bytes, is not sector aligned. Hence the IO error. dm-crypt requires each buffer's length to be a multiple of the dm-crypt sector size, which is not honored here. In contrast, the Linux guest always ensures that the individual buffer lengths are multiples of the sector size, so this issue is not seen with the Linux virtio driver.
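The check from step 9) can be replayed in user space on the three lengths reported in step 10). This is a minimal sketch, not kernel code: the function name unaligned_segments is made up for illustration, and a 512-byte dm-crypt sector size is assumed, as in the report.

```python
SECTOR_SIZE = 512  # dm-crypt sector size assumed in the report


def unaligned_segments(segment_lengths, sector_size=SECTOR_SIZE):
    """Return (index, length) for each segment that fails the mask test."""
    mask = sector_size - 1  # only valid when sector_size is a power of two
    return [(i, n) for i, n in enumerate(segment_lengths) if n & mask]


segments = [1584, 4096, 976]  # scatter-gather lengths from step 10)
total = sum(segments)

print("total bytes:", total, "aligned:", total % SECTOR_SIZE == 0)
print("segments failing the check:", unaligned_segments(segments))
```

The total passes while the first and third segments do not, which matches the description: the request as a whole is sector sized, but the kernel check rejects it as soon as it meets a segment whose length is not.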
If the Windows virtio driver ensured that the individual scatter-gather buffers are multiples of the sector size, this issue would be resolved.
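One conceivable way to obtain sector-multiple buffers is to copy the data into a contiguous bounce buffer and split it again on aligned boundaries. The sketch below only illustrates that idea on plain byte strings. It is hypothetical and does not describe how vioscsi or vhost actually build their buffers; resplit_aligned and max_len are names invented for this example.

```python
SECTOR_SIZE = 512


def resplit_aligned(segments, max_len=4096, sector_size=SECTOR_SIZE):
    """Copy segments into new ones whose lengths are multiples of sector_size.

    Hypothetical bounce-buffer approach: costs one extra copy of the data.
    """
    if max_len <= 0 or max_len % sector_size:
        raise ValueError("max_len must be a positive multiple of sector_size")
    data = b"".join(segments)  # single contiguous bounce buffer
    if len(data) % sector_size:
        raise ValueError("total length is not a multiple of sector_size")
    return [data[i:i + max_len] for i in range(0, len(data), max_len)]


# Same lengths as in step 10), filled with distinct marker bytes.
original = [b"\x01" * 1584, b"\x02" * 4096, b"\x03" * 976]
aligned = resplit_aligned(original)

print([len(s) for s in aligned])
```

The payload is unchanged and only the segment boundaries move. The price is an extra copy, which is one reason a real driver might prefer to allocate aligned buffers in the first place.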
Let me know if any debugging is required on the vhost side in Linux.
Thanks Sudhakar
To isolate this problem further, I ran a similar test with the iSCSI transport instead. That is, windows 2010 runs an iSCSI initiator that connects to an iSCSI target backed by the dm-crypt block device. In this case, the format completed successfully. This suggests that the Windows SCSI stack is probably fine in terms of how the sg buffers are constructed, and that the issue lies either in the Windows virtio driver or in the vhost driver.
@ssudhakarp Can you please post the output of the qemu "info qtree" command? I am mostly interested in the vhost-scsi-pci related information. Can you also give virtio-scsi-pci a try instead of vhost? And what is windows 2010? Best regards, Vadim.
@vrozenfe 1) Output of "info qtree":
bus: pci.0
type PCI
dev: vhost-scsi-pci, id ""
vectors = 4 (0x4)
indirect_desc = true
event_idx = true
vhostfd = ""
wwpn = "naa.50014055443ed648"
num_queues = 1 (0x1)
max_sectors = 65535 (0xffff)
cmd_per_lun = 128 (0x80)
addr = 05.0
romfile = ""
rombar = 1 (0x1)
multifunction = false
command_serr_enable = true
class SCSI controller, addr 00:05.0, pci id 1af4:1004 (sub 1af4:0008)
bar 0: i/o at 0xc040 [0xc07f]
bar 1: mem at 0xfebf1000 [0xfebf1fff]
bus: virtio-bus
type virtio-pci-bus
dev: vhost-scsi, id ""
vhostfd = ""
wwpn = "naa.50014055443ed648"
num_queues = 1 (0x1)
max_sectors = 65535 (0xffff)
cmd_per_lun = 128 (0x80)
2) virtio-scsi-pci works fine. I believe this is because qemu is involved in the data path, so the buffers are submitted through the file system interface, which sidesteps dm-crypt's alignment requirement. With vhost-scsi-pci, qemu is not involved in the data path and the buffers are handled directly by the vhost driver. This is my understanding of why it works with virtio-scsi-pci but not with vhost-scsi-pci. Correct me if I am wrong.
3) I meant windows 10.
Thanks Sudhakar
Hi @ssudhakarp ,
I want to reproduce this issue with the steps you provided, but I hit some problems. In step 4 (create a vhost wwn using targetcli), the command "targetcli /vhost create" reports the error "No such path /vhost". I checked targetcli ls, and vhost is not shown:
[root@dell-per440-01 kar]# targetcli ls
o- / ................................................................................ [...]
  o- backstores ..................................................................... [...]
  | o- block ......................................................... [Storage Objects: 1]
  | | o- cryptdev ................. [/dev/mapper/testcrypt (984.0MiB) write-thru activated]
  | |   o- alua .......................................................... [ALUA Groups: 1]
  | |     o- default_tg_pt_gp .............................. [ALUA state: Active/optimized]
  | o- fileio ........................................................ [Storage Objects: 0]
  | o- pscsi ......................................................... [Storage Objects: 0]
  | o- ramdisk ....................................................... [Storage Objects: 0]
  o- iscsi ................................................................... [Targets: 1]
  | o- iqn.2003-01.org.linux-iscsi.dell-per440-01.x8664:sn.a0afaadfd6cb ......... [TPGs: 1]
  |   o- tpg1 ...................................................... [no-gen-acls, no-auth]
  |     o- acls ................................................................. [ACLs: 0]
  |     o- luns ................................................................. [LUNs: 1]
  |     | o- lun0 ............. [block/cryptdev (/dev/mapper/testcrypt) (default_tg_pt_gp)]
  |     o- portals ........................................................... [Portals: 1]
  |       o- 0.0.0.0:3260 ............................................................ [OK]
  o- loopback ................................................................ [Targets: 0]
I tried to load the tcm_vhost module, but that failed with the following error:
[root@dell-per440-01 kar]# modprobe tcm_vhost
modprobe: FATAL: Module tcm_vhost not found in directory /lib/modules/4.18.0-240.11.1.el8_3.x86_64
Could you help to have a look?
Versions used:
kernel-4.18.0-240.11.1.el8_3.x86_64
qemu-kvm-5.1.0-17.module+el8.3.1+9213+7ace09c3.x86_64
seabios-bin-1.14.0-1.module+el8.3.0+7638+07cf13d2.noarch
targetcli-2.1.fb49-1.el8.noarch
Thanks in advance!! Peixiu
I have a similar issue.
The funny part is that it worked the first time but failed to install. Then it failed on diskpart clear.
After a reboot, I ran wipefs from the host. The screenshot is from when I clicked create partition.
@peixiu Your kernel is probably too old. I used 5.10.16 from the Arch Linux repository with qemu 5.2.0, ovmf, Windows 10 20h2 and virtio-win.iso 190.
These are the commands I used to set this up. My LV is on top of dm-crypt.
$ sudo targetcli
targetcli shell version 2.1.53
Copyright 2011-2013 by Datera, Inc and others.
For help on commands, type 'help'.
/backstores/fileio> /backstores/block create win10test /dev/mobile-arch/
/dev/mobile-arch/macos-vm /dev/mobile-arch/root /dev/mobile-arch/swap /dev/mobile-arch/win10-vm
/dev/mobile-arch/win10test /dev/mobile-arch/win98-vm
/backstores/fileio> /backstores/block create win10test /dev/mobile-arch/win10test
Created block storage object win10test using /dev/mobile-arch/win10test.
/backstores/fileio> /backstores/block/win10test set attribute is_nonrot=1
Parameter is_nonrot is now '1'.
/backstores/fileio> /vhost/ create
Created target naa.5001405b2917d1a5.
Created TPG 1.
/backstores/fileio> /vhost/naa.5001405b2917d1a5/tpg1/luns
@last bookmarks cd create delete exit get help ls pwd refresh set
status
/backstores/fileio> /vhost/naa.5001405b2917d1a5/tpg1/luns create /backstores/block/win10test
Created LUN 0.
/backstores/fileio> exit
Global pref auto_save_on_exit=true
Last 10 configs saved in /etc/target/backup/.
Configuration saved to /etc/target/saveconfig.json
$ systemctl start target
$ virsh attach-device win10test /dev/stdin --persistent << EOL
<hostdev mode='subsystem' type='scsi_host'>
<source protocol='vhost' wwpn='naa.5001405b2917d1a5'/>
</hostdev>
EOL
Hi JuniorJPDJ,
I found the /vhost option when testing with kernel-5.11.0-1.el9.x86_64 and qemu-kvm-5.2.0-7.el9.x86_64, thank you~ But when I execute the "targetcli /vhost create" command, I hit the error "Could not create VhostFabricModule in configFS". Does an FC card need to be connected to the host, or do any other settings need to be changed?
[root@ibm-x3250m6-06 home]# targetcli /vhost create
Could not create VhostFabricModule in configFS
Thanks a lot~ Peixiu
It's purely software; no additional hardware is needed as long as virtualization is supported on the host. You may need to modprobe some module. It might be easier to try on another distro, even in a VM with nested virtualization enabled.
Hi all,
I reproduced this issue with kernel-5.9.16-200.fc33.x86_64 and qemu-kvm-5.1.0-8.fc33.x86_64, testing on my laptop OS.
1) Tested with "-device vhost-scsi-pci,wwpn=naa.5001405b8504f669": the disk cannot be formatted.
2) Tested with "-device virtio-scsi-pci,id=virtio_scsi_pci0 -blockdev driver=host_device,cache.direct=off,cache.no-flush=on,filename=/dev/sdb,node-name=my_file -blockdev driver=raw,node-name=my,file=my_file -device scsi-block,drive=my": the disk can be formatted successfully.
Both tests used the same disk (/dev/sdb) and the same vioscsi driver version (virtio-win-prewhql-196).
For 1) test, detail steps as follows:
On the latest RHEL8.4.0 host there is no /vhost support in targetcli. On the latest RHEL9 host, "targetcli /vhost create" hits the "Could not create VhostFabricModule in configFS" error, and I could not find a way to solve it, so I had to use Fedora to reproduce the issue. Also, with the RHEL8/RHEL9 qemu-kvm versions, /usr/libexec/qemu-kvm --device help does not list vhost-scsi-pci device support~ @vrozenfe Given this situation, do we need to file a new bug on Bugzilla to track this issue?
Thanks all~ Peixiu
I am using virtio-scsi (0.1.171-1) with the LIO/vhost backend. That works great with Linux and with Windows in normal cases. By normal I mean that targetcli has an iblock object for, let's say, /dev/sdc7 and a vhost target with a LUN that references the iblock object.
However, on Windows the disk size differs from what Linux says about /dev/sdc7.
Windows:
So for some reason the physical disk in Windows has fewer sectors than the Linux device? Do you know what is going on here?
When I then use LUKS/cryptsetup to create a /dev/mapper/crypt device and pass that as the iblock object, I cannot even format the disk from Windows (it works perfectly with Linux as a qemu guest).
With Windows I get tons of errors in the host dmesg:
Any idea about that?
Regards Andreas