virtio-win / kvm-guest-drivers-windows

Windows paravirtualized drivers for QEMU\KVM
https://www.linux-kvm.org/page/WindowsGuestDrivers
BSD 3-Clause "New" or "Revised" License

vioscsi (generic/block): does not work with xhci hosts that accept only read/write of 512 blocks #498

Closed. tomty89 closed this issue 2 years ago.

tomty89 commented 4 years ago

For example, with https://github.com/torvalds/linux/commit/ec37198acca7b4c17b96247697406e47aafe0605, uas drives connected to an ASM1142 will have max_sectors of 512 blocks instead of 1024 blocks.

However, the Windows guest insists on sending read/write commands with a larger payload (e.g. 520 blocks). The sense data of the failure is apparently ignored (or maybe it's malformed) and the commands keep being retried. I tried setting max_sectors of virtio-scsi-pci to 512 or lower, and it does not help.

Currently qemu manages to overwrite max_xfer_len and opt_xfer_len in the Block Limits VPD with the max_sectors set in the linux host (the non-qemu-specific queue limit) if scsi-generic is used. With scsi-block (only), max_xfer_len is overwritten to MAX_INT (opt_xfer_len in the verbatim VPD is 65535). I don't know whether these fields matter at all for this issue; neither of them helps anyway.
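(For reference, those two fields live in the SBC Block Limits VPD page, 0xB0. A simplified sketch of the layout; field names here are descriptive only, not from any particular source tree:)

```c
#include <stdint.h>

/* Simplified sketch of the SBC Block Limits VPD page (0xB0). On the wire the
 * multi-byte fields are big-endian, and the two transfer-length fields are
 * expressed in logical blocks. Field names are descriptive only. */
struct block_limits_vpd {
    uint8_t  peripheral;              /* peripheral qualifier / device type */
    uint8_t  page_code;               /* 0xB0 */
    uint16_t page_length;
    uint8_t  wsnz;
    uint8_t  max_compare_write_len;
    uint16_t opt_xfer_len_granularity;
    uint32_t max_xfer_len;            /* maximum transfer length, in blocks */
    uint32_t opt_xfer_len;            /* optimal transfer length, in blocks */
    /* ... remaining fields omitted ... */
};
```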

scsi-hd seems to work fine.

I think it is supposed to work as long as I match the max_sectors of virtio-scsi-pci to the queue limit of the drive. Does the vioscsi driver care about it at all (like the scsi driver in a linux guest does)?

vrozenfe commented 4 years ago

@tomty89 Thank you for reporting this issue.

When it comes to scsi-related stuff, including the VPD_BLOCK_LIMITS page, the vioscsi driver fully relies on the data provided by qemu. The vioscsi driver reads the max_sectors information from the scsi_config structure, but doesn't use it for any kind of sanity check or internal adjustment.

It definitely wouldn't be a problem to intercept the VPD_BLOCK_LIMITS call and adjust MaximumTransferLength, but currently I'm not sure whether it has to be handled by the guest driver rather than by qemu.
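For illustration, a minimal sketch of what such an interception could look like (a hypothetical helper, not actual vioscsi code): it clamps the MAXIMUM TRANSFER LENGTH field of a returned 0xB0 page to max_sectors from the virtio-scsi config space.

```c
/* Hypothetical helper, not actual vioscsi code: clamp the Block Limits VPD
 * (page 0xB0) returned to the guest so that MAXIMUM TRANSFER LENGTH
 * (bytes 8-11, big-endian, in blocks) never exceeds max_sectors from the
 * virtio-scsi config space. */
static void ClampBlockLimitsVpd(unsigned char *vpd, unsigned long len,
                                unsigned long max_sectors)
{
    unsigned long reported;

    if (len < 16 || vpd[1] != 0xB0 || max_sectors == 0)
        return;

    reported = ((unsigned long)vpd[8]  << 24) |
               ((unsigned long)vpd[9]  << 16) |
               ((unsigned long)vpd[10] << 8)  |
                (unsigned long)vpd[11];

    if (reported == 0 || reported > max_sectors) {
        vpd[8]  = (unsigned char)(max_sectors >> 24);
        vpd[9]  = (unsigned char)(max_sectors >> 16);
        vpd[10] = (unsigned char)(max_sectors >> 8);
        vpd[11] = (unsigned char)(max_sectors);
    }
}
```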

In any case, please provide the qemu/kernel/virtio driver versions along with the qemu command line to help our QE reproduce the problem.

Best, Vadim.

tomty89 commented 4 years ago

> @tomty89 Thank you for reporting this issue.
>
> When it comes to scsi-related stuff, including the VPD_BLOCK_LIMITS page, the vioscsi driver fully relies on the data provided by qemu. The vioscsi driver reads the max_sectors information from the scsi_config structure, but doesn't use it for any kind of sanity check or internal adjustment.

Are you implying that this could be a compatibility issue in Windows itself? Can you point me to the code where the driver exposes/reports the max_sectors parameter that is set for virtio-scsi-pci on the qemu command line (if it does so at all, and if Windows uses such a value to set the queue limit the way Linux does)?

Ref.:
https://github.com/torvalds/linux/blob/v5.8/drivers/scsi/virtio_scsi.c#L871
https://github.com/torvalds/linux/blob/v5.8/drivers/scsi/scsi_lib.c#L1800
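Roughly what the referenced Linux code does (paraphrased from memory, v5.8, trimmed): the virtio-scsi guest driver copies max_sectors from the device config into the SCSI host, and the midlayer then turns that into the request queue's hard limit.

```c
/* Paraphrased fragments of the two references above (Linux v5.8, trimmed,
 * from memory -- not a self-contained example): */

/* drivers/scsi/virtio_scsi.c, virtscsi_probe(): take max_sectors straight
 * from the virtio-scsi config space, with a fallback if the device reports 0. */
shost->max_sectors = virtscsi_config_get(vdev, max_sectors) ?: 0xFFFF;

/* drivers/scsi/scsi_lib.c, __scsi_init_queue(): apply it as the request
 * queue's hard limit, so the guest never builds larger read/write commands. */
blk_queue_max_hw_sectors(q, shost->max_sectors);
```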

> It definitely wouldn't be a problem to intercept the VPD_BLOCK_LIMITS call and adjust MaximumTransferLength, but currently I'm not sure whether it has to be handled by the guest driver rather than by qemu.

I wasn't suggesting the driver should do so anyway. qemu is already doing that in its scsi-generic.c (which applies to both scsi-block and scsi-generic). It's just that for some reason (I don't know if it's expected or a bug/flaw) the function that gets the queue limit only works with the sg driver.

The thing is, overwriting the BL VPD (with scsi-generic, or with some hard-coded patching for scsi-block, so that the values match the queue limit) doesn't seem to help with this issue anyway (I checked the new values in the guests with sg_vpd), so I wonder whether the VPD matters in Windows at all.

> In any case, please provide the qemu/kernel/virtio driver versions along with the qemu command line to help our QE reproduce the problem.

qemu 5.1.0, linux 5.8.5, virtio driver ISO 0.1.189


tomty89 commented 4 years ago

It seems to me that this should be set to adaptExt->scsi_config.max_sectors: https://github.com/virtio-win/kvm-guest-drivers-windows/blob/mm201/vioscsi/vioscsi.c#L507

This is probably not strictly necessary if/when qemu can get the host-side queue limit and report it via the BL VPD (and if Windows cares about the values). It doesn't seem like the latter is the case though (maybe I'm missing something; it would surprise me a lot if it really isn't; or maybe we missed something in the driver so that Windows ignores the VPD?).

Or maybe there's an alternate approach to limiting read/write length on Windows?

Ref.: https://bugs.launchpad.net/qemu/+bug/1893634

P.S. It would be great if anyone could provide a test build so I can confirm whether my proposal works around this issue.

tomty89 commented 4 years ago

I just tested on an Intel (Haswell) XHCI with the following patch to confirm that the issue can be triggered solely by the max_sectors queue limit:

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 7ec91c3a66ca..e9a675bbcdfb 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -448,7 +448,7 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
        if (sht->max_sectors)
                shost->max_sectors = sht->max_sectors;
        else
-               shost->max_sectors = SCSI_DEFAULT_MAX_SECTORS;
+               shost->max_sectors = 512;

        if (sht->max_segment_size)
                shost->max_segment_size = sht->max_segment_size;

To be more precise, it's only reproducible when the "hard" limit (hw_max_sectors) is 512 blocks. Lowering the "soft" limit via the corresponding writable sysfs file doesn't trigger the issue.

I have yet to test with a BOT drive. For some reason my 16GB thumb drive is ignored/omitted by Windows Setup when it's passed to the guest via virtio-scsi / scsi-block (it is not when passed via ide; only a warning is shown then).

In any case, to lower the hw_max_sectors, either this patch needs to be applied (for XHCI_NO_64BIT_SUPPORT controllers):

diff --git a/drivers/usb/storage/scsiglue.c b/drivers/usb/storage/scsiglue.c
index f4c2359abb1b..56c3f4deacf5 100644
--- a/drivers/usb/storage/scsiglue.c
+++ b/drivers/usb/storage/scsiglue.c
@@ -92,7 +92,7 @@ static int slave_alloc (struct scsi_device *sdev)
 static int slave_configure(struct scsi_device *sdev)
 {
        struct us_data *us = host_to_us(sdev->host);
-       struct device *dev = us->pusb_dev->bus->sysdev;
+       struct device *dev = sdev->host->dma_dev;

        /*
         * Many devices have trouble transferring more than 32KB at a time,

Or the hard-coded "optimum" (e.g. 2048 for SS drives) needs to be changed.

tomty89 commented 4 years ago

> It seems to me that this should be set to adaptExt->scsi_config.max_sectors: https://github.com/virtio-win/kvm-guest-drivers-windows/blob/mm201/vioscsi/vioscsi.c#L507

Meh, never mind. Apparently MaximumTransferLength is in bytes (which is basically insane): https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/storport/ns-storport-_port_configuration_information. So there's no clean way to use it. Hurray.
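In other words, translating the virtio field into that byte-valued Storport field forces you to assume a block size. A sketch of the only obvious mapping (hypothetical, and the assumption is the whole problem):

```c
/* Hypothetical conversion only, to illustrate the unit mismatch: the virtio
 * max_sectors field is conventionally expressed in 512-byte sectors, while
 * Storport's MaximumTransferLength is in bytes, so any mapping has to bake
 * in an assumed block size. */
#define ASSUMED_BLOCK_SIZE 512u

ConfigInfo->MaximumTransferLength =
    adaptExt->scsi_config.max_sectors * ASSUMED_BLOCK_SIZE;
```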

I suppose the focus should be on why the BL VPD is apparently ignored anyway, though.

vrozenfe commented 4 years ago

@tomty89

If you need it, I can make a private build trimming the maximum transfer length and the number of physical breaks according to the max_sectors size.

Best, Vadim.

tomty89 commented 4 years ago

@vrozenfe I'm more curious about what determines the "break size" (if/when it's not max_xfer_len in the VPD), e.g. why it is 520 blocks instead of, say, 512 blocks...

I can test whether MaximumTransferLength has any effect on this if you can provide me a build. It's just that you can at best assume a block size for it, so it won't really provide an ultimate solution regardless of the result.

Btw, the reduction of hw_max_sectors due to https://github.com/torvalds/linux/commit/ec37198acca7b4c17b96247697406e47aafe0605 is apparently a regression anyway; patches have been sent to fix it. That doesn't mean this can never be triggered again, though.

vrozenfe commented 4 years ago

NumberOfPhysicalBreaks (https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/storport/ns-storport-_port_configuration_information), set at https://github.com/virtio-win/kvm-guest-drivers-windows/blob/mm201/vioscsi/vioscsi.c#L504, is the number of SG elements (roughly speaking, the maximum DMA transfer size in pages) in the SGList that the miniport can send to the underlying HW at once: https://github.com/virtio-win/kvm-guest-drivers-windows/blob/mm201/vioscsi/vioscsi.c#L1351

tomty89 commented 4 years ago

Ahh, so 520 comes from ((64 + 1) * 4096 / 512)...Any particular reason for the +1?
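Spelled out (assuming 4 KiB pages and 512-byte logical blocks):

```c
/* Assuming 4 KiB pages and 512-byte logical blocks: 64 SG elements plus one
 * extra for a possibly unaligned first page gives
 *
 *   (64 + 1) * 4096 / 512 = 520 blocks
 *
 * per request, which is exactly what overruns a 512-block hw_max_sectors
 * limit on the host side. */
```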

Then I think adaptExt->scsi_config.seg_max should at most be clamped (rather than set) to MAX_PHYS_SEGMENTS (+1 or not), so that it can at least be lowered via virtqueue_size (and ConfigInfo->NumberOfPhysicalBreaks should be derived from it).

I'm not sure what adaptExt->max_physical_breaks is for, or whether there's a reason it is MAX_PHYS_SEGMENTS instead of MAX_PHYS_SEGMENTS + 1. It seems to be a/the reason that ConfigInfo->NumberOfPhysicalBreaks does not "inherit" from adaptExt->scsi_config.seg_max. (If necessary, could it be adaptExt->scsi_config.seg_max - 1 instead?)

P.S. I wonder if #448 is actually a dup of this...

Edit: So Windows really doesn't care about the SCSI target's VPD? (But only the host's driver?)

vrozenfe commented 4 years ago

+1 is a common practice when programming DMA, because the first block may not be page aligned. You can try redefining the number of physical breaks in the Registry by adjusting the following key:

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\vioscsi\Parameters\Device] "PhysicalBreaks"=dword:000000ff
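As a rough reference, and assuming the value is taken as the number of SG elements of one 4 KiB page each (how the driver actually interprets "PhysicalBreaks" is not verified here):

```c
/* Assumption only: if the registry value counts SG elements of one 4 KiB
 * page each, then
 *
 *   0xff (255) * 4096 bytes = 1044480 bytes ~= 2040 512-byte blocks
 *
 * per request; staying within a 512-block (256 KiB) host-side limit would
 * need a value around 0x40 (or 0x3f, if the driver adds one for an
 * unaligned first page). */
```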

I'm still not sure about https://github.com/virtio-win/kvm-guest-drivers-windows/issues/448. It needs some more investigation.

tomty89 commented 4 years ago

@vrozenfe But then what's the reason for adaptExt->max_physical_breaks to omit the practice?

It's particularly odd that, going by its name, ConfigInfo->NumberOfPhysicalBreaks should simply be equal to adaptExt->max_physical_breaks, but it isn't.

Is it possible that the right thing to do is to clamp adaptExt->scsi_config.seg_max to MAX_PHYS_SEGMENTS and have both physical-breaks values set to that +1? That would seem to make much more sense, to me at least.
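A minimal sketch of that proposal (field names as used in this thread; not a tested patch):

```c
/* Minimal sketch of the proposal above (not a tested patch): clamp the
 * device-reported seg_max to what the miniport supports, then derive both
 * "physical breaks" values from that one number, keeping the +1 for a
 * possibly unaligned first page. */
ULONG seg_max = min(adaptExt->scsi_config.seg_max, (ULONG)MAX_PHYS_SEGMENTS);

adaptExt->max_physical_breaks      = seg_max + 1;
ConfigInfo->NumberOfPhysicalBreaks = seg_max + 1;
```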

tomty89 commented 4 years ago

Besides, I'm not sure the +1 practice has a point unless one of the variables always has to be derived from the static value (MAX_PHYS_SEGMENTS)...

P.S. And what's the point of keeping a max when we ultimately use the max+1...

P.S. Maybe I should check why qemu does some-2...

vrozenfe commented 4 years ago

A bit more on why it is +1:

https://github.com/Microsoft/Windows-driver-samples/blob/826f713a1b51e0cb18d28d493ceb3990f9c94305/storage/class/classpnp/src/xferpkt.c#L70
https://github.com/Microsoft/Windows-driver-samples/blob/826f713a1b51e0cb18d28d493ceb3990f9c94305/storage/class/classpnp/src/class.c#L3438

tomty89 commented 4 years ago

I opened a PR for this: https://github.com/virtio-win/kvm-guest-drivers-windows/pull/502

I tried to explain the heuristic in my (perhaps satirical) comments. Feel free to ask me if it is confusing at any point.

JonKohler commented 2 years ago

I'll address this as part of the fix for https://github.com/virtio-win/kvm-guest-drivers-windows/issues/735