raspberrypi / Raspberry-Pi-OS-64bit

Repository for containing issues on the 64 bit operating system (as distinct from the 32 bit one)

RNDIS USB Gadget fails to function #127

Open kuanyili opened 3 years ago

kuanyili commented 3 years ago

https://www.factoryforward.com/pi-zero-w-headless-setup-windows10-rndis-driver-issue-resolved/

This tutorial also works on a Raspberry Pi 4 Model B (connected to a Windows 10 PC through USB-C) running Raspberry Pi OS 32-bit, but not with Raspberry Pi OS 64-bit, presumably since kernel 5.x.

On Raspberry Pi, you get this in dmesg:

[  477.225918] rndis_msg_parser: unknown RNDIS message 0x0052030C len 4456526
[  477.225935] RNDIS command error -524, 24/24
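One hedged way to read these numbers: as far as I can tell from rndis.c, rndis_msg_parser takes the first two little-endian 32-bit words of the control buffer as the message type and length, and 524 is the kernel-internal ENOTSUPP. The sketch below (Python, illustration only; `words_to_bytes` is a made-up helper) turns the logged values back into the bytes they were read from.

```python
import struct


def words_to_bytes(msg_type: int, msg_len: int) -> bytes:
    """Re-serialise the two logged 32-bit words as little-endian bytes."""
    return struct.pack("<II", msg_type & 0xFFFFFFFF, msg_len & 0xFFFFFFFF)


# Values copied from the dmesg line above.
raw = words_to_bytes(0x0052030C, 4456526)
print(raw.hex(" "))  # 0c 03 52 00 4e 00 44 00

# Assumption: interpret the bytes as a USB string descriptor
# (bLength, bDescriptorType, then UTF-16LE characters).
b_length, b_descriptor_type = raw[0], raw[1]
text = raw[2:].decode("utf-16-le")
print(b_length, b_descriptor_type, repr(text))
```

Read this way, the bytes look like the start of a 12-byte USB string descriptor (type 3) whose text begins with "RND", rather than an RNDIS message. That would fit a buffer that still holds older data, but it is an inference from the numbers, not something the log states.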

And the following message on Windows:

This device cannot start. (Code 10)

{Operation Failed.}
The requested operation was unsuccessful.

Note: Instead of RNDIS, Linux and macOS hosts use the CDC-ECM protocol by default to communicate with g_ether, which works without any error.

manhalt commented 3 years ago

I can confirm this, but I'm not sure why. I tested it on the rpi-5.10.y kernel, and the bug also exists in 5.4. I checked several files related to the driver (like rndis.c/h in https://github.com/raspberrypi/linux/tree/rpi-5.10.y/drivers/usb/gadget/function) and they are the same as upstream and as in Ubuntu 20.10. Interestingly, it works on Ubuntu arm64 with a 5.8 kernel, so it is not directly related to the driver or the arm64 architecture. Maybe someone else has more ideas about where to start digging?

rbilovol commented 3 years ago

The issue is related to the driver for the Raspberry Pi 4's DWC2 USB controller, which doesn't work correctly with DMA in some cases. I have already done some investigation and sent an email to the linux-usb mailing list: https://www.spinics.net/lists/kernel/msg3816513.html

Meanwhile, as a workaround, you can try disabling DWC2 DMA with the following patch:

-------------------------------------8<----------------------------------------
From ced7a3631d9800d04bcbcd756dac4583459fe48c Mon Sep 17 00:00:00 2001
From: Ruslan Bilovol <ruslan.bilovol@xxxxxxxxx>
Date: Wed, 20 Jan 2021 00:27:52 +0200
Subject: [PATCH] usb: dwc2: workaround: disable DMA for gadget

On Raspberry Pi 4 it was observed that in case of control
transfers with DATA phase from a host, the driver for some
reason doesn't copy transferred data to the buffer, leaving
previous data in it.

With disabled DMA the issue isn't reproducible, thus
temporarily disable it

Signed-off-by: Ruslan Bilovol <ruslan.bilovol@xxxxxxxxx>
---
 drivers/usb/dwc2/params.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/usb/dwc2/params.c b/drivers/usb/dwc2/params.c
index 267543c..46c18af 100644
--- a/drivers/usb/dwc2/params.c
+++ b/drivers/usb/dwc2/params.c
@@ -357,7 +357,11 @@ static void dwc2_set_default_params(struct dwc2_hsotg *hsotg)
 {
  struct dwc2_hw_params *hw = &hsotg->hw_params;
  struct dwc2_core_params *p = &hsotg->params;
+#if 0
  bool dma_capable = !(hw->arch == GHWCFG2_SLAVE_ONLY_ARCH);
+#else
+ bool dma_capable = 0;
+#endif

  dwc2_set_param_otg_cap(hsotg);
  dwc2_set_param_phy_type(hsotg);
@@ -651,7 +655,11 @@ static void dwc2_check_params(struct dwc2_hsotg *hsotg)
 {
  struct dwc2_hw_params *hw = &hsotg->hw_params;
  struct dwc2_core_params *p = &hsotg->params;
+#if 0
  bool dma_capable = !(hw->arch == GHWCFG2_SLAVE_ONLY_ARCH);
+#else
+ bool dma_capable = 0;
+#endif

  dwc2_check_param_otg_cap(hsotg);
  dwc2_check_param_phy_type(hsotg);
-- 
1.9.1

timg236 commented 3 years ago

Should be resolved in the latest firmware release https://github.com/raspberrypi/firmware/commit/496e65477e06172ea20602e365d3790632c3cc06

manhalt commented 3 years ago

@timg236 Unfortunately, it is not resolved. The bug is still present, even after updating to 5.10.14 (but only on arm64).

@rbilovol I tested your patch that disables DMA in DWC2, and it does fix the error. But, at least on my system, it leads to no activity on the RNDIS interface: it shows up on both ends (Pi and Windows machine) but does not transfer any data. Did you test this?

rbilovol commented 3 years ago

@manhalt Switching off DMA may lead to other issues, since DWC2 seems not to be well tested (if ever tested) without it. I originally noticed this issue while developing new features for the USB Audio gadget on RPi4, and was able to easily reproduce it with the RNDIS gadget. It seems that fixing the DMA issue should be a much simpler solution than fixing all the bugs that appear after switching off DMA completely.

tatodorov commented 3 years ago

> I can confirm this, but I'm not sure why. I tested it on the rpi-5.10.y kernel, and the bug also exists in 5.4. I checked several files related to the driver (like rndis.c/h in https://github.com/raspberrypi/linux/tree/rpi-5.10.y/drivers/usb/gadget/function) and they are the same as upstream and as in Ubuntu 20.10. Interestingly, it works on Ubuntu arm64 with a 5.8 kernel, so it is not directly related to the driver or the arm64 architecture. Maybe someone else has more ideas about where to start digging?

I just tried it with Ubuntu 20.10 and I see the same problem. The network interface on Windows 10 is not active, and this is from my dmesg:

[    6.088017] kernel: rndis_msg_parser: unknown RNDIS message 0x004B0209 len -1073741310
[    6.088023] kernel: RNDIS command error -524, 24/24
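The logged values can be turned back into bytes to see what the parser was actually looking at. A minimal sketch, assuming the two values are the first eight bytes of the control buffer read as little-endian 32-bit words (the length is printed as a signed integer, so it is masked back to 32 bits first). `ConfigHeader` and `decode` are names made up for this example; the field layout is that of the first eight bytes of a standard USB configuration descriptor.

```python
import struct
from collections import namedtuple

# First eight bytes of a standard USB configuration descriptor.
ConfigHeader = namedtuple(
    "ConfigHeader",
    "bLength bDescriptorType wTotalLength bNumInterfaces "
    "bConfigurationValue iConfiguration bmAttributes",
)


def decode(msg_type: int, msg_len: int) -> ConfigHeader:
    """Rebuild the raw bytes from the logged words and unpack them."""
    raw = struct.pack("<II", msg_type & 0xFFFFFFFF, msg_len & 0xFFFFFFFF)
    return ConfigHeader._make(struct.unpack("<BBHBBBB", raw))


# Values copied from the dmesg line above.
header = decode(0x004B0209, -1073741310)
print(header)
```

Read this way, the buffer starts like a 9-byte configuration descriptor (bDescriptorType 2, wTotalLength 75) rather than an RNDIS message. That is an interpretation of the numbers, not a confirmed diagnosis.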

Of the 64-bit kernels I've tried (4.19, 5.4, 5.8), the only one that works for me is the rpi-4.19.y branch of https://github.com/raspberrypi/linux.git.

pavhofman commented 3 years ago

I am hitting the same problem on the audio gadget (data in control messages is not passed when DMA is enabled). Interestingly, an RPi4 with 1GB RAM works perfectly, while an RPi4 with 4GB RAM, with the same SD card and the same DWC2 IP version and configuration, fails (in DMA mode).

GSNPSID = 0x4f54280a GHWCFG1 = 0x00000000 GHWCFG2 = 0x228ddd50 GHWCFG3 = 0x0ff000e8 GHWCFG4 = 0x1ff00020

We have been trying to troubleshoot the issue with the Synopsys engineer in charge of the dwc2 Linux driver, trying various DMA burst lengths and toggling the wait for outstanding AXI writes (bit 4 of GAHBCFG), so far with no result. It looks like a timing issue to me...
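For readers who want to interpret the register dump above, here is a small sketch that decodes two of the values. The field positions are assumptions taken from my reading of drivers/usb/dwc2/hw.h (architecture field in bits 4:3 of GHWCFG2, with 0 = slave-only, 1 = external DMA, 2 = internal DMA) and should be checked against your kernel tree. `core_id` and `architecture` are hypothetical helper names.

```python
GSNPSID = 0x4F54280A
GHWCFG2 = 0x228DDD50

# Assumed layout of the GHWCFG2 architecture field (bits 4:3).
ARCH_SHIFT = 3
ARCH_MASK = 0x3
ARCH_NAMES = {0: "slave-only", 1: "external DMA", 2: "internal DMA"}


def core_id(snpsid: int):
    """Split GSNPSID into its ASCII identifier and a printable revision."""
    ident = (snpsid >> 16).to_bytes(2, "big").decode("ascii")
    rev = snpsid & 0xFFFF
    version = f"{rev >> 12:x}.{(rev >> 4) & 0xFF:02x}{rev & 0xF:x}"
    return ident, version


def architecture(ghwcfg2: int) -> str:
    """Return the DMA architecture name encoded in GHWCFG2."""
    return ARCH_NAMES.get((ghwcfg2 >> ARCH_SHIFT) & ARCH_MASK, "unknown")


print(core_id(GSNPSID))       # ('OT', '2.80a')
print(architecture(GHWCFG2))  # internal DMA
```

Under these assumptions the core reports itself as DMA-capable, which is the condition the workaround patch earlier in the thread overrides by forcing dma_capable to 0.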

lurch commented 3 years ago

> Interestingly, an RPi4 with 1GB RAM works perfectly, while an RPi4 with 4GB RAM, with the same SD card and the same DWC2 IP version and configuration, fails (in DMA mode).

See the "Address map" section in https://datasheets.raspberrypi.org/bcm2711/bcm2711-peripherals.pdf - I believe that with only 1GB RAM the "legacy peripherals" in the VideoCore and the ARM see exactly the same regions of memory, but with more than 1GB of memory there are some parts of memory that the ARM can see but the legacy peripherals can't.

pelwell commented 3 years ago

Yes, but the kernel should be able to cope with that. An investigation is underway.

pavhofman commented 3 years ago

Thanks a lot, this is an important issue for the gadget feature. I am ready to do any tests/debugfs dumps if needed.

pelwell commented 3 years ago

OK - I understand the problem, and will put together a patch to try and fix it. Are you willing and able to build a kernel to test?

pavhofman commented 3 years ago

I have the rpi-5.12.y branch ready for testing any patches, thanks.

pelwell commented 3 years ago

There should be something to test tomorrow. If compiles are slow you could start one now in preparation - the patch will only affect the dwc2 driver module.

pavhofman commented 3 years ago

Thanks, I will be happy to test. I have cross-compiled & tested a number of changes to the dwc2 and other modules on the latest raspbian64 recently.

pelwell commented 3 years ago

https://github.com/raspberrypi/linux/pull/4326 is a PR against the rpi-5.12.y branch (it applies cleanly to rpi-5.10.y as well) ready for testing. It works for me, and I think one can reason that if the original code was correct then the patch would be a no-op (apart from adding a field to a structure), so it should be safe.

pelwell commented 3 years ago

The bug causes the driver to use the wrong DMA direction when unmapping the buffer used to receive RNDIS packets. The reason it doesn't manifest on the 32-bit kernel is that the buffers allocated to receive the RNDIS packets coincidentally end up in the first 1GB of RAM, making the DMA map/unmap calls a no-op. This is not the case with the 64-bit kernel which appears to put the buffers higher in memory, meaning that the unmap call after a reception has to memcpy the data to the final destination, and misinterpreting the DMA direction skips that vital copy.
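The mechanism described above can be illustrated with a toy model. This is not kernel code: `ToyDma`, `receive`, the direction constants and the 1 GiB limit are all made up for the illustration, and the real DMA mapping layer is considerably more involved. The sketch only shows why a wrong direction at unmap time is harmless when the buffer is directly reachable by the device, and loses data when a bounce copy is needed.

```python
LOW_MEMORY_LIMIT = 1 << 30  # assumed reach of the device: first 1 GiB

TO_DEVICE = "to_device"
FROM_DEVICE = "from_device"


class ToyDma:
    """Toy bounce-buffer model of DMA map/unmap."""

    def map(self, buf: bytearray, phys_addr: int, direction: str) -> bytearray:
        if phys_addr + len(buf) <= LOW_MEMORY_LIMIT:
            return buf  # device can reach the buffer: mapping is a no-op
        bounce = bytearray(len(buf))
        if direction == TO_DEVICE:
            bounce[:] = buf  # copy outgoing data into the bounce buffer
        return bounce

    def unmap(self, buf: bytearray, dev_buf: bytearray, direction: str) -> None:
        if dev_buf is buf:
            return  # nothing was bounced, nothing to copy back
        if direction == FROM_DEVICE:
            buf[:] = dev_buf  # copy received data to its final destination


def receive(phys_addr: int, unmap_direction: str) -> bytes:
    """Simulate one reception into a buffer that holds stale data."""
    dma = ToyDma()
    buf = bytearray(b"old-string")
    dev_buf = dma.map(buf, phys_addr, FROM_DEVICE)
    dev_buf[:] = b"RNDIS_INIT"  # the "device" writes the new payload
    dma.unmap(buf, dev_buf, unmap_direction)
    return bytes(buf)


low, high = 0x1000_0000, 0x8000_0000
print(receive(low, TO_DEVICE))     # wrong direction, but no bounce needed
print(receive(high, FROM_DEVICE))  # bounce needed, correct direction
print(receive(high, TO_DEVICE))    # bounce needed, wrong direction
```

In this model only the last case returns the stale contents, which mirrors the symptom reported earlier in the thread: the parser sees old data instead of the RNDIS message.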

pavhofman commented 3 years ago

I tested the patch on 1GB and 4GB models, and it now works correctly on both. Thanks a lot for your help, this was a difficult bug to catch.

tatodorov commented 3 years ago

Just applied the patch to the latest rpi-5.12.y and tested it on Raspberry Pi 4 Model B Rev 1.4 (8 GB RAM). I am not facing this issue anymore.

@pelwell, thank you very much for your help!

popcornmix commented 3 years ago

The fix should also be in the rpi-update 5.10 kernel.

manhalt commented 3 years ago

Thank you for the fix, it works with rpi-update.

Unfortunately, it is not included in (or not working with) the latest kernel release (raspberrypi-kernel_1.20210527-1). I built a new arm64 image with pi-gen yesterday and the RNDIS gadget is still broken. Will this be released soon?