dorssel / usbipd-win

Windows software for sharing locally connected USB devices to other machines, including Hyper-V guests and WSL 2.
GNU General Public License v3.0
3.79k stars 234 forks source link

[WSL2] Null bytes in virtual com port with a stlinkv2.1 #111

Closed bhav97 closed 2 years ago

bhav97 commented 3 years ago

Hi

I am running into a weird issue while using the stlink on my STM8 Nucleo-32 board. I have updated the stlink firmware to the latest available (V2J27M7) but the issue persists.

If I open the serial port on a windows application, everything works expected. After using usbipd to attach to wsl, I only read 0x0 from the serial port after an initial burst of data.

programming features of the stlink seem to be working as expected (using stm8flash). I have not used the mass storage device.

$ sudo tio /dev/ttyACM0
[tio 21:11:05] tio v1.32
[tio 21:11:05] Press ctrl-t q to quit
[tio 21:11:05] Connected
lloxxx
      Helloxxx
              Helloxxx
                      Helloxxx
                              Helloxxx
                                      Helloxxx
                                              Helloxxx
                                                      Helxx
                                                           Helloxxx
                                                                   Helloxxx
                                                                           Helloxxx
                                                                                   Heelloxxx
                                                                                            Helloxxx
                                                                                                    Helloxxx
                                                                                                            Helloxloxxx
                                                                                                                       Helloxxx
       Helloxxx
               Helloxxx
                       Helloxxx
                               Helloxxx
                                       Helloxxx
                                               Helloxxx
                                                       Helloxxx
                                                               Hello
[tio 21:11:08] Switched to hexadecimal mode
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[tio 21:11:09] Switched to normal mode

[tio 21:11:11] Disconnected

I don't see any exceptions in the usbipd log, and I'm not sure if it is a usbipd problem but I was hoping you could point me in the right direction usbipd.log

dorssel commented 3 years ago

Apart from the large burst of:

dbug: UsbIpServer.AttachedClient[1000] USBSUP_XFER_STALL -> EPIPE -> -32

log entries (which is a bit odd, but not wrong per se), I see nothing weird in the log. But tio may not be the best low-level serial port tool out there. In fact, from the man page tio is more for generic tty terminals than serial ports. I personally use minicom for low level serial communication.

bhav97 commented 3 years ago

minicom gives similar output

Welcome to minicom 2.7.1

OPTIONS: I18n
Compiled on Dec 23 2019, 02:06:26.
Port /dev/ttyACM0, 21:32:24

Press CTRL-A Z for help on special keys

�xx
   Helloxxx
           Hello

CTRL-A Z for help | 115200 8N1 | NOR | Minicom 2.7.1 | VT102 | Offline | ttyACM0                                        

And switching to hex output gives a bunch of zeros I do see some entries in dmesg (not sure if this is a problem)

[63003.141124] vhci_hcd vhci_hcd.0: pdev(0) rhport(0) sockfd(3)
[63003.141127] vhci_hcd vhci_hcd.0: devid(65538) speed(2) speed_str(full-speed)
[63003.141176] vhci_hcd vhci_hcd.0: Device attached
[63003.429083] vhci_hcd: vhci_device speed not set
[63003.498983] usb 1-1: new full-speed USB device number 20 using vhci_hcd
[63003.588881] vhci_hcd: vhci_device speed not set
[63003.658870] usb 1-1: SetAddress Request (20) to port 0
[63003.718455] usb 1-1: New USB device found, idVendor=0483, idProduct=374b, bcdDevice= 1.00
[63003.718458] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[63003.718460] usb 1-1: Product: STM32 STLink
[63003.718461] usb 1-1: Manufacturer: STMicroelectronics
[63003.718462] usb 1-1: SerialNumber: 0671FF393731435243151045
[63003.779680] cdc_acm 1-1:1.2: ttyACM0: USB ACM device
[63134.402834] vhci_hcd: unlink->seqnum 274634
[63134.402838] vhci_hcd: urb->status -104
[63134.404555] vhci_hcd: unlink->seqnum 274636
[63134.404557] vhci_hcd: urb->status -104
[63186.528971] vhci_hcd: unlink->seqnum 279864
[63186.528973] vhci_hcd: urb->status -104
[63186.529739] vhci_hcd: unlink->seqnum 279865
[63186.529742] vhci_hcd: urb->status -104
[63186.530487] vhci_hcd: unlink->seqnum 279866
[63186.530489] vhci_hcd: the urb (seqnum 279866) was already given back
[63186.530490] vhci_hcd: unlink->seqnum 279867
[63186.530490] vhci_hcd: the urb (seqnum 279867) was already given back
[63186.530501] vhci_hcd: unlink->seqnum 279868
[63186.530501] vhci_hcd: the urb (seqnum 279868) was already given back
[63314.592310] vhci_hcd: connection closed
[63314.592450] vhci_hcd: stop threads
[63314.592454] vhci_hcd: release socket
[63314.592458] vhci_hcd: disconnect device
[63314.592509] usb 1-1: USB disconnect, device number 20
dorssel commented 3 years ago

Given the USBSUP_XFER_STALL -> EPIPE entries in the log, those messages (urb->status -104 and the urb (seqnum XXX) was already given back) are perfectly normal. They are Linux debug messages, not warnings or errors.

The question really is, why does the pipe stall so often in the first place? I cannot explain that.

The first few stalls are normal, for endpoint 0, during device setup. But the real communication is on endpoint 5. And the first weird thing is pending count = 17. That is a lot of pending requests. URB requests usually overlap (it is essentially multiple outstanding reads for performance), but usual numbers for pending count are in the order of 4 or 5, not 17. Then, at some point, those all get cancelled twice (!) Aborting endpoint 132 and Aborting endpoint 133, after which a burst of USBSUP_XFER_STALL is returned. I guess this is what leads to the NUL bytes you seem to receive.

Unfortunately, I have no further clues on what is going wrong here...

dorssel commented 2 years ago

@Bhav97 Could you please try PR #224, the installer is at https://github.com/dorssel/usbipd-win/actions/runs/1734739747. This may very well solve the problem.

bhav97 commented 2 years ago

Thanks for the update! Unfortunately it didn't work. I still have pending count = 16 in my logs and null bytes in the the serial log

usbip.log

dorssel commented 2 years ago

Thanks for testing. I'll keep trying...

bhav97 commented 2 years ago

Hi! There was an error from my side. While installing there was a conflict due to powertoys awake and I think the installation wasn't completed properly. I uninstalled and reinstalled and got some logs again.

No difference but here's the updated log file. Let me know if you need anything else usbip.log

dorssel commented 2 years ago

@bhav97 Another issue that may be related to yours was fixed in the current master build. Could you give https://github.com/dorssel/usbipd-win/actions/runs/1991685584 a try?

bhav97 commented 2 years ago

Hi @dorssel , thanks for the update I tried the 2d175 build, but I still receive the NULL bytes after the initial burst of data.

I'll collect some tshark logs as well

usbipd.log

Here's a tshark capture of attach, read data, detach usbipd.zip

dorssel commented 2 years ago

@bhav97 These last tshark captures are interesting.

1) At the start of the communication (endpoint = 5), there is 1 'x' character incoming, and 1 'x' character outgoing. This is the only outgoing character in the entire capture (packets 90-94) 2) Then a burst of "Helloxxx\a" bunched up in full packets of 32 bytes (packets 95-140). The last packet gets truncated at the 16-byte mark. 3) Then it is followed by the NUL bytes. Sometimes 1, sometimes up to 4 in a packet (packets 142-922). 4) Then some sort of a "line reset" with a new line coding request (packet 924-926) 5) Then a single interrupt on endpoint 4 (packet 927) 6) And finally from packet 963 the same message "Helloxxx\a", but this time 1 byte per packet. And this never ends, and there are no NUL characters.

To me this looks like: it is a custom firmware, where the device is supposed to spit out "Helloxxx\a" repeatedly. The packets quickly return 32 characters at a time, but the last one ends after 22 packets with a truncated packet at the 16-byte mark. Then NULs. Looks a lot like "buffer overrun" in the buffered serial output. Then a line state switch, after which the device returns the same message (apparently unbuffered this time, 1 character per read). And that goes fine.

I'm sorry, but I don't see anything wrong in the USBIP protocol, nor in the USB request, nor in usbipd-win handling those. Are you sure the device is programmed correctly? What happens on a bare-metal Linux?

bhav97 commented 2 years ago

Hi @dorssel , You're right the firmware simply prints out Helloxxx over UART. The onboard STlink serves the UART data over the VCOM.

It works on Ubuntu20.04 and Win11.

  1. On WSL there seems to be some sort of an echo (I sometimes see multiple outgoing characters, one at a time) vs nothing like that on windows
  2. On WSL there are USB frames with multiple characters of the string whereas on windows the data always is read one character at a time.

I'm not sure if this behavior is because of different handling of bulk requests on these two platforms so I'll try collecting tshark logs on Ubuntu as well for a more appropriate comparison.

Btw I use tio/minicom on WSL/Ubuntu and PuTTY on Windows. Here's the capture from the Windows host wincap.zip

dorssel commented 2 years ago

@bhav97 I think it may be timing. On native Windows, or bare Linux, the device driver is loaded as soon as the device boots. So the virtual COM port is polled immediately, and starts picking up the characters one-by-one immediately.

On WSL, the device is first booted, and then it takes a while before the attach happens and the Linux tty driver starts reading. In this time, it looks like the device is filling a buffer internally, because the USB requests haven't started yet. When the attach and Linux driver are finally loaded, the first USB requests start draining the buffer, returning 32 characters at a time.

Your WSL capture did show that after a while, also on WSL the characters come in 1 at a time, without the NULs. Can you confirm this?

My guess: on WSL the device needs to buffer the period from USB power up to 'attach + Linux driver load'. That goes somewhat OK (32 characters at a time for a while) but end in NULs because the device has overflowed its buffers. You don't see this on native Windows/bare Linux, because there is hardly any delay between power up and driver load.

dorssel commented 2 years ago

Root cause seems to be user firmware. Closing issue.