pyocd / pyOCD

Open source Python library for programming and debugging Arm Cortex-M microcontrollers
https://pyocd.io
Apache License 2.0

Implement HID for DAPLink/CMSIS-DAP v2 #876

Closed buzmeg closed 4 years ago

buzmeg commented 4 years ago

pyOCD currently limits v2 to bulk transfers. Bulk transfers are good for performance--if you are running USB Full Speed. For USB High Speed, I suspect that they are moot.

On Full Speed, frames are 1ms. Interrupt transfers are limited to 64 bytes every 1ms--that causes a significant bottleneck in terms of both latency and throughput. On Full Speed, bulk transfers can shovel multiple packets per 1ms frame. I'm unsure about latency, but I think it's still 1ms for turnaround (someone please correct me if I'm wrong).

On USB High Speed, as far as I can tell, this is completely different. Microframes are 125us and a high-bandwidth interrupt endpoint can put 3 packets in each 125us microframe. Consequently, interrupt transfers can reach roughly 196Mbit/sec (3 x 1024 bytes every 125us) and can have latencies as low as about 40us (at that point the SWD hardware lines are probably the bottleneck). In addition, interrupt packets can be up to 1024 bytes while bulk packets are 512 bytes max.
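The arithmetic behind these numbers can be checked with a short sketch. It assumes the USB 2.0 figures of 1 ms frames at Full Speed and 125 µs microframes at High Speed, counts payload bytes only, and ignores protocol overhead; the function name is made up for illustration.

```python
def interrupt_throughput_bits_per_s(packet_bytes, packets_per_interval, interval_us):
    """Peak payload throughput of one interrupt endpoint.

    Counts payload only; token/handshake overhead and bit stuffing are ignored.
    """
    intervals_per_second = 1_000_000 / interval_us
    return packet_bytes * packets_per_interval * 8 * intervals_per_second


# Full Speed: one 64-byte packet per 1 ms frame.
full_speed = interrupt_throughput_bits_per_s(64, 1, 1000)

# High Speed: three 1024-byte packets per 125 us microframe
# (assumes a high-bandwidth interrupt endpoint).
high_speed = interrupt_throughput_bits_per_s(1024, 3, 125)

print(f"Full Speed interrupt: {full_speed / 1e6:.3f} Mbit/s")
print(f"High Speed interrupt: {high_speed / 1e6:.3f} Mbit/s")
```

These are upper bounds for a single endpoint; a real probe will see less once command turnaround and SWD clocking are included.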

This means that the extra grief of running bulk transfers (queuing multiple packets, installing special drivers, extra bulk probe code when HID works fine, and special access mechanisms) probably isn't worth it if the system is running USB High Speed. HID transfers would be just fine there, and all the "performance benefits" of running bulk evaporate.

Additionally, I suspect that SWO would be happier with HID as that would guarantee that it gets serviced in a timely fashion which bulk does not guarantee.

If I'm missing something, please point it out. I'm far from a USB expert, so it's quite possible (and not unusual :) ) for me to be very wrong.

I popped this to its own bug to try to get some visibility. I realize that this is an enhancement, but I figure this is a good place to start. I expect if this requires actual DAPLink/CMSIS-DAP changes, then someone will point me at the appropriate person/place.

buzmeg commented 4 years ago

I'm going to close this out as I have hit a few quirks on the OS X side and I presume that they will pop up elsewhere.

HID drivers sometimes ignore packets that aren't exactly the size specified in the HID descriptor. So, if you specify 64 bytes as the report size and don't actually send 64 bytes, then the HID driver layer may ignore the packet. This is more problematic if the packet size is 1024 bytes, where you must always transfer 1024 bytes even if your request is only a couple of bytes.
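A common workaround is to pad every outgoing report to the full report size before handing it to the HID layer. The sketch below is a minimal illustration only: `pad_report` is a hypothetical helper, not a pyOCD function, and it assumes zero bytes are acceptable padding and that no report ID byte needs to be prepended.

```python
def pad_report(payload: bytes, report_size: int = 64) -> bytes:
    """Pad a command to exactly report_size bytes with trailing zeros.

    report_size is assumed to match the size declared in the HID descriptor.
    """
    if len(payload) > report_size:
        raise ValueError(
            f"payload is {len(payload)} bytes, report size is {report_size}"
        )
    return payload.ljust(report_size, b"\x00")


# A two-byte request still goes out as a full-size report.
small = pad_report(b"\x01\x02", report_size=64)
large = pad_report(b"\x01\x02", report_size=1024)
print(len(small), len(large))
```

With a 1024-byte report size, a two-byte request costs a full 1024-byte transfer, which is the overhead described above.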

This was particularly a problem on OS X as you cannot get the HID driver to relinquish the device and that's where the packet drop seems to occur.

flit commented 4 years ago

CMSIS-DAPv2 is only defined for bulk. The sole difference between v1 and v2 is the switch from HID to bulk endpoints.

Fwiw, DAPLink builds by default will include both v1 and v2 interfaces, assuming the hardware supports enough endpoints. This allows the host to choose whichever version it has drivers for. (This default can be changed with custom builds.)

Another major reason to use bulk is that HID allocates fixed bandwidth from the USB controller. This limits the number of devices using HID that can be connected to a single controller. Not a problem with a single debug probe, but I've seen issues where people are trying to connect a large number of devices to a single system for testing wireless nodes and run into big trouble.
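As a rough illustration of how quickly that reserved bandwidth runs out, here is a back-of-the-envelope sketch for Full Speed. Every number in it is an assumption: a 1500-byte frame (12 Mbit/s for 1 ms), 90% of the frame available to periodic transfers, 13 bytes of protocol overhead per interrupt transaction, and one IN plus one OUT 64-byte endpoint per probe, each polled every frame. Real limits also depend on the host controller and hub topology.

```python
def max_hid_probes(frame_bytes=1500, periodic_percent=90, packet_bytes=64,
                   overhead_bytes=13, endpoints_per_probe=2):
    """Estimate how many HID probes fit in one frame's periodic budget.

    All defaults are assumptions for a Full Speed bus; adjust to taste.
    """
    # Bytes per frame that periodic (interrupt/isochronous) transfers may reserve.
    budget = frame_bytes * periodic_percent // 100
    # Bytes reserved per probe, whether or not it has data to send.
    per_probe = endpoints_per_probe * (packet_bytes + overhead_bytes)
    return budget // per_probe


print(max_hid_probes())
```

Bulk endpoints reserve nothing up front, so they avoid this ceiling at the cost of no guaranteed service interval.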

But I do agree HID is generally easy to manage from a driver perspective. The management of bulk transfers shouldn't be too much of a concern, though. It's just as difficult, or more so, to manage multiple outstanding CMSIS-DAP HID transfers (see the complexity of the pyDAPAccess layer in pyocd). This is pretty much required to get reasonable performance with CMSIS-DAPv1 on full speed. Also, the issues with 1024-byte packets are a real pain—I still cannot get LPCLink-II to work reliably with any Python HID driver under Windows.

buzmeg commented 4 years ago

At this point, I think I would rather see a defined USB CDC ACM driver (I think I got all the acronyms right) than custom bulk drivers.

ACM uses bulk transfers and is now driver-installation free on Windows, OS X, and Linux, no? Is there still a good reason to use a custom bulk driver?

I'd also like to see an Ethernet profile for CMSIS-DAP. Then I could just use RNDIS or USB CDC ECM depending upon operating system. It would also let me use real Ethernet.

I understand that we always need cheap probes. But I'm more interested in being able to go UP the scale, really.

I have 64GB of microSD with 512MB of RAM and a 1GHz ARM running Linux for about $50 (PocketBeagles are about $25). I have a 200MHz RISC core (the PRU) on it dedicated solely to running the SWD interface. The signal integrity of the SWD link combined with the constant need to turn it around is my limiting factor.