DeqingSun / ch55xduino

An Arduino-like programming API for the CH55X
GNU Lesser General Public License v2.1
433 stars 85 forks source link

CMSIS_DAP unreliable #30

Closed nerdralph closed 2 years ago

nerdralph commented 3 years ago

With a CH552 running at 16MHz and 3V3 supply, I was getting ACK failures using a STM32F0 target. When I attached my scope probes to the SWDCLK & SWDIO lines, it would work intermittently. When I looked at the capture, I could see the clock phase is wrong. It appears the code was copied from the DAPLink template, which itself is wrong. https://github.com/ARMmbed/DAPLink/issues/764

I fixed the SW_ macros in SW_DP.c, but it seems more bugs as I'm getting a transfer fault and the SWDIO is being left low instead of high.

DeqingSun commented 3 years ago

Thanks for reporting. That is good to know. Will investigate it.

Did you try 5V? The CH552 is configured in traditional OC mode. The current should not be able to destroy the 3.3V target even it is not 5V tolerance. (I know this is not a hacky way)

nerdralph commented 3 years ago

Yes, 5V/24MHz worked reasonably well with STM32F0, but not with HK32F0. As I mentioned in the DAPLink issue, the problem is the clock phase. The fact that it works with the STM32 just means it is more tolerant of SWD timings that are out of spec.

On Thu, Jan 7, 2021, 20:48 Deqing Sun notifications@github.com wrote:

Thanks for reporting. That is good to know. Will investigate it.

Did you try 5V? The CH552 is configured in traditional OC mode. The current should not be able to destroy the 3.3V target even it is not 5V tolerance. (I know this is not a hacky way)

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/DeqingSun/ch55xduino/issues/30#issuecomment-756476024, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABKNZ6TQ4TSJURB4RCO4EU3SYZI5BANCNFSM4VZHOUJA .

DeqingSun commented 3 years ago

Unfortunately I don't have STM32F0 or HK32F0. So I'm not able to test it now. However you mentioned 5V/24MHz works while 16MHz and 3V3 not working. I guess the rising speed of the signal might be an issue. If your CH552 and target are both on 3.3V, you may try to uncomment the P3_MOD_OC & and commend the P3_MOD_OC | to use Push-Pull output. That will help if it is a rising speed issue.

https://github.com/DeqingSun/ch55xduino/blob/f1eec880114039004b4686a06ac3fad402d9ae06/ch55xduino/ch55x/libraries/Generic_Examples/examples/05.USB/CMSIS_DAP/src/CMSIS_DAPusb/SW_DP.c#L49

nerdralph commented 3 years ago

Rise time is plenty fast at around 3-4ns. The default port drive mode drives the signal high for 2 cycles then switches to pullup. I confirmed with my scope. I think I'll rewrite the code to fix the clock phase and improve the speed as well, while moving it to a standalone sdcc project.

I'll update this issue with a link to the code in case you want to port it back to ch55xduino.

On Fri, Jan 8, 2021, 02:32 Deqing Sun notifications@github.com wrote:

Unfortunately I don't have STM32F0 or HK32F0. So I'm not able to test it now. However you mentioned 5V/24MHz works while 16MHz and 3V3 not working. I guess the rising speed of the signal might be an issue. If your CH552 and target are both on 3.3V, you may try to uncomment the P3_MOD_OC & and commend the P3_MOD_OC | to use Push-Pull output. That will help if it is a rising speed issue.

https://github.com/DeqingSun/ch55xduino/blob/f1eec880114039004b4686a06ac3fad402d9ae06/ch55xduino/ch55x/libraries/Generic_Examples/examples/05.USB/CMSIS_DAP/src/CMSIS_DAPusb/SW_DP.c#L49

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/DeqingSun/ch55xduino/issues/30#issuecomment-756577346, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABKNZ6R3WR7Q56UPTVKYKILSY2RHPANCNFSM4VZHOUJA .

DeqingSun commented 3 years ago

Porting the code to standard SDCC should be straightforward. As the CMSIS-DAP part has no dependency on Arduino lib.

I saw a person talked about how he did high speed cmsis-dap on CH558 Chinese link. Although he hasn't release the code, he talked about how he did it.

The CMSIS-DAP V2 avoids the 1ms delay in HID packets.

The 8-bit command and 32-bit data transfer can be optimized with SPI.

In order to accelerate USB speed, the DAP_PACKET_COUNT is set to 4, use USB DMA in single buffer mode, change USB buffer address in ISR, and use both DPTR and DPTR1 to move the data from usb buffer to memory.

nerdralph commented 3 years ago

It turns out part of my problem was some flaky dupont cables. To ensure proper compliance with the SWD physical layer spec, I've gone ahead and made the clock polarity fixes as well as changed SWCLK to push-pull mode. It's working reliably, and I have proceeded to cleaning up some of the code. There were a lot of warnings like:

left shifting more than size of object changed to zero

With the MCU running at 16Mhz, the current bit-bang code tops out at around 200kbps. I've started rewriting it in 8051 asm, which should increase the speed to ~1mbps. SPI could make it faster, but would still have to be mixed with bit-bang code for the ACK and for the parity + trail bits at the end of the data. The maximum flash write speed on many STM32 devices is ~40kB/s, and I think ~30kB/s should be possible even with DAP_PACKET_COUNT at 1.

DeqingSun commented 3 years ago

@nerdralph Great to hear that. If you are willing to share the code I'll see if I can put it back to CH55xduino.

nerdralph commented 3 years ago

I just pushed the first working snapshot: https://github.com/nerdralph/ch554_sdcc/tree/master/examples/CMSIS_DAP

There's still more of the code I want to streamline, and I have to finish the asm bit-bang code.

nerdralph commented 3 years ago

Have you done any testing with the CH552 running at 3V3? I've been encountering some strange problems where SWD works fine on P3.1 & P3.2, but gets an invalid ACK value (5) when using P1.7 & P1.6. I've tried two different CH552T chips - one with bootloader v2.31 & a newer one with bootloader 2.40. I want to make P1.7 & P1.6 the default pins so the firmware could be changed in the future to use hardware SPI for some of the communications.

DeqingSun commented 3 years ago

Have you done any testing with the CH552 running at 3V3? I've been encountering some strange problems where SWD works fine on P3.1 & P3.2, but gets an invalid ACK value (5) when using P1.7 & P1.6. I've tried two different CH552T chips - one with bootloader v2.31 & a newer one with bootloader 2.40. I want to make P1.7 & P1.6 the default pins so the firmware could be changed in the future to use hardware SPI for some of the communications.

Sorry I have no idea. I only tested the 3.3V CH552 can blink an LED. If you have more info I can help you to post it on the official forum and the staff may have an answer.

nerdralph commented 3 years ago

Have you done any testing with the CH552 running at 3V3? I've been encountering some strange problems where SWD works fine on P3.1 & P3.2, but gets an invalid ACK value (5) when using P1.7 & P1.6. I've tried two different CH552T chips - one with bootloader v2.31 & a newer one with bootloader 2.40. I want to make P1.7 & P1.6 the default pins so the firmware could be changed in the future to use hardware SPI for some of the communications.

Sorry I have no idea. I only tested the 3.3V CH552 can blink an LED. If you have more info I can help you to post it on the official forum and the staff may have an answer.

I'm getting close to figuring it out. I've suspected the SPI interface may be causing a conflict, but according to the datasheet, after reset, SPI0 should be disabled. Just today I tried setting SPI0_SETUP and SPI0_CTRL, and SWD on P1.7/P1.6 started working with my STM32F0. However it still doesn't work with my HK32F030M unless I switch to P3.1 & P3.2.

nerdralph commented 3 years ago

I finally figured it out. Reflections/ringing was randomly causing some clock signals to be counted as 2 rising edges by the target. The edges were so fast that I was only able to see the extent of the overshoot with my scope on 500Msps & 1Gsps mode. After adding 68Ohm series termination resistors all targets (STM32F0 & F1, GD32E, & HK32F) are working fine. 28AWG Dupont jumper wires probably have around 200Ohm impedance, so ideal series resistors would probably be >=150Ohm, but 68Ohm is what I tried first and it works. I may also try switching back to default output mode instead of push-pull, to see if I can make it work running at 5V with a target without 5V tolerant IO.

DeqingSun commented 3 years ago

@nerdralph Great you figured out. I checked the initial value of SPI0_SETUP and SPI0_CTRL are both 0. Is it necessary to be changed to another value?

I think the default output mode is not strong at all. If there is ESD diode on the target it should be fine to connect to 3.3V target. At least I tried Arduino Zero and it worked.

nerdralph commented 3 years ago

The SPI0 registers don't need to be changed from the default of 0 and 2. It was just a random fluke that it worked when I had tried that.

The default output mode is full push-pull for 2 cycles on a 0-to1 transition, then a pullup around 10k. See figure 10.2.1 in the datasheet. I've also confirmed it on my scope. When I use the default mode for SWCLK, with 5V power, the line goes to 5.2V for ~120ns, and then settles to ~4.6V. This is connected to a STM32F030, which has a pulldown of ~40kOhm on SWCLK. The SWD pins on the STM32F030 and GD32E230 are 5V tolerant, and spec'd to go to Vdd + 4V.

I don't think any of the HK32F030M GPIO are 5v tolerant, so with the target Vdd at 3V3, and a 0.6Vf for the ESD clamping diode, when the line is driven to 5V, significant current can flow to Vdd through the GPIO. I think the CH55x output drivers have an impedance of ~20Ohm, so with a 68Ohm series resistor there would be about 1.1V/90Ohm or 12-13mA.

If you are just using 15-20cm jumper wires to connect to the ARM target, I bet you'll see a lot of ringing if you hook up a scope to the SWD lines. Since the CH55x parts don't have slew rate control, I'd suggest series impedance resistors on any SPI-like communication lines. It's not a concern with UART communications since the bits get sampled in the middle of the bit after any ringing has dampened. Since SPI is edge triggered, if the ringing amplitude exceeds Vdd/2, it can bring the level below Vil, and cause 1 rising edge to be detected as 2.

DeqingSun commented 3 years ago

@nerdralph Thanks for mentioning the 2 cycles for transition. I never thought it carefully or observe with scope. And I'll add the info about the ringing into the comment of the sketch.

DeqingSun commented 3 years ago

@nerdralph The comment has been updated. If push-pull for 2 cycles is an issue, maybe just use OC mode with Pn_DIR_PU set to 0. And use external pull-up to 3.3V.

nerdralph commented 3 years ago

That's worth considering. I still have to test with higher value series resistors too.

On Mon, Jan 25, 2021, 21:21 Deqing Sun notifications@github.com wrote:

@nerdralph https://github.com/nerdralph The comment has been updated. If push-pull for 2 cycles is an issue, maybe just use OC mode with Pn_DIR_PU set to 0. And use external pull-up to 3.3V.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/DeqingSun/ch55xduino/issues/30#issuecomment-767218038, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABKNZ6T2C3BQLQZT4I7S3N3S3YKI7ANCNFSM4VZHOUJA .

nerdralph commented 3 years ago

The HK32F seems the most sensitive to signal quality issues, even though it is likely in full compliance with the ARM SWD-DP spec. The spec doesn't include any timing, and the pysical interface specs are very loose. DPv0 (0031A) didn't even specify the physical interface. DPv1 (0031C) includes the requirement for a ~100k pullup on SWDIO. It's only 0031E that clearly set out that the target clocks data on the rising edge of SWDCLK (B4.3.1). https://static.docs.arm.com/ihi0031/e/debug_interface_v5_2_architecture_specification_IHI0031E.pdf

Since there is no minimum interval specified for the SWDIO high and low cycles, a target can accept a 2ns clock cycle and still be compliant. Even worse, since the spec doesn't set any levels for Vil and Vih, the target does not require any hysteresis on the the SWD port. When the output impedance from the CH552 is too high (>1kOhm), I've seen coupling between the SWDIO and SWCLK lines cause a low blip in the middle of the rising SWCLK signal as the target HK32F0 drives SWDIO low to clock out data to the host. This low blip is read as another clock cycle, so 2 bits get clocked out by the target in one clock cycle.

I suppose something more sophisticated like clamping the SWD lines with zener diodes would work, but I have no more ideas for a simpler solution than using a 3V3 supply along with low value (50-150Ohm) series resistors. So now my priority will be some more code cleanup and optimization.

nerdralph commented 3 years ago

My version has been stable using 68Ohm series resistors. I've tested with several targets, and get about 6kB/s. There's more optimization I could do for performance, however what I plan to do first is add USB CDC to make it a more full-featured debug probe.

DeqingSun commented 3 years ago

@nerdralph Great that the series resistors works. Would you mind sharing a photo or a drawing about the termination resistors? It will help people with similar issue to get it work. Getting CDC together with the CMSIS-DAP would be take some time but straight forward. Just merge CDCinUserCode into CMSIS_DAP. Change the endpoints in CDC, descriptor and increase the --xram-loc.

nerdralph commented 3 years ago

It's just a resistor in series between the CH552 pin and the target SWD pin. For my testing I soldered female dupont jumper wires on the ends of resistors. When I'm done the firmware, I think I'll make a debug probe schematic + PCB.

For the CDC functionality, I think most of the work will be combining the descriptors. I already did some cleanup on the usb_device_cdc example code, and when I add the CDC endpoint, I plan to use the config descriptor struct from ch554_usb.h instead of a uint8_t array. https://github.com/nerdralph/ch554_sdcc/blob/master/examples/usb_device_cdc/main.c

nerdralph commented 3 years ago

I just noticed the response length from DAP_Thread was being ignored, so UEP1_T_LEN was always being set to 64. Fixing that increased the flash write speed from 8.9kB/s to 10.4kB/s on Linux with pyOCD. https://github.com/nerdralph/ch554_sdcc/blob/master/examples/CMSIS_DAP/main.c#L37 pyocd flash buck50.hex -e chip -t stm32f103c8

DeqingSun commented 3 years ago

There was a reason I don't quite remember. Maybe it causes some issue on Mac or Windows if the UEP1_T_LEN is not 64. Or it makes no difference whether using 64 or response_len.

nerdralph commented 3 years ago

There was a reason I don't quite remember. Maybe it causes some issue on Mac or Windows if the UEP1_T_LEN is not 64. Or it makes no difference whether using 64 or response_len.

I just tried Win7, and pyOCD hangs with UEP1_T_LEN != 64. :-(

nerdralph commented 3 years ago

Chris Reed at ARM recommend installing hidapi on Windows for pyOCD. In my testing that makes the flash write performance on Windows about equal to Linux. Today I added some code to use the 8051 hardware parity https://github.com/nerdralph/ch554_sdcc/blob/master/examples/CMSIS_DAP/SW_DP.c#L205 Now I'm getting 10.8kB/sec flashing a STM32F103.

nerdralph commented 3 years ago

It now works with DAP_PACKET_COUNT=2. https://github.com/nerdralph/ch554_sdcc/tree/master/examples/CMSIS_DAP

pyOCD reports a flash write speed of 13kB/s, though it's closer to 7kB/s when you factor the startup time of the python interpreter and pyOCD's initialization. openocd was about 20% faster than pyOCD for my tests.

nerdralph commented 3 years ago

I've simplified the HID report descriptor. It works fine on Win7 & Linux. Could you try it on Win10? https://github.com/nerdralph/ch554_sdcc/blob/master/examples/CMSIS_DAP/USBconstant.c#L73

DeqingSun commented 3 years ago

Sorry I don't have win10

nerdralph commented 3 years ago

I used the USB-IF HID descriptor parsing tool to ensure compatibility. https://github.com/pyocd/pyOCD/discussions/1088

I also finished cleaning up the configuration descriptor, and used structures instead of an unstructured array. https://github.com/nerdralph/ch554_sdcc/blob/master/examples/CMSIS_DAP/USBconstant.c#L75

I'm going to cleanup the usb_device_cdc example configuration descriptor as well, and then write the composite HID + CDC/ACM descriptor.

DeqingSun commented 3 years ago

Thanks for debugging it. I've never tested it with pyocd in Windows. I forgot where I copied this Descriptor.

https://github.com/nerdralph/ch554_sdcc/blob/master/examples/CMSIS_DAP/USBconstant.c#L136

It seems you get rid of the Usage Minimum/Usage Maximum, I'm not quite sure what it is for. It seems reasonable to me.

But you set the Logical maximum to 64 which is a bit confusing. I'm not saying you are wrong because I don't really understand it now. For my previous understanding the Logical maximum should be the largest possible value in the report. I do see some people in https://www.microchip.com/forums/m955579.aspx has the similar question.

https://github.com/hathach/tinyusb/blob/master/src/class/hid/hid_device.h#L344 and https://github.com/NicoHood/HID/blob/master/src/SingleReport/RawHID.cpp#L26 both use 0xFF for Logical maximum. Since you already has a setup that can trigger the issue, would you mind testing it with 0xFF?

nerdralph commented 3 years ago

But you set the Logical maximum to 64 which is a bit confusing. I'm not saying you are wrong because I don't really understand it now. For my previous understanding the Logical maximum should be the largest possible value in the report. I do see some people in https://www.microchip.com/forums/m955579.aspx has the similar question.

I think you're right. I'll put it back to 255. It worked with that value before, and when it worked with 64 I left it that way. " The Logical Minimum and Logical Maximum values identify the range of values that will be found in a report. If Logical Minimum and Logical Maximum are both positive values then a sign bit is unnecessary in the report field and the contents of a field can be assumed to be an unsigned value." https://www.usb.org/sites/default/files/hid1_11.pdf

DeqingSun commented 3 years ago

https://github.com/nerdralph/ch554_sdcc/blob/fbae536c0ffbfde9ce6e22eedc7c125a35c3a8cd/examples/CMSIS_DAP/USBconstant.c#L141

The Logical Minimum is 0x25, is it a typo?

nerdralph commented 3 years ago

https://github.com/nerdralph/ch554_sdcc/blob/fbae536c0ffbfde9ce6e22eedc7c125a35c3a8cd/examples/CMSIS_DAP/USBconstant.c#L141

The Logical Minimum is 0x25, is it a typo?

Good catch. Fixed. If ch554_usb.h had defines for HID report descriptors I could use, it would be harder to make this kind of mistake.