dirtyjtag / DirtyJTAG

JTAG probe firmware
MIT License
464 stars 70 forks source link

DJTAG2 protocol discussion #77

Closed jeanthom closed 1 year ago

jeanthom commented 3 years ago

In a discussion with @phdussud he suggested an evolution of the current USB protocol to allow for better performance. This GitHub issue is here to discuss how we should implement the next iteration of the DirtyJTAG USB protocol.

The current protocol is nicknamed "DJTAG1". You can actually query this with CMD_INFO (source code).

We should ideally make it retrocompatible by making DJTAG2 a superset of DJTAG1. Newer software should CMD_INFO to query whether or not they can use DJTAG2 commands or if they're stuck with DJTAG1.

Proposal 1: increase maximum transfer size

@phdussud wrote in #76:

One last thing is that I believe the biggest improvement that can be made after this is increasing the packet length from 32 to 64. This requires changing the cmd structure (2 bytes for the length), therefore clients will have to change. This is the reason I didn't do it.

Personnal comment: USB bulk transaction for full-speed devices (like DirtyJTAG) are limited to 64 bytes. For a transfer we would need:

That would push the maximum transfer size to 61 bytes.

Proposal 2: specify if transfers should return data

@phdussud wrote in #76:

Another simple substantial improvement: we can introduce a transfer that does not send back a USB packet. Most of the programming involves only a write operation to the device. I implemented it and my timing improves quite a bit. With the best clock rate (SPI) it takes 253us for each 30byte of transfer. Only 53us is actually spend on the Jtag bus. The rest is USB overhead. If I implement the transfer with no read, the time goes down to 172us. In my example, the fastest wall clock time for the end to end programming is 2.9sec with normal transfers and 2s with no read transfer This is with OpenFPGALoader which already knows when it does not need to read back.

Proposal 3: add bootloader support

Using DirtyJTAG with a bootloader is currently very limited. Adding a command for jumping to bootloader would allow seamless upgrades. For dapboot we have to set magic values into the RTC backup registers before resetting the MCU.

phdussud commented 3 years ago

For Proposal 1 and 2: May I propose something a little different: We use a full byte for the command ID and we have very few commands. As a result, the upper 4 bits of the command byte is always 0. I propose that we only support 16 commands coded in the lower 4 bits and use the upper 4 bits as a command modifiers. This way we don't have to duplicate code when the commands are very close like the ones we have been discussing in proposal 1 and 2. For the transfer command, I propose that the bit 0x10 means no read is necessary and 0x20 means the transfer is 32 bytes longer than the length coded in the transfer length.

phdussud commented 3 years ago

Sorry I got confused between bytes and bits for a moment. For the transfer command, I propose that the bit 0x10 means no read is necessary and 0x20 means the transfer is 256 bits longer than the length coded in the transfer length.

jeanthom commented 3 years ago

That sounds better. So we'd have something like this:

Internally (in DirtyJTAG firmware) everything would be implemented as a single transfer function in jtag.c and the command parsing code in cmd.c would take care of calling jtag_transfer() with the right parameters.

Does that sound alright to you?

phdussud commented 3 years ago

Exactly! Now that I think more about it, I would like NO_READ to be 0x80 and EXTEND_LENGTH to be 0x40. This way are not not locked in into 16 commands max, we can still use 64 commands. We can make the decision later to use yet other bits in the high 4 bits or leave them for more commands.

jeanthom commented 3 years ago

Yep that would be better for later.

Also I remembered that ST's genuine ST-Link adapter (the "white pod" one) has configurable I/O levels, we should add a command to take care of that.

jeanthom commented 3 years ago

Undergoing work on DJTAG2 is happening in the #78 PR.

phdussud commented 3 years ago

I have an implementation of the new protocol in another branch of my fork. However the EXTEND_LENGTH seems to have a problem. I also implemented the required code in OpenFPGALoader and when I pass larger packets I get a CRC failure at the end of programming. I would appreciate some eye balls on the larger packet changes. The rest works (no read, SPI support and no timer loops). See https://github.com/phdussud/DirtyJTAG/tree/spi Thank you very much! Patrick

phdussud commented 3 years ago

@jeanthom Sorry I didn't notice your work on most of the new commands attribute. I will pick up your code and integrate in my branch, as the code changes I have are largely about making the jtag_transfer command go faster.

phdussud commented 3 years ago

OK. I integrated jeanthom changes in my spi branch. I would prefer that we don't send back 64 bytes all the time in jtag_transfer, as it is more expensive.

phdussud commented 3 years ago

The issue with EXTEND_LENGTH and openFPGALoader was my fault, on the openFPGALoader side. It is now fixed and my tests pass. @jeanthom It seems to me that you probably want to tag the latest commit as being the last of version 1. Unless you want to create a branch for version 1 but fixes. My changes (in https://github.com/phdussud/DirtyJTAG/tree/spi ) will be for version 2. because it requires a change in the pinout for bluepill board to be the maximum performance. Thoughts?

phdussud commented 3 years ago

A note about performance: Before my first PR openFPGALoader and bluepill was taking 13-14s to load a bit file into a ulx3s board. With https://github.com/phdussud/DirtyJTAG/tree/spi it now takes 1.3s (10x faster). This is with changes to openFPGALoader to take advantage of the version 2. For reference, a FT232H will load the same bit file in 0.9s

phdussud commented 3 years ago

@jeanthom About CMD_SETSIG/CMD_GETSIG mentioned in #88. I didn't suggest changing or removing CMD_SETSIG/CMD_GETSIG. I was pointing out that the JTAG interface will ignore TMS, TDI pins levels unless they are sampled by a rising TCK. This lead me to think that accomplishing this with CMD_SETSIG isn't the most performant and does not lend itself to a good timing of the signal transitions with respect to TCK edges. It is even slower if we need to capture the state of TDO during the rising edge of TCK. I think that the idea of CMD_CLK with a modifier to capture TDO is an excellent idea. I would suggest that this modifier should be allowed only when the clock count is 1.

jeanthom commented 3 years ago

@phdussud What is your opinion on https://github.com/jeanthom/DirtyJTAG/commit/eb80bec9404f4927dab604aeeac783d3b53eecee?

phdussud commented 3 years ago

I think this is a good idea but it needs to be implemented in jtag_strobe because the TDO needs to be captured on the (last) rising edge of the TCK signal, not after the signal goes down. It is worth noting that the setsig/getsig/setsig way of doing the same thing assumes that the THOLD of TDO is infinite or super large. This is generally the case but it isn't guaranteed by the JTAG spec as far as I know. Have a good weekend, Patrick


From: Jean THOMAS @.> Sent: Sunday, April 4, 2021 8:02 AM To: jeanthom/DirtyJTAG @.> Cc: phdussud @.>; Mention @.> Subject: Re: [jeanthom/DirtyJTAG] DJTAG2 protocol discussion (#77)

@phdussudhttps://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fphdussud&data=04%7C01%7C%7Cddc10b6077974613da7e08d8f77aa198%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637531453354028250%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=mgY%2B%2BsOJQSb8ngppZTLAoX1Y9p6O%2F3Upm%2BTVPSc29iU%3D&reserved=0 What is your opinion on eb80bechttps://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fjeanthom%2FDirtyJTAG%2Fcommit%2Feb80bec9404f4927dab604aeeac783d3b53eecee&data=04%7C01%7C%7Cddc10b6077974613da7e08d8f77aa198%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637531453354038202%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=6ZugL7LJ7uwPMfI6a7tCJvVRQo5c0As%2BthPzEieC1vs%3D&reserved=0?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fjeanthom%2FDirtyJTAG%2Fissues%2F77%23issuecomment-813047552&data=04%7C01%7C%7Cddc10b6077974613da7e08d8f77aa198%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637531453354038202%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=iTpyYMmlZS3rTwYRZdANLPxSPJCRvfjbwV7sL4VbN%2Bk%3D&reserved=0, or unsubscribehttps://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FABVQKX2Q6R7M5HQUAOLVFQTTHB5PLANCNFSM4V6H5ZFQ&data=04%7C01%7C%7Cddc10b6077974613da7e08d8f77aa198%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637531453354048158%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=d51abI2DHVzTH3rzuUitP4HRcjfl%2FW20cT3xMTOxdZQ%3D&reserved=0.

jeanthom commented 3 years ago

I tried to fix this in https://github.com/jeanthom/DirtyJTAG/commit/70601274baa222aed99fed4327eee7327b826c7c

I wanted to avoid doing lots of checks inside the loop but I'm not sure that's the right way to do it. What do you think?

phdussud commented 3 years ago

I think this looks fine! Personally I wouldn't have considered the case of 0 clocks since per JTAG spec, the value of TDO shouldn't be sampled when the clock isn't on the rising edge. But I don't know what URJtag can do so it may be able to use this case. Thanks, Patrick


From: Jean THOMAS @.> Sent: Sunday, April 4, 2021 10:43 AM To: jeanthom/DirtyJTAG @.> Cc: phdussud @.>; Mention @.> Subject: Re: [jeanthom/DirtyJTAG] DJTAG2 protocol discussion (#77)

I tried to fix this in 7060127https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fjeanthom%2FDirtyJTAG%2Fcommit%2F70601274baa222aed99fed4327eee7327b826c7c&data=04%7C01%7C%7Cc4e22189dfb04219808f08d8f79126b2%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637531550076369185%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=ocpABumw1kuaAOsjRd1pQwqbxtS37uf0Dh2BTXxRY%2FI%3D&reserved=0

I wanted to avoid doing lots of checks inside the loop but I'm not sure that's the right way to do it. What do you think?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fjeanthom%2FDirtyJTAG%2Fissues%2F77%23issuecomment-813071524&data=04%7C01%7C%7Cc4e22189dfb04219808f08d8f79126b2%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637531550076369185%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=s8vjvG5KDuhEPbJWFCYYVhGcVI5iGfD6FGTeGJbQqmI%3D&reserved=0, or unsubscribehttps://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FABVQKX3U4475EDIEQEESABLTHCQL3ANCNFSM4V6H5ZFQ&data=04%7C01%7C%7Cc4e22189dfb04219808f08d8f79126b2%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637531550076379145%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=d9chZJceaZw7DqbzP7WGNw8YNOcEfY2A956NgC1JSEU%3D&reserved=0.

jeanthom commented 3 years ago

CMD_GOTOBOOTLOADER was tested OK on an Olimex STM32-H103 with dapboot as its bootloader.

phdussud commented 3 years ago

Great!


From: Jean THOMAS @.> Sent: Monday, April 5, 2021 4:11 AM To: jeanthom/DirtyJTAG @.> Cc: phdussud @.>; Mention @.> Subject: Re: [jeanthom/DirtyJTAG] DJTAG2 protocol discussion (#77)

CMD_GOTOBOOTLOADER was tested OK on an Olimex STM32-H103 with dapboot as its bootloader.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fjeanthom%2FDirtyJTAG%2Fissues%2F77%23issuecomment-813342724&data=04%7C01%7C%7Cf90c3ec921a94bf70f6b08d8f8238aca%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637532178816637685%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=SehwJeM9nFa9IQRZ14yGalxIylWkas7Yhr6pLTB9Bv4%3D&reserved=0, or unsubscribehttps://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FABVQKX3Z4L46MP6VGGUKQP3THGLFRANCNFSM4V6H5ZFQ&data=04%7C01%7C%7Cf90c3ec921a94bf70f6b08d8f8238aca%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637532178816637685%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=7Z1yoASMsAsEHjUE1w5oJaSNplLNKwWUeaHy%2FqMSyMw%3D&reserved=0.

phdussud commented 3 years ago

Do you have access to a Raspberry Pico board? If yes, let me know... I have a preliminary implementation of DJtag2 on it. It needs testing with other software than OpenFPGALoader.

phdussud commented 3 years ago

Another question... Why stop cmd processing when the command produces an output? If we didn't we could send a cmd request like CMD_SETSIG, CMD_SETSIG, CMD_GETSIG, CMD_SETSIG and it would be able to efficiently set values, generate a rising clock get TDO, and generate a falling clock. I may be missing something.

zoobab commented 3 years ago

I do have a picoboard, if you want it, shoot me an email at zoobab at Gmail.

jeanthom commented 3 years ago

Another question... Why stop cmd processing when the command produces an output? If we didn't we could send a cmd request like CMD_SETSIG, CMD_SETSIG, CMD_GETSIG, CMD_SETSIG and it would be able to efficiently set values, generate a rising clock get TDO, and generate a falling clock. I may be missing something.

I think this design choice finds its roots in my extreme laziness 😄 I wanted to avoid keeping track of which commands I sent were supposed to return data. With that said, I'm not exactly sure if this is really a feature that would boost performance? What do you think?

phdussud commented 3 years ago

Anything that cuts down the number of USB packets is a performance improvement. Ideally if we would be able to send close to 64 byte each time, it would give us maximum performance. It turns out that the READOUT extension is helping but we can't send anything else after it.... I would be ok if we sent one packet per response, which won't materially change the structure of programs taking advantage of it, since they already need to do a USB read per response.

zoobab commented 3 years ago

"Anything that cuts down the number of USB packets is a performance improvement."

Packet batching, that's what Zeromq is using for optimising TCP performance.

jeanthom commented 3 years ago

I would be ok if we sent one packet per response, which won't materially change the structure of programs taking advantage of it, since they already need to do a USB read per response.

Do you think we should introduce a command flag/command list header in DJTAG2 for packet buffering?

phdussud commented 3 years ago

My proposal would be to change the DJTAG2 protocol to allow adding commands after one that returns a value. There would be one USB IN packet per return. I am not sure if allowing batching of returned values into one USB IN packet makes a great deal of sense because many times, the host program needs to read the first value returned in order to issue other types of commands, beyond "epilogue" commands that are part of the JTAG state machine. I also would specify that you can't add a CMD after a CMD_XFER since the rest of the buffer is assumed to be data. If there are a lot of short transfers, we could lift this restriction, at the expense of a more complicated implementation on the device side. We can do all of this without a new flag or command as long as we can break compatibility with DJTAG2 (we assume that all DJTAG2 devices have the new capability).

jeanthom commented 3 years ago

DJTAG2 hasn't been merged into master yet, we still can do huge changes to DJTAG2 as long as we don't break compatibility with DJTAG1.

phdussud commented 3 years ago

Ok, then I propose this. CMD_XFER is terminating: The command buffer will not be read beyond the last value to transfer. The value returning commands are not terminating: The next command in the command buffer will be processed. This means that all command buffers except for the CMD_XFER need to be terminated by CMD_STOP. I verified that openFPGALoader already complies. I will implement this shortly for the PICO implementation of DJTAG