orbcode / orbtrace

Debug and parallel trace hardware for CORTEX-M (FPGA + support code)

CMSIS-DAP trace commands proposal #3

Open flit opened 2 years ago

flit commented 2 years ago

Hi there! 👋🏽

You should know about the proposal to add trace command support to CMSIS-DAP. The proposal is from @gzied of Trande UG; it was first raised, and is documented, in ARMmbed/DAPLink#781. There is also a gdb extension that has been proposed on the Linaro mailing list. The CMSIS team are aware of this proposal.

The proposals are attached: DAP_TPIU_proposal.pdf gdb_remote_protocol_extensions_etm_btrace.pdf

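Other links:

Detailed memo to Linaro CoreSight mailing list: https://lists.linaro.org/pipermail/coresight/2019-July/003021.html

Announcement on Linaro CoreSight mailing list: https://lists.linaro.org/pipermail/coresight/2020-February/003676.html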

The gdb extension code is in gzied/binutils-gdb.

I'd very much like to see a standard come out of this collection of works, and not have multiple incompatible trace extensions arise. I don't know what your plans are for the host protocol, but hopefully we can pull together and produce a common interface that builds on the wide reach and compatibility of CMSIS-DAP. 🚀

Fyi, I'm not associated with Trande UG. I'm just a maintainer of DAPLink and pyOCD, and an employee of Arm. I often represent open source community interests with the CMSIS team at Arm. (I'm not a member of the CMSIS team; I'm a systems researcher in Arm Research).

If you'd like, feel free to send me an email (check my GH account info) and we can talk offline, too.

cc @mbrossard

flit commented 2 years ago

(After reading your docs site a bit…)

I should mention that I have been planning a proposal to add power and voltage controls to CMSIS-DAP. I'll keep you informed.

mubes commented 2 years ago

Chris (et al.),

We're in communication with Zied and he's doing some really cool stuff with the gdb integration in particular.

ORBTrace gateware is very modular and you can add and remove TRACE, CMSIS-DAP, POWER and other functionality (e.g. CPU) at will.

At the moment trace widths and voltages are set via control transfers to the endpoints associated with TRACE and POWER respectively. I doubt there would be much problem extending this to also support messages that arrive via a CMSIS-DAP interface. Specifically, that would cover the Port Sizes, Trace Buffer Sizes and Transport.
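To make the control-transfer idea concrete, here is a minimal sketch of how a host might build the 8-byte USB SETUP packet for such a request. Only the SETUP packet layout is standard USB. The request codes `REQ_SET_TRACE_WIDTH` and `REQ_SET_VOLTAGE_MV`, the interface numbers, the choice of an interface recipient, and the use of `wValue` to carry the setting are all hypothetical placeholders, not the actual ORBTrace protocol.

```python
import struct

# Hypothetical values for illustration only; not taken from ORBTrace.
REQ_SET_TRACE_WIDTH = 0x01
REQ_SET_VOLTAGE_MV = 0x02
TRACE_INTERFACE = 2
POWER_INTERFACE = 3

# bmRequestType: host-to-device (bit 7 = 0), vendor type (bits 6:5 = 0b10),
# recipient = interface (bits 4:0 = 1).
VENDOR_OUT_INTERFACE = 0x41


def setup_packet(request, value, interface, length=0):
    """Pack a USB SETUP packet: bmRequestType, bRequest, wValue, wIndex, wLength."""
    return struct.pack("<BBHHH", VENDOR_OUT_INTERFACE, request, value, interface, length)


# A 4-bit trace port on the (assumed) TRACE interface, and a 3300 mV target
# supply on the (assumed) POWER interface.
width_request = setup_packet(REQ_SET_TRACE_WIDTH, 4, TRACE_INTERFACE)
voltage_request = setup_packet(REQ_SET_VOLTAGE_MV, 3300, POWER_INTERFACE)
```

A CMSIS-DAP trace command could presumably carry the same settings in its own payload, with the gateware acting on them in the same way regardless of which path they arrived by.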

We are a little concerned about the TRACE data flowing over the CMSIS-DAP interface. Placing TRACE data on a separate interface allows it to be terminated at a different handler on the host; putting everything on a single interface precludes that possibility. Also, on the generation side, if it's on the same interface it would probably have to go via the same processing entity that is handling the rest of the CMSIS-DAP flows. That is quite a big ask for a stream that can easily be 400Mb/s. There's obviously no reason why a single host entity can't open multiple interfaces, so a separate interface just adds flexibility. For example, we are looking to remove the TPIU framing on the FPGA so that ITM and ETM flows can receive differentiated management (it also reduces the bandwidth requirement, as padding is removed).

For now we are concentrating on getting the first release stable, and then we will look at changes like this. We're very open to whatever works best and are clear that, at least in the short term, this is a developing project, so folks will be expecting changes and extensions as time goes on... we just need to be careful not to remove any functionality.

Regards

DAVE & VEGARD


flit commented 2 years ago

That's great you're already talking with Zied.

I certainly understand focusing on getting a first stable release. Looking forward to seeing the end result!

Regarding the trace endpoint and interface, that makes sense and is the kind of feedback we need.

We may want to support both cases of TPIU framing included and not. As you probably know, Zied wants to completely avoid software intervention and data massaging on the probe side, so the FPGA fabric can directly perform USB DMA transfers. While you could do that in hardware, it would add complexity, gate count, and perhaps latency. Whereas the host generally has plenty of spare cycles for that kind of thing even at USB3 speeds. It might be easier and more flexible to have a host-side process that receives the raw data, de-frames and serves separate streams over TCP.
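As a rough illustration of what such a host-side de-framing step involves, here is a minimal sketch of a decoder for the 16-byte CoreSight formatter frames that a TPIU emits. It reflects my reading of the formatter protocol (even bytes are either an ID change or data with the low bit held in the final auxiliary byte; odd bytes are always data; ID 0 is the null source) and should be checked against the CoreSight architecture specification. It assumes the input is already frame-aligned with synchronisation packets removed, and the names `decode_frame` and `demux` are made up for this example.

```python
from collections import defaultdict

FRAME_SIZE = 16


def decode_frame(frame, current_id=0):
    """Decode one 16-byte formatter frame into (source_id, byte) pairs."""
    if len(frame) != FRAME_SIZE:
        raise ValueError("a formatter frame is exactly 16 bytes")
    aux = frame[15]
    pending_id = None  # ID change that takes effect after the next data byte
    out = []
    for i in range(15):
        byte = frame[i]
        if i % 2 == 0:
            aux_bit = (aux >> (i // 2)) & 1
            if byte & 1:
                # ID byte: bits 7:1 carry the new source ID. The aux bit says
                # whether the change is immediate (0) or delayed by one byte (1).
                if aux_bit:
                    pending_id = byte >> 1
                else:
                    current_id = byte >> 1
            else:
                # Data byte whose least significant bit lives in the aux byte.
                out.append((current_id, byte | aux_bit))
        else:
            out.append((current_id, byte))
            if pending_id is not None:
                current_id, pending_id = pending_id, None
    if pending_id is not None:
        current_id = pending_id
    return out, current_id


def demux(stream):
    """Split a frame-aligned byte stream into per-source bytes, dropping ID 0."""
    streams = defaultdict(bytearray)
    current_id = 0
    for offset in range(0, len(stream) - FRAME_SIZE + 1, FRAME_SIZE):
        pairs, current_id = decode_frame(stream[offset:offset + FRAME_SIZE], current_id)
        for source_id, byte in pairs:
            if source_id != 0:  # null source, used for padding
                streams[source_id].append(byte)
    return {source_id: bytes(data) for source_id, data in streams.items()}


# One hand-built frame: four bytes for source 1, two for source 2, rest padding.
example = bytes([0x03, 0xAA, 0x54, 0xBB, 0x05, 0xCC, 0x10, 0x11,
                 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x06])
per_source = demux(example)
```

Each per-source byte string could then be served on its own TCP port. In this hand-built frame only 6 of the 16 bytes are payload, which shows how padding and framing overhead can eat into link bandwidth.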

Btw, I'm keen to add support for trace to pyocd as soon as feasible and time permits.

Cheers -c

mubes commented 2 years ago

> We may want to support both cases of TPIU framing included and not. As you probably know, Zied wants to completely avoid software intervention and data massaging on the probe side, so the FPGA fabric can directly perform USB DMA transfers. While you could do that in hardware, it would add complexity, gate count, and perhaps latency. Whereas the host generally has plenty of spare cycles for that kind of thing even at USB3 speeds. It might be easier and more flexible to have a host-side process that receives the raw data, de-frames and serves separate streams over TCP.

Our current implementation is completely gateware (including CMSIS-DAP), there is no CPU intervention at all....there is a VexRISCV optionally in the build, but we don't actually use that at the moment.

The TPIU deframing will also be done directly in gateware, there's no need for a CPU to do that job. On the host the de-framing does take a significant amount of effort and, more importantly, frame padding can add up to around 40% wasted bandwidth on the link, so we want to avoid that if possible.
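For a sense of scale, the arithmetic can be sketched as below. It combines the 400Mb/s stream rate mentioned earlier in the thread with the roughly 40% worst-case padding figure; both are illustrative numbers rather than measurements, and `payload_mbyte_per_s` is a name invented for this example.

```python
def payload_mbyte_per_s(link_mbit_per_s, wasted_fraction):
    """Useful trace payload in MByte/s once padding overhead is removed."""
    if not 0.0 <= wasted_fraction < 1.0:
        raise ValueError("wasted_fraction must be in [0, 1)")
    return link_mbit_per_s * (1.0 - wasted_fraction) / 8.0


raw = payload_mbyte_per_s(400, 0.0)      # whole link, no padding
useful = payload_mbyte_per_s(400, 0.40)  # worst case quoted above
```

Under these assumptions a 400Mb/s link (50 MByte/s) would carry only about 30 MByte/s of real trace data, so stripping the padding before the USB link recovers a substantial share of the bandwidth.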

> Btw, I'm keen to add support for trace to pyocd as soon as feasible and time permits.

It would probably be a good idea to focus on TRACE instigation and control rather than the handling of the flow....that can be easily done by Orbuculum and friends and, to be fair, python isn't the best environment for handling 40MByte/sec streams :-)

DAVE

gzied commented 2 years ago

Hi Dave, Chris, the proposals brought to the community, whether to extend the CMSIS-DAP interface, OpenOCD, gdb or Eclipse, consider different use cases and implementations and are meant to be a starting point for an industry standard.

@Dave please start checking the proposal pointed out by Chris, it is there to be reviewed, discussed and amended.

Port size, trace buffer, etc. are already part of the proposal. Other aspects like frame sync timeout, power level, etc. can be added. Power level is usually buried down in hardware/firmware, either using active sensing or delegated to components that do this automatically. It is, for example, not part of the CMSIS-DAP interface for debugging, probably for such reasons. It is not tracing-specific and should be done at the upper level.

It is very attractive to do the deframing in hardware and, as we discussed, I did not consider it only because it may be in a grey zone of some existing patents. If you do the deframing in hardware, you have to report the correct parameters to the consumer of trace data. This info comes from the registers of the TPIU, and OpenOCD/pyOCD needs to collect and report them. You will then need to change the values retrieved from those registers by the debugger driver to let the consumer of the data know the actual formatting in place.
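One way to picture that adjustment is the sketch below. It assumes the relevant setting is the continuous-formatting enable bit (EnFCont, bit 1) of the TPIU Formatter and Flush Control Register as found on Cortex-M parts. The function `reported_ffcr` and the idea of masking the bit in the debugger driver are illustrative only and are not part of the proposal.

```python
# Assumed bit position: EnFCont (continuous formatting) in TPIU_FFCR.
TPIU_FFCR_ENFCONT = 1 << 1


def reported_ffcr(raw_ffcr, deframed_by_probe):
    """Return the FFCR value the trace consumer should see.

    If the probe strips the formatter framing in hardware, the consumer
    receives unformatted data, so the formatting-enabled bit is cleared in
    the reported value even though it is still set in the target's TPIU.
    """
    if deframed_by_probe:
        return raw_ffcr & ~TPIU_FFCR_ENFCONT
    return raw_ffcr


target_value = 0x102  # example register value with EnFCont set
seen_by_consumer = reported_ffcr(target_value, deframed_by_probe=True)
```

A real design would probably also need to tell the consumer which source ID each de-framed stream came from; that part is left out here.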

/Zied

flit commented 2 years ago

> Our current implementation is completely gateware (including CMSIS-DAP), there is no CPU intervention at all....there is a VexRISCV optionally in the build, but we don't actually use that at the moment.

That's really cool!!

> python isn't the best environment for handling 40MByte/sec streams :-)

It depends on what you're actually doing and how much is done in native (C/Rust/etc.) extensions. There are two use cases. The major one is passing trace data to gdb or a tool like Orbuculum, where very little would be done in Python, so performance should be OK (remains to be proven, of course). The other is processing trace data directly with the Python API and commands. The latter would likely be processing smaller chunks of trace data, and could be built on native extensions if performance isn't acceptable, while still providing a convenient and simple Python API.

mubes commented 2 years ago

Chris,

I'm not going to get into the discussion about when python is appropriate and when it isn't...it's certainly not my favourite development language but I've seen some incredible things done with it, in very quick time, so it would be extremely difficult to criticize it!

In any case, all of the orb decoding stack is built to be lib-rified so it can be used as high speed extensions for interpreted languages. We're not there yet, but this whole ecosystem is being built to be extensible and pluggable....for way too long the silicon has been able to do things that no reasonably priced tool has been able to access. Not only that, but there are a lot of additional cool things that can be done with these information flows that haven't really been investigated yet because the toolchain APIs have been closed, or expensive.

We're trying to develop something that is amenable to as many people as possible, with active community support. Zied is aiming one 'tier higher' than that, with a proper commercial product and a company behind it that can support users; the two approaches should be complementary.

Regards

Dave


flit commented 2 years ago

> I'm not going to get into the discussion about when python is appropriate and when it isn't.

Sorry, didn't mean to pry open that can of worms! 😄

> the orb decoding stack is built to be lib-rified

Good to hear, that's just what I was hoping for.

> for way too long the silicon has been able to do things that no reasonably priced tool has been able to access

Right on the mark!

That's part of my interest in expanding CMSIS-DAP in this regard. The availability of CMSIS-DAP and its wide support in tools has made possible the extensive ecosystem of open source debug probes. Stepping that up to support trace can only benefit the whole embedded developer community.

> lot of additional cool things that can be done

Completely agreed. That is one of the ideas behind pyocd, too.

Overall, it sounds like all of our projects fundamentally have the same motivation and goals.

zyp commented 2 years ago

The Orbtrace gateware is designed to be modular and flexible, so I see no reason we wouldn't support a CMSIS-DAP trace extension. On the other hand, we naturally also want to keep the current interface, and the gateware architecture makes it easy to build it with support for either or both.

For perspective: CMSIS-DAP support for debug is also an optional feature of the gateware. I'm personally a big fan of having a gdbserver running on the debug unit itself so that gdb can connect directly rather than requiring additional middleware on the host. We've got the modularity to support either, although the on-probe gdbserver is not implemented yet.

flit commented 2 years ago

Sounds good!

From my experience, having the gdbserver on the debug unit makes for greater complexity and more difficulty for managing target support (especially flash algos and "special" debug logic). Keeping it on the host also means the debug unit firmware can be more stable and simple. (Although DAPLink has USB MSD flash programming, which adds most of that complexity back in…) But that's just a personal view. 😉

mubes commented 2 years ago

From the point of view of what we're trying to do with the Orb tools, and ORBTrace in particular, both options are equally valid...we're just trying to spook the horse to get it moving...what direction it goes in after that we're not too worried, as long as it's vaguely a forward one!


zyp commented 2 years ago

> From my experience, having the gdbserver on the debug unit makes for greater complexity and more difficulty for managing target support (especially flash algos and "special" debug logic).

Yeah, I agree it's a two-edged sword, which is why I'd like to have both the gdbserver option that'll cover 90% of my use cases and the CMSIS-DAP option that's able to be used with whatever host middleware that's able to provide the remaining 10%. 😉