orocos-toolchain / rtt

Orocos Real-Time Toolkit
http://www.orocos.org

add tracing support using lttng-ust #187

Closed doudou closed 5 years ago

doudou commented 7 years ago

This adds basic infrastructure and the most useful trace points (in/out of hooks, read/write ports). It's actually a very old branch, initiated by Paul Chavent, that I revived two or three times over the years. I've used it extensively over the last few weeks, and I'd like to finally have it in mainline.

Even if the support is compiled in, it is not enabled unless (1) the tracing library is LD_PRELOAD'ed into the RTT binary and (2) lttng itself is enabled. Point (1) is critical: it keeps the support free of cost for anyone who has it built but does not use it.

To avoid memory allocations in RTT itself on the tracing path (which is a fairly hot path), I added C-string versions of the names of some objects (ports, tasks), which are passed to lttng when tracing. I have to admit that I don't know what lttng's own memory allocation profile is, though.
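To illustrate the C-string idea, here is a minimal sketch. It is not the code from this PR: `NamedObject`, `getCName()` and `trace_name()` are hypothetical names chosen for the example. The point is only that returning a `std::string` by value may allocate on every call, whereas handing out a pointer into storage the object already owns does not.

```cpp
#include <cstring>
#include <string>

// Hypothetical stand-in for an RTT object that has a name (port, task).
class NamedObject {
public:
    explicit NamedObject(const std::string& name) : name_(name) {}

    // Returns a copy: may allocate on every call, so it is a poor fit
    // for a hot tracing path.
    std::string getName() const { return name_; }

    // Returns a pointer into storage owned by this object: no allocation.
    // The pointer stays valid as long as the object lives and name_ is
    // not modified.
    const char* getCName() const { return name_.c_str(); }

private:
    std::string name_;
};

// Hypothetical trace sink taking a C string, as a C tracing API would.
// It only measures the length so that the sketch stays self-contained.
inline std::size_t trace_name(const char* name) { return std::strlen(name); }
```

A trace point would then call something like `trace_name(port.getCName())`. What the tracer does with the string afterwards (copying it into a ring buffer, for instance) is outside this sketch.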

snrkiwi commented 7 years ago

Interesting work Sylvain.

1) Suggest making this a feature that you ENABLE, rather than one you do NOT DISABLE.

2) Out of curiosity, why are some definitions outside the multi-include guard in rtt/os/gnulinux/traces/lttng_ust.h?

3) How does this differ from existing facilities like kernelshark and ftrace? We've used these extensively to see what is going on within a deployment.

4) The need to have cName is a bit of a PITA, but understandable I think.

Whatever you're tracing with this it doesn't look like a lot of fun ......

doudou commented 7 years ago

> Suggest making this a feature that you ENABLE, rather than one you do NOT DISABLE.

I was expecting a comment like that. I'll do it!

> Out of curiosity, why are some definitions outside the multi-include guard in rtt/os/gnulinux/traces/lttng_ust.h?

It's lttng mumbo-jumbo. It basically ensures that the probes are generated in the tracing library, while the rest of RTT only sees "empty hooks" that are then replaced through LD_PRELOAD at runtime.
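For readers unfamiliar with the pattern, the general idea of an "empty hook" that a preloaded library fills in can be sketched as follows. This is an illustration of the concept only, not lttng-ust's actual mechanism and not the code in this PR; `ProbeFn`, `register_probe()`, `trace_hook_enter()` and `recording_probe()` are all hypothetical names.

```cpp
#include <atomic>
#include <string>

// Signature of a probe; hypothetical, chosen for this sketch only.
using ProbeFn = void (*)(const char* task_name);

// Slot consulted by the trace point. Null means "no tracing library loaded".
static std::atomic<ProbeFn> g_probe{nullptr};

// In a real setup this would be called from a constructor inside the
// preloaded tracing library; in this sketch the caller invokes it directly.
extern "C" void register_probe(ProbeFn fn) {
    g_probe.store(fn, std::memory_order_release);
}

// The "empty hook": when nothing is registered, it costs one atomic load
// and one branch.
inline void trace_hook_enter(const char* task_name) {
    if (ProbeFn fn = g_probe.load(std::memory_order_acquire)) {
        fn(task_name);
    }
}

// Example probe that records what it was given. It allocates, which a
// real probe on a hot path would want to avoid; it is for demonstration only.
static std::string g_last_traced;
static void recording_probe(const char* task_name) { g_last_traced = task_name; }
```

Calling `trace_hook_enter("task_a")` before any registration does nothing. After `register_probe(recording_probe)`, the same call forwards the name to the probe.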

> How does this differ from existing facilities like kernelshark and ftrace? We've used these extensively to see what is going on within a deployment.

I honestly don't know. The RTT-lttng work is already a few years old, so it was very low-cost for me to refresh it. With the very limited and theoretical knowledge I have of kernelshark/ftrace, I would guess that the biggest difference is that lttng-ust is really geared for userspace and allows one to define custom events, so that one does not need to know about RTT's internals (resolving function names, ...) to use the tracer. It also has ready-to-use generic userspace and kernel tracing (scheduling, memory allocation, ...) that you can mix with your custom events. After some quick googling, the consensus seems to be that lttng is great for userspace, while ftrace is designed more for kernel tracing. Now, I take it with a grain of salt; it's really a quick glance at the blogosphere's opinion on the subject ;-)

Does ftrace generate CTF traces?

> Whatever you're tracing with this it doesn't look like a lot of fun ......

It was https://github.com/orocos-toolchain/rtt/pull/182 ... So not really, no ...

Now, it did force me to add relevant support to our system manager (Syskit), which now also generates Trace Compass-compatible traces (CTF). This is now pretty awesome.

doudou commented 7 years ago

@meyerj I've implemented your suggestions regarding the naming on TaskCore. If you prefer, I can hold on to this PR until I have a better idea about the C-string thing (but that's going to take some time ...)

doudou commented 6 years ago

Ping

doudou commented 6 years ago

Thanks @meyerj. Your suggestions sound good. I'll have a go at them when I have the time.

The more tracing points, the merrier. So, :+1: to adding more, but I would ask for your help there, since you know this part of the codebase much better than I do. I would prefer keeping the existing configure/start/stop/cleanup hooks, because even if they're not critical for realtime analysis, they are still useful for multiprocess race analysis.

francisco-miguel-almeida commented 5 years ago

I could successfully trace the execution of a unit test using this patch. It might require some documentation on how to use it (with LD_PRELOAD).

The feature does indeed need to be documented. Some explanation will definitely need to be added to the changelog for the release in which LTTng support is introduced.

LGTM after the following minor comments have been addressed.

Removed the variadics from the macros, confirmed on my end as well. This (along with the error message typo) being the only comment that needed to be addressed, I can now merge to master.