mirage / mirage-profile

Collect profiling information
BSD 2-Clause "Simplified" License

future of tracing? #32

Closed hannesm closed 1 year ago

hannesm commented 5 years ago

Since mirage-xen 3.4.0 was released, I have observed CI failures in jobs where tracing is enabled (see https://travis-ci.org/mirage/mirage-skeleton/jobs/527375387#L3697). This reminded me of an earlier thread on mirageos-devel about how to use mirage-profile and tracing (see https://www.mail-archive.com/mirageos-devel@lists.xenproject.org/msg03023.html, which shows that the toolchain has bitrotted slightly); we then successfully used landmarks (on Unix) to get profiling information.

So, what is the desired future for mirage-profile and the mirage-trace-viewer? Is it used and useful (it looks to me as if the mirage/lwt fork on the tracing branch has also bitrotted slightly)? I, at least, didn't have much success in getting it set up and extracting data from it.

As mentioned above, for time profiling, landmarks is pretty nice (and it'd be great to be able to run it in standalone unikernels). For space profiling, I successfully used statmemprof (see https://github.com/hannesm/statmemprof-mirage); it looks like that will be upstreamed to OCaml sooner rather than later. For monitoring counters, the metrics library is pretty nice, and we should finish and deploy it.

TL;DR: is there interest in keeping this maintained, or should we drop support and remove the travis jobs that run the tracing?

talex5 commented 5 years ago

I've updated it a bit and successfully got traces from Unix and Xen using it (see #33).

But I suspect that getting it in good shape for the new Lwt would require a few weeks' work, and the trace viewer needs rescuing from oasis-hell.

Despite the name, mirage-profile is generally more useful for tracing than for profiling, and it's more related to Lwt than to Mirage.

hannesm commented 5 years ago

@talex5 thanks for your PRs and your reply. It is still unclear to me whether you have plans to upstream the Lwt tracing work to lwt (as people suggested, e.g. in https://github.com/ocsigen/lwt/issues/180), which would move that off the plate here.

What is also unclear to me is whether this repository, mirage-profile, would be needed for instrumenting lwt. If so, should it not be included in lwt+tracing? I suspect the only MirageOS-specific part would be how to communicate the gathered tracing information to the host system, is that right? I'd then also suggest renaming this repository and opam package to mirage-tracing instead of mirage-profile, and revising the documentation at https://mirage.io/wiki/profiling to be more specific and only talk about tracing.

As you mention, the trace-viewer needs some maintenance as well. Or are the traces viewable/analysable with other tools, apart from the trace-viewer?

Please don't get me wrong: I'm trying to understand the libraries and data formats involved (as well as the hooks required to set it up), to figure out how to move forward. Maybe we can find a unified way to also support statmemprof and landmarks, where the mirage-specific bits are (a) communication setup and (b) data exchange format and setup, so that we have good stories for both tracing and profiling in MirageOS unikernels. I'm also interested in using your tracing work with solo5. From what I can tell, mirage-logs has this trace ring buffer setup, and I'm curious whether it is actually used anywhere...

talex5 commented 5 years ago

Yes, ideally lwt+tracing and mirage-profile would become part of lwt itself, although I don't currently have any plans to do that myself. mirage-profile-unix would probably become part of lwt too so that people could use it easily, but we would still need mirage-profile-xen.

The traces can also be read by babeltrace, which can be useful for debugging the tracer, or maybe for converting to a different format (I haven't tried that).

mirage-logs has two separate features:

  1. A ring buffer for log messages. This allows you to log at debug level while only writing at info level to the console. If the unikernel raises an exception, it then dumps the contents of the ring buffer so you can see exactly what happened just before the crash. Writing to the Xen console at debug level all the time would be very slow because the console is rate-limited.

  2. Any message written to the console or log ring buffer is also written to the mirage-profile trace buffer. This means that log messages appear in traces, giving extra context.
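
To make the first feature concrete, here is a minimal sketch of the idea in plain OCaml. It is not the mirage-logs implementation and uses none of its API: the `level` type, `create`, `log`, `contents`, `with_dump_on_exception` and the `console` callback are all hypothetical names invented for illustration. It only shows the pattern described above: every message is kept in a fixed-size ring, only messages at or above a threshold reach the console, and the ring is dumped if the wrapped function raises.

```ocaml
(* Illustrative sketch only; names and types are assumptions,
   not the mirage-logs API. *)
type level = Debug | Info | Warn

let severity = function Debug -> 0 | Info -> 1 | Warn -> 2

type t = {
  slots : string option array;  (* fixed-size storage for recent messages *)
  mutable next : int;           (* index of the slot to overwrite next *)
  console_level : level;        (* minimum level written to the console *)
  console : string -> unit;     (* hypothetical console writer *)
}

let create ~size ~console_level ~console =
  if size <= 0 then invalid_arg "create: size must be positive";
  { slots = Array.make size None; next = 0; console_level; console }

(* Record every message in the ring, overwriting the oldest one when full;
   forward it to the console only if it meets the console threshold. *)
let log t level msg =
  t.slots.(t.next) <- Some msg;
  t.next <- (t.next + 1) mod Array.length t.slots;
  if severity level >= severity t.console_level then t.console msg

(* Return the buffered messages, oldest first. *)
let contents t =
  let n = Array.length t.slots in
  List.filter_map (fun i -> t.slots.((t.next + i) mod n)) (List.init n Fun.id)

(* Run [f]; if it raises, dump the ring to the console and re-raise. *)
let with_dump_on_exception t f =
  try f ()
  with e ->
    t.console "--- recent log messages ---";
    List.iter t.console (contents t);
    raise e
```

With `~console_level:Info`, a call such as `log t Debug "..."` costs only an array write in normal operation, and the debug messages show up on the console only in the dump after a failure, which is the point of avoiding constant debug-level output on a rate-limited console.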

hannesm commented 1 year ago

4 years later, anyone against archiving this repository?

hannesm commented 1 year ago

I'll just close this issue, nothing to see here.