Closed ineiti closed 4 years ago
TODO (please add others):
AddField
is not suited for structures - writing an Add
method that stores everything in a given namespaceAs I understood it, tracing is quite different from logging. Logging show status/progress at each step, tracing try to measure a span of execution.
Yes, and that's the whole point of this PR: convert the current logging into a tracing, using the fact that golang's stack-traces are detailed enough to separate the different traces.
So whenever a log
comes in, the Trace
structure looks it up and checks whether that trace already exists. Then it inserts it correctly in the tree, simulating calls from methods to another. This is why it needs to keep track of context.Context
.
The alternative is to actively rewrite onet and cothority to use a tracing framework...
So whenever a
log
comes in, theTrace
structure looks it up and checks whether that trace already exists. Then it inserts it correctly in the tree, simulating calls from methods to another. This is why it needs to keep track ofcontext.Context
.
I might be wrong here but IMHO logs are something which are isolated to a single node and traces "span" across multiple nodes. Case in point, the hierarchy in the example at https://www.jaegertracing.io/img/trace-detail-ss.png.
IIRC, tracing libraries usually keep the span info in a context.Context
and propagate it for all network requests. I'm not sure if we're doing the same here because getTraceSpan
is only looking at the method signature. Am I missing something?
In other words, how would a message being handled by tree node at level N know the span info about its parent at level N - 1?
In other words, how would a message being handled by tree node at level N know the span info about its parent at level N - 1?
That's the limit of the solution proposed here. Theoretically one could push it further and use the protocol-ID to merge traces. But then I think it would be difficult to find a frontend that can visualize a BFT-BLSCosi signature. Or, even worse, a full trace starting from a ClientTranaction
request to the final block...
So the scope of this PR is to have a better log-aggregation that goes somewhat towards tracing. The traceWrapper
structure keeps track of the 'simulated' context.Context
structure. What I'm currently struggling with are the go-routines, as they are difficult to track through the string returned by runtime.Stack
.
OK, without adding extra information in the code this will be very difficult. I tried to get an automatic detection of which go-routines go together, but this is not something that go supports. So you have to scrape it together using runtime.Stack
and stuff like that, which makes it awful.
I think I'll add a log.TraceID
anyway where you can set an ID, if you have one. Currently the only ID that is available is RoundID
from the onet.Token
, which is used in protocols. But it is not ideal - the best would be of course to replace all the log.
messages with real tracing messages. But that looks like a lot of work :(
log.TraceID
is added and seems to work nicely. Now I can for example follow a blscosi
signature through the different phases. It's not too pretty, but not too bad, neither.
It's a bit finnicky to set up with putting the log.TraceID
in the right places.
OK, this looks somewhat usable. I'm waiting for @tharvik and @cgrigis to play around with honeycomb.io, and if they have a not-too-difficult proposition to be included, I'll do it.
Else this looks nearly final to me (except @tharvik's code suggestions, of course).
As soon as it passes the tests I think it's R4M. Except comments, of course.
Heisenbug - need to fix it first.
Found it - goroutines are of course not deterministic... Should be ready 4 merge now.
Instead of simply logging message to the console, syslog, or graylog, it is much better to do actual tracing of your code. To avoid having to add tracing messages throughout the code, this PR proposes a new onet-logger that converts the current log-calls to tracing-magick.
Currently there are two tracers written:
This code is currently tested on c4dt's 2nd conode, and it seems to work nicely.
Closes #618