Open bpalmintier opened 6 years ago
Proposed log level definitions for brokers/cores error/warning --as expected summary: connection/init(with summary)/termination normal: summary + fed connection/disconnection+some remote connection info for brokers debug: normal +interface registration, interface connections, exec entry/ time grant for each federate that passes through. trace: all messages
Proposed log level definitions for federates error/warning --as expected summary: connections/state changes normal: same as summary debug: normal +interface registration, interface connections, exec entry/ time grant trace: all messages
I think this sounds about right, but am confused how pub, sub, and messages fit in.
pub/sub/endpoints/filters the registration and connections are part of the interface connections section.
I put those as part of the debug level. The other option is put the interface connections as part of the normal level.
And put the actual data communication as part of the debug level vs trace? Trace is all messages but what goes in the
debug vs normal vs summary is in question. or whether they should be renamed to something else is a question.
For my own purposes I either want warnings/errors or trace since I often am debugging the timing. I just haven't put that much effort into the other levels. So I am trying to figure out what would be useful for different use cases.
I would think seeing the federate connections at a normal level would be useful, But what is appropriate for a debug level that isn't everything?
@trevorhardy might have some suggestions from the use case point of view. He wanted a diagnostics tool that could provide information about the data traffic at a higher level (at normal level).
When trouble-shooting with FNCS, the first step is to make sure that all the federates have registered. Having the information available at a higher level (like you do at "summary") is good.
The next thing we often do is make sure everybody is requesting and being granted a meaningful time. Something like the "normal" level of the broker in the hierarchy you've put together Phil.
If after looking at the time requests each federate is making a problem is still clearly evident, almost always the next step is to start digging into actual values being exchanged.
From a user standpoint, debugging a federation almost always boils down to watching the time requests, time grants, and message values. This may lead back to the source code of the individual federate but that is clearly a higher-level problem that we hope most users don't have to dig into (unless they wrote the simulator).
I would propose that we add a "timing+values/messages" level that would be most useful for the user but avoid all the messages a develop like @phlptp would need. Like @bpalmintier said, we would want federate names displayed at this level and probably we would need it to be supported both on the broker and on the federate side.
So you are thinking some level between debug and trace which includes the data values being transmitted?
I will answer that by saying it should be at a level that is above the need for knowing what specific APIs are being called. (I don't know what "interface registrations" are so I couldn't say whether those would be helpful or not.)
Interface registrations are the a federate defining its publications/subscriptions/inputs/endpoints/filters and then connecting them to other publications/endpoints/inputs.
Then, yes, I would say below debug and above trace is perfect.
I like where this is headed. One clarifying point is that I think the "ping" tracking for confirming up time is only needed at the most thorough (trace) level. Currently it occurs earlier and can start to dominate the display unnecessarily for federate and message tracking.
Looking back at the current breakdown, I wonder if the currently sparse lower levels might be shuffled/combined so that we can fit the data exchange into one of the existing levels. This might look like (for both brokers/cores and federates, and using suggested name revisions):
Would you want an ability to discriminate between a federate connection and an interface connection? The reason I ask is that there could be a large difference in the number of messages at each of those levels. Say there were 10 federates but if each had 10-20 interfaces that is a significant difference(20x or more) in the number of messages that display. Especially at the broker level.
One thing to keep in mind is that by design no level or component will have complete global information, so the brokers are not likely to be aware of much of the traffic going or be able to compute timing information. We talked about some of this in #438 as well. I think they are related issues.
I think what we might do in the interest of not getting bogged down is tweak some of this in a PR I am working on related to some logging stuff, and start down this path, then over time tune it to make it look more like what we are outlining here. I think it is going to be an iterative process to get this to an acceptable useful state.
This is a good discussion!
no_print = -1, //!< never print
error = 0, //!< only print errors
warning = 1, //!< print/log warning and errors
summary = 2, //!< print/log summary information
connections = 3, //!< print summary+ federate level connection information
interfaces = 4, //!< print connections +interface level connection information
timing =5, //!< print interfaces+ timing(exec/grant/disconnect)
data=6, //!< print timing+data transmissions
trace = 7, //!< trace level printing (all processed messages)
here are the current levels I am looking at.
Looks good to me, at least good enough for us to try using and see how it goes.
The logging levels have been integrated and logging messages, but more needs to be added to finish this issue. That will probably come in the 2.0 branch, as I am not 100% sure there will be any more substantive releases in 1.3.X series.
Feature Request / Support Question
The verbose outputs from
helics_broker
can be very helpful in debugging, but currently there are a few areas for improvement:cmd:reg_pub from XXXX
) but to also display the topic.