Open mbukatov opened 6 years ago
@fbalak I reported this as a suggestion to provide a better event description to help with debugging. I haven't reported the problem itself, as it's likely caused by some glusterfs problem.
@nthomas-redhat Please fix this along with other log message fixes as discussed at (https://docs.google.com/document/d/138SFPUlRqdLjISMcd-Cts-vWzY7wfTGWi8GhdQHnh0Q/edit)
At the Architecture sync-up meeting today, we decided to address it by:
In the long term, we may need to add a Tendrl API endpoint and enhance the Tendrl UI to show details for a particular job id.
@r0h4n @mbukatov @nthomas-redhat @a2batic @gnehapk @shirshendu @mcarrano
Looking at this and thinking further about event details, it appears we don't get much from the Events API at the moment, i.e. only message, timestamp, message_id, and priority.
We appear to be showing the message and timestamp at the moment.
+1 @mbukatov on needing more details on the particular job.
Here are some things that occur to me when showing the Event Details.
In the Events List, we should be showing a short event message and not a long, verbose event message. Moreover, the priority should be shown as well.
In the Event Details, we would show the event row/item again but with more details, e.g. we should show a long event message, along with its priority (if we don't show it in the Events List). In addition, if we have a category/type for the Event, that would be good to show.
E.g.
Short msg == gluster-195d43d86fd38ba5929e44529d1fa0b985f42f03946e0bb5ada6999805556674 is healthy
Long (current) msg == Health status of cluster: gluster-195d43d86fd38ba5929e44529d1fa0b985f42f03946e0bb5ada6999805556674 changed from unhealthy to healthy
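To make the list vs. details split concrete, here is a minimal Python sketch. It assumes an event carries the four fields named above (message, timestamp, message_id, priority) plus a hypothetical `short_message` field that the Events API does not provide today. The class and method names, and the `info` priority value, are illustrative only, not Tendrl code.

```python
from dataclasses import dataclass


@dataclass
class Event:
    # Fields mentioned in this thread as coming from the Events API.
    message_id: str
    timestamp: str
    priority: str
    message: str  # long (current) message
    # Hypothetical field, not part of the current API.
    short_message: str = ""

    def list_row(self) -> str:
        """Row for the Events List: priority first, then the short message.

        Falls back to the long message when no short one is available.
        """
        text = self.short_message or self.message
        return f"[{self.priority}] {text}"

    def details(self) -> str:
        """Text for the Event Details view: timestamp, priority, long message."""
        return f"{self.timestamp} [{self.priority}] {self.message}"


cluster = "gluster-195d43d86fd38ba5929e44529d1fa0b985f42f03946e0bb5ada6999805556674"
event = Event(
    message_id="example-id",          # placeholder value
    timestamp="example-timestamp",    # placeholder value
    priority="info",                  # assumed priority label
    message=f"Health status of cluster: {cluster} changed from unhealthy to healthy",
    short_message=f"{cluster} is healthy",
)
print(event.list_row())
print(event.details())
```

The fallback in `list_row` means events without a short message would still render in the list, just with the verbose text they have today.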
If the event reports that a job completed or failed, we should show details about the Flow that was run.
E.g. Current msg == Job finished successfully (job_id: 14e7207a-02d4-4e97-a0c7-214bf71a91e8)
Suggested short msg ==
Ideally, the event details would provide enough information to be actionable, with guidance on how to resolve the issue if there's a problem or failure.
Thoughts?
I've created an Event Details page to display the details of an event as a drill-down from the events list. This is designed to display the full event message and link to any related resources. See https://redhat.invisionapp.com/share/HVGA7O575AZ#/285313287_Cluster_Details-Event_Detail
I should also note that the Event List, as designed, should display the event severity/priority before the short message. Let me know if you have any questions.
@r0h4n @mbukatov @nthomas-redhat @a2batic @gnehapk @shirshendu @mcarrano
Please note we've published the Event Details design. See previous comment by @mcarrano.
@julienlim
@nthomas-redhat is working on this issue; waiting for updates from him.
@nthomas-redhat please close this if done
Description of the problem
When I open the Events page of the Tendrl UI, I see events like:
I don't immediately see what kind of job it is.
This could be especially confusing when I see a lot of events like that, without any hint about what's wrong (if anything):
Note that in the screenshot above, the message about the successfully finished job repeats after a few minutes.
When I tried to dig deeper, I ran the following on the Tendrl server machine:
I see only a single log message related to this (with two occurrences though, one in the node agent log and the other in the messages log), and I read it as:
This doesn't help me much with debugging the event shown above, as it contradicts the original message (job finished successfully).
Expected Result
The event description should contain more details, e.g. the job type, to improve the information delivered to the user.
Moreover, we will need a description of the job id and how to use it for debugging. In my case, I'm unable to find any useful details for the event to go further.
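As a rough illustration of the expected result, the Python sketch below builds an event description that names the job type when it is known. The function name, the `job_type` parameter, the status strings `finished` and `failed`, and the value `ExampleFlow` are all assumptions made for this example. They are not taken from the Tendrl code base.

```python
# Hypothetical mapping from job status to the wording used in the event.
OUTCOMES = {
    "finished": "finished successfully",
    "failed": "failed",
}


def describe_job_event(job_id: str, status: str, job_type: str = "") -> str:
    """Build an event message that names the job type when it is known."""
    # Unknown statuses are reported verbatim instead of being guessed at.
    outcome = OUTCOMES.get(status, f"is in state {status}")
    subject = f"Job {job_type}" if job_type else "Job"
    return f"{subject} {outcome} (job_id: {job_id})"


job_id = "14e7207a-02d4-4e97-a0c7-214bf71a91e8"
# Without a job type, this reproduces the current message.
print(describe_job_event(job_id, "finished"))
# With a (hypothetical) job type, the event says what kind of job it was.
print(describe_job_event(job_id, "finished", job_type="ExampleFlow"))
```

Keeping the job id in the message preserves the link to the logs, while the job type gives the reader a first hint without having to look anything up.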
Version
On Storage Servers:
On Tendrl server: