Improvements to the live query UI

noahtalerman commented 3 years ago

This issue includes the upcoming improvements to the live query experience in the Fleet UI. Issues that discuss and contribute to the improvements are included.

Problem

The live query UI provides limited tools for aggregating results and a sometimes confusing experience

The user is only provided with the number of offline hosts reporting. The user can't further investigate which hosts are reporting as offline. #211
The errors table is not available for export like the results table. If the errors table contains a large number of rows, the user's only tool is the filterable columns. (Slack thread) #300
The number of hosts that aren't reporting (offline) may appear inconsistent with the "Offline" filter on the Hosts page. #211
A successful response is not the same as a host returning results. There is inconsistency with the status bar displaying a successful response while the status text displays the number of results. (Loom demo)

Goal

Provide users with better tooling for aggregating results and improve the live query experience

Reveal which hosts are reporting as offline, and instill confidence in the "Offline" filter on the Hosts page.
Provide better tooling for sifting through and identifying the main results and errors of interest. (Check out data stacking)
Provide a better visual summary for the live query by replacing or improving the status bar and text.
Allow pre-configured targets and improve target selection to reduce the time and complexity of selecting query targets #528

Rylon commented 3 years ago

Could I also suggest adding the ability to sort by column, as well as filtering? For example, clicking a column header to adjust the sort order. Last time I tried that it didn't work.

noahtalerman commented 3 years ago

Thanks for the suggestion @Rylon. The ability to sort columns is something we're looking to add to the live query results table.

I'm including your comment in Slack here as a reference.

Hi! The new “Errors” output when running a Live Query is very nice and helpful, but I was wondering if there was a way to export it, similarly to the “Results” table which is above it. It would make it easier to go through and identify the main errors when running against a large number of hosts.

Two follow-up questions:

If you had the ability to export the "Errors" table as you suggested, what are the next steps you or someone else would take for going through these errors and identifying the main errors?
Are these steps similar to those taken when trying to identify the main results of interest from the "Results" table?

anelshaer commented 3 years ago

The previous version had a useful piece of info: how many hosts, out of the total online hosts, have replied to the query. The current version [3.7.1] only shows the total number of hosts online and the number of results. It doesn't show how many hosts replied and how many did not (still waiting for), so you keep waiting a long time for results to come in.

I hope that we show this number next to the total number, and also have those hosts in the export data even if they have empty results.
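
To make the requested numbers concrete, here is a minimal sketch of the tally. It is not Fleet's implementation: the `HostStatus` record and its fields are hypothetical, assumed only for illustration. It counts a host that replied with zero rows as "responded", separately from hosts that returned results.

```python
from dataclasses import dataclass


@dataclass
class HostStatus:
    """Hypothetical per-host record for one live query run."""
    hostname: str
    online: bool
    responded: bool
    row_count: int = 0


def summarize(hosts):
    """Tally online hosts by whether they replied and whether they returned rows."""
    online = [h for h in hosts if h.online]
    responded = [h for h in online if h.responded]
    return {
        "online": len(online),
        "responded": len(responded),
        # Online hosts we are still waiting on.
        "waiting": len(online) - len(responded),
        # A reply with zero rows is still a reply.
        "with_results": sum(1 for h in responded if h.row_count > 0),
        "empty_results": sum(1 for h in responded if h.row_count == 0),
    }
```

With counts like these, the UI could show "responded / online" next to the total, and an export could include the hosts in `empty_results` as well.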

Rylon commented 3 years ago

Two follow-up questions:

If you had the ability to export the "Errors" table as you suggested, what are the next steps you or someone else would take for going through these errors and identifying the main errors?

Are these steps similar to those taken when trying to identify the main results of interest from the "Results" table?

@noahtalerman so, I'm running Fleet across many disparate nodes. What I'd do with the data is sort it into different groups based on the operating system, and some other metadata that we're adding ourselves with a custom table in osquery. Some of the issues may be platform issues that I'd look at myself, others may be specific to certain vendors, which I'd assign to a member of my team to look into.

But generally it would start with me skimming through the results and seeing what the main categories of error were. And yes, this would be similar to going through the "Results" table, I think, but it's good to have them separated.

Hope that helps!
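
As a rough sketch of that triage, assuming the "Errors" table could be exported as rows shaped like dictionaries: the `platform` and `error` keys below are hypothetical, not Fleet's actual export format. The function groups rows by operating system and counts each distinct error message, so the main categories surface first.

```python
from collections import Counter, defaultdict


def group_errors(rows):
    """Group exported error rows by platform and count each distinct error.

    `rows` is an iterable of dicts with hypothetical "platform" and "error" keys.
    Returns {platform: [(error_message, count), ...]}, most frequent first.
    """
    groups = defaultdict(Counter)
    for row in rows:
        # Rows without a platform are kept under "unknown" rather than dropped.
        platform = row.get("platform") or "unknown"
        groups[platform][row["error"]] += 1
    return {platform: counts.most_common() for platform, counts in groups.items()}
```

The same grouping would work with any other metadata column, such as a vendor field from a custom table, in place of `platform`.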

noahtalerman commented 3 years ago

The previous version had a useful piece of info: how many hosts, out of the total online hosts, have replied to the query.

I hope that we show this number next to the total number, and also have those hosts in the export data even if they have empty results.

@anelshaer these are two valuable pieces of feedback. The upcoming end of month release will "re-include" information for how many online hosts replied. Issue #303 discusses these proposed changes in more detail.

noahtalerman commented 3 years ago

@Rylon thank you for the explanation.

I'm wondering whether a future version of the Fleet UI that can do some of the data sorting operations you've described would be helpful. Instead of Fleet exporting the raw "Results" or "Errors" table, you would be able to perform some or all of the grouping prior to export.

mikermcneil commented 3 years ago

Setting expectations about how long you'll have to wait

Since Fleet's query response time is inherently variable (because of heartbeat timing), it can feel slow or janky, when in fact this is the expected behavior: it's part of how osquery works, and it helps prevent performance issues on hosts.

@noahtalerman Maybe we could include a bit of help text beneath the loading spinner to help set expectations around this-- e.g.:

Waiting for next osquery heartbeat…

I think this would help users understand what's going on a bit better, and increase confidence in osquery in general.

For example, at 36:58:

"Fingers crossed"
