elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

[Canvas] Element templates #25531

Closed w33ble closed 1 year ago

w33ble commented 5 years ago
[Screenshot: example composite element showing a 9% metric]

Templated elements are groups of pre-arranged elements that form a single element, parts of which can be customized individually

For example, the 9% chart above is actually a pie chart, a grey circle, and a number. This allows for composable elements that fit the user’s own style

These can be shipped as plugins or saved by the user for later use. They can be rolled into workpad themes as “featured elements” with coordinating colors. In order to allow for exporting and sharing of these components, we need to implement element groups. See the “Workpad enhancements” theme.

elasticmachine commented 5 years ago

Pinging @elastic/kibana-canvas

elasticmachine commented 5 years ago

Pinging @elastic/kibana-design

ryankeairns commented 5 years ago

UI considerations...

monfera commented 5 years ago

Thanks @w33ble @ryankeairns, great writeup!

ryankeairns commented 5 years ago

We should get together and talk this through once we're closer to beginning work, but here are some initial takes on how element templates might look in the UI.

There are a few assumptions here that should be considered:

Side panel controls to manage grouped layers, including ability to 'Save as element'

[Image: canvas - workpad - element template popover]

New element template now appears in 'Add element' modal

[Image: canvas - workpad - add element with templates]

monfera commented 5 years ago

@w33ble @Stacey-Gammon @clintandrewhall @alexfrancoeur I'm just thinking out loud here. From the user's viewpoint, the new composite element should act as much like a basic element as possible. This issue even mentions the possibility of turning a composite element into a plugin.

tl;dr elements in a parent-child relationship may need to channel data into one another, similar to what happens when composing with functions in most programming languages.

Currently, each element represents a single, concurrent thread in the pipeline, starting with filters and a data source pointing towards the "outside" (basically, the singleton global state represented by the ES server), such as essql (well... elements can pull data from external routes, but it's the same thing).

For composite elements to work, the member elements ideally would not have such external inputs. Their input would come from their parent. Think of the top image with the 9%. Maybe the 9% is a markup, the value coming from an essql search, and the same value drives the color of the ring (represented as a shape). In this setup, the constituent elements shouldn't run their own query or handle their own filtering. It's most important when the template isn't mere embellishment around the content. We probably want to enable the separation of data acquisition, logic and rendering. Otherwise our solution would be basically just persistent grouping with copy/paste.
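The separation described here - data acquisition done once by the parent, constituents as pure functions of the shared value - could be sketched roughly like this. All names (`renderComposite`, `fetchMetric`, the returned parts) are hypothetical illustrations, not real Canvas APIs:

```javascript
// Hypothetical sketch: the parent acquires data once; constituents
// derive everything from that single shared value, so they stay
// self-consistent and never run their own queries.
async function renderComposite(fetchMetric) {
  const value = await fetchMetric(); // single data-acquisition step

  return {
    markup: `${Math.round(value * 100)}%`,    // the "9%" text, derived from the shared value
    ringColor: value < 0.5 ? 'red' : 'green', // the ring shape, driven by the same value
  };
}

// Usage with a stubbed data source standing in for an essql query:
renderComposite(async () => 0.09).then((parts) => {
  // parts.markup === '9%', parts.ringColor === 'red'
});
```

Swapping in a different `fetchMetric` (the "input redirection") retargets every constituent at once, which is the point of centralizing the input.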

Templates can be presumably nested, too.

So we arrived at the concept of the execution stack. The container element "invokes" the constituents with whatever data it wants. Moreover, with Stacey's upcoming events, the constituents may also return values, such that only the container element would be privy to how to handle and combine the events, ie. the events may be more analogous to function outputs than to global side effects such as an alert() call.

To bring it to its logical conclusion, the pipeline - despite the name - could become something like a programming language with an execution stack, lexical scope, multiple function arguments and return values.

Maybe it's not just renderable elements that could be composed. Maybe the user could define some sequence of pure logic (keep these columns, take the average of whatever etc.) and then reuse it. After all, it's allowed for an element participating in a template to contain only pure calculation steps.

To continue the out-loud thinking, this could lead to this data type, crudely:

{
  stack: [
    {
      foo: { type: 'datatable', columns: [], rows: [] },
      bar: { type: 'pointseries', columns: [], rows: [] },
      ...
    },
    {
      baz: { type: 'datatable', columns: [], rows: [] },
      bar: { type: 'datatable', columns: [], rows: [] },
      ...
    }
  ],
}

Eg. the topmost stack entry represents the current execution context. (As bar is duplicated, it'd be the pointseries one, due to higher specificity on the stack.)
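Name resolution against such a stack could work like lexical scope lookup: search frames from the topmost down, so inner frames shadow outer ones. A rough sketch, where `resolve` is a hypothetical helper and treating index 0 as the top of the stack is an assumption taken from the `bar` example above:

```javascript
// Hypothetical sketch: resolve a name against an execution stack,
// scanning from the topmost frame (index 0 here) downward, so that
// inner frames shadow outer ones, as in lexical scoping.
function resolve(stack, name) {
  for (const frame of stack) {
    if (name in frame) return frame[name];
  }
  return undefined; // unresolved name
}

const stack = [
  { foo: { type: 'datatable' }, bar: { type: 'pointseries' } }, // top frame
  { baz: { type: 'datatable' }, bar: { type: 'datatable' } },   // outer frame
];

resolve(stack, 'bar').type; // 'pointseries' — the topmost frame wins
resolve(stack, 'baz').type; // 'datatable' — found in the outer frame
```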

Please lmk your views on this, am I overthinking it? Should the 1st version just rely on each constituent querying ES on its own, ie. remaining quite close to a persistent group, without being able to edit the expression for the newly created container element?

stacey-gammon commented 5 years ago

In this setup, the constituent elements shouldn't run their own query or handle their own filtering.

Why not? Because of unnecessary duplication of work, or because of possible timing issues (one query gets back 9%, another query gets back 10%, same two queries in a template returned different results because one was executed slightly later than the other)?

Do we have these same problems now, when executing all the pipelines on a workpad? I heard of possible caching to help. If we solved the problem at the level of the workpad, do we have to solve it at the level of the element template?

Otherwise our solution would be basically just persistent grouping with copy/paste.

What is the con of having just persistent grouping with copy/paste? (asking because genuinely curious - I haven't formed an opinion one way or the other yet). Maybe it's not as portable? e.g. supporting elements in a dashboard is pretty straightforward with visualizations being implemented via the pipeline, but how would we be able to support element groups in a dashboard?

We should think about inter app communication here. Would be an interesting exploration to see if the logic could be encapsulated inside an embeddable.

alexfrancoeur commented 5 years ago

Please lmk your views on this, am I overthinking it? Should the 1st version just rely on each constituent querying ES on its own, ie. remaining quite close to a persistent group, without being able to edit the expression for the newly created container element?

In order to introduce the feature and warrant a more complex implementation, I'd say that's a good start. In its simplest form, we could even have an element template simply be a copy and paste of that element and remove the parent / child relationship completely (though I'd love to have this in the future). It's a very similar scenario to what we did with workpad templates, and @cqliu1 might be able to provide more insight.

Rather than introducing complex logic for dependencies, we could save the element created and make it available in the element library. Then when a user chooses it, they have a similar experience as they do today with demo data. They're presented with a working element and have to tweak for their needs. Though, I do see some gaps with this approach.

Ideally, I'd love to see an element template that has the option to be "locked" or has "locked components" as defined by an admin. I could imagine a UI that only allows you to select data sources and specific customization options and then each time the admin updates the parent element, the child elements also get updated. However, I think we could keep the initial approach as simple as possible. This way we can gather feedback and see if there is a need for such features.

alexfrancoeur commented 5 years ago

Ah, I just saw @Stacey-Gammon comment. Maybe this is worth a live discussion?

stacey-gammon commented 5 years ago

Agreed, I think there are a lot of questions still, especially if you are bringing up security @alexfrancoeur - I've been wondering if some of these template objects should be Saved Objects, so they are space aware, will get RBAC when that is implemented, get import/export functionality for free, etc.

monfera commented 5 years ago

Hi @Stacey-Gammon,

Thanks for the questions!

In this setup, the constituent elements shouldn't run their own query or handle their own filtering.

Why not? Because of unnecessary duplication of work, or because of possible timing issues (one query gets back 9%, another query gets back 10%, same two queries in a template returned different results because one was executed slightly later than the other)?

Let's say someone abstracted out (assembled) a fancyGauge which has an image background, some shapes in it with scriptable position/angle, a metric number and min/max values on an arc. Then we want this widget to work basically like any other element, ie. from the viewpoint of another person, it's like an element. Maybe a superuser made it and an end user consumes it for a hallway dashboard. It would then be, I think, more natural to expect that the thing has a centralized input. Ie. it would be possible to instantiate a new fancyGauge and link it to new data, by eg. replacing a demodata | ... with a filter | essql ... input redirection. A user would likely expect that all values/positions/angles etc. of fancyGauge will follow the new data and remain self-consistent. To me at least it would feel more surprising and laborious if I had to go into the metric element of the fancyGauge to make sure it uses some new data source. So in some way it's abstracting out the rendering of the composite thing. While fancyGauge is a number decoration, for other types of templates, eg. creating a small multiples chart out of a line chart, it'd be similar.

Timing and performance values are also valid concerns, though in this context they are more technical ones (which might have - likely more tedious - alternative solutions) and I haven't considered them here, good points!

Do we have these same problems now, when executing all the pipelines on a workpad? I heard of possible caching to help. If we solved the problem at the level of the workpad, do we have to solve it at the level of the element template?

While temporal consistency wasn't on my mind, this question stands on its own and is important to discuss. If a workbook (or analogous thing elsewhere in Kibana) uses the same data for a bunch of things, then it seems preferable to query - and further massage - the data once, and make the results available wherever they're needed. Otherwise it'll be quite hard to modify workbooks. For example, a user wants to simply reuse a nice workbook made by a coworker, the only difference being the source data, or some calcs on that data. It's better if the new user can do it in one place, and mostly trust that the new presentation will not feed from totally unrelated things. Maybe there's even some soft infosec or privacy aspect, eg. a user accidentally exposing some confidential data by forgetting to update the filter | essql ... in the 19th element on page 6.

There's also the technical aspect. Temporal consistency feels super hard to get correct, performant and space safe if it's not baked into the basic storage mechanism (eg. via relational database ACID transactions, or via a bitemporal database). Also, it's much easier not to duplicate pipelines than to promote duplication (eg. simply by making it the salient or only option for the user) and then try to deduplicate. A query that's repeated will likely have subtle differences even in the pipeline expression; sometimes the ability to deduplicate would even need smart things like splitting an SQL statement into a column selection and a filtering WHERE, so that eg. the column selection can be unified with some other instance that has the same column selection but different filtering.

Performance is integral to the product, because users' needs for information quantity and low update and interaction latency will only grow; Kibana will continue to be compared with alternatives that put ever more focus on high performance and scalability, so the perf bar will keep rising, and dataviz is more demanding than general UI. There's likely a need for eventual mobile support too.

In short, it looks to me that temporal consistency and decent performance require that semantically identical or overlapping processing steps are not done in isolation to be then deduped and optimized (both are hard or impossible, especially because neither ES nor the pipeline rely on relational algebra), instead, we could rely on the benefits of data flow languages (of which the pipeline is an instance already) to help the user cover the common things under one pipeline expression "roof".

Otherwise our solution would be basically just persistent grouping with copy/paste.

What is the con of having just persistent grouping with copy/paste? [...] Maybe it's not as portable?

Exactly, the reusability would be more tedious. There are bound to be places where indeed, copying and pasting (and then maybe updating expressions in) a persistent group is ideal, and we have copy/paste for this already. For example, the user just needs two like things on the same page, or maybe the same thing in two adjacent places, and they don't otherwise want to "abstract stuff out". If the user wants reusability, including the ability to show the new thing on the Elements roster and paste it with wild abandon, then it should be near-indistinguishable from eg. a plugin that provides a fancyGauge via JS code - unless the user wants to descend into its constituents, tweak the thing or mash up new, larger things.

We should think about inter app communication here. Would be an interesting exploration to see if the logic could be encapsulated inside an embeddable.

I'm not versed in this, would be glad to learn broad concepts on a call or from a summary, seems super important.

monfera commented 5 years ago

@alexfrancoeur I was reading from North to South, so the above comment partially but not fully replies to your concerns (or rather, relaxations of implementation constraints, which are always welcome for an implementor!).

I mentioned soft infosec concerns above (user forgets to update something somewhere - though we may be talking about things post element templates, relating to Stacey's forward looking questions - and accidentally exposes private data). Your example is better in that the thing would simply break, given that another user or setting could render input data inaccessible. At least this problem is easier to detect than the issue with accidentally exposing some client or personnel data.

But even the more complex approach I outline above doesn't solve this problem, because the template itself could rely on assets and data that are no longer there. So a careful template maker will start from demodata or a demo dataset to ensure that initial exposure to the template is smooth, the same way it's expected from plugin writers.

Regarding the actual suggestion, if we can say that persistent group copy/paste is all we need for this milestone, then we could consider the pipeline aspect done, and refocus on a mechanism for selecting a persistent group for inclusion on the Elements panel. Though then wouldn't we forfeit functionality implied in the top description, such as "coordinating colors" (what would do the coordinating if each element still did all its own styling?) and maybe even shareability? Having said that, I'm glad to postpone the tougher questions.

An in-between approach, which would require, I think, no change to the pipeline fundamentals, would be to 1. allow the user to add a pipeline expression to the group itself; and 2. channel its results into each of the child elements. This way, a user who just groups the elements still succeeds; and if some users want to properly abstract out the data aspect, they can simply put the querying and common processing thread into the group expression and not start the child elements with filter | essql, instead letting the parent feed the data.
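The in-between mechanism could be sketched as follows. `evaluateGroup`, `interpret`, and the `hasOwnSource` flag are hypothetical names standing in for the real expression-interpreter pieces:

```javascript
// Sketch of the proposed in-between mechanism: evaluate the group's own
// expression once, then feed that result as the starting context of each
// child, unless the child supplies its own data source.
async function evaluateGroup(group, interpret) {
  // Run the parent expression once; its context starts out empty.
  const shared = group.expression
    ? await interpret(group.expression, null)
    : null;

  return Promise.all(
    group.children.map((child) =>
      // A child starting with its own source (e.g. `filters | essql ...`)
      // ignores the parent context by convention.
      interpret(child.expression, child.hasOwnSource ? null : shared)
    )
  );
}
```

A user who just groups elements passes no group expression and everything behaves as before; a user who abstracts out the data puts the shared query in `group.expression`.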

alexfrancoeur commented 5 years ago

To me at least it would feel more surprising and laborious if I had to go into the metric element of the fancyGauge to make sure it uses some new data source.

I agree that it would feel laborious, and in an ideal world the creator of the template could pick and choose what configurations are available through the UI. For the first implementation, I'd be fine with the laborious approach and simply having a reusable grouping of elements.

For example, a user wants to simply reuse a nice workbook made by a coworker, the only difference being the source data, or some calcs on that data.

While not always the case, the source data could very well come from multiple sources. Given previous comments, I believe we're thinking in this manner, but I read this sentence as referring to a single data source.

Maybe there's even some soft infosec or privacy aspect, eg. a user accidentally exposing some confidential data by forgetting to update the filter | essql ... in the 19th element on page 6.

We could easily push responsibility here to the admin for roles / permissions on the data itself. I wouldn't worry too much about the infosec / privacy aspect.

Regarding the actual suggestion, if we can say that persistent group copy/paste is all we need for this milestone, then we could consider the pipeline aspect done, to refocus on a mechanism to select a persistent group for inclusion on the Elements panel. Though then wouldn't we forfeit functionality implied in the top description, for which the examples are "coordinating colors" (what would do the coordination if each element would still do all its own styling?) and maybe even shareability. Having said that, I'm glad to postpone the tougher questions.

I've been wondering what we could do here as well, for things like "coordinating colors". I am concerned about users opening up empty / errored element templates. What if we allowed users to store a quick snapshot of the fields and values with the template itself? We could randomize the values behind the scenes, and maybe even the fields, but it would allow any user to come in, select a template and immediately have an experience like they do with demodata, coordinating colors and all. We could even show a new "element template data" card under data sources for that template. Depending on the implementation, this could have some infosec issues, but if we put ownership on the user to turn that option on, we should be fine.

Also, I'm not sure how best we would represent these element templates in the current selection box. I feel like we'd need some way to store a screenshot of the grouping itself at the time of element template creation.

An in-between approach, which would require, I think, no change to the pipeline fundamentals, would be to 1. allow the user to add a pipeline expression to the group itself; and 2. channel its results into each of the child elements. This way, a user who just groups the elements still succeeds; and if some users want to properly abstract out the data aspect, they can simply put the querying and common processing thread into the group expression and not start the child elements with filter | essql, instead letting the parent feed the data.

+1, I like this in-between thing.

monfera commented 5 years ago

Toposorting topic, any input is welcome!!! cc @Stacey-Gammon @rashidkpc @ppisljar @w33ble

I'm about to add topological sorting (like with a spreadsheet: calculate cells in such an order that all inputs to a given cell from upstream cells are available when needed). Right now we just loop through the elements in, well, elements.js. Based on Stacey's earlier note on avoiding coupling general functions with Canvas specifics (which would limit reuse), it'll be extracted out. Currently the Canvas expressions are calculated in an essentially arbitrary order of the elements; with https://en.wikipedia.org/wiki/Topological_sorting they'd be calculated from the top of the persistent grouping tree toward the bottom, ending with the leaf elements (our graph is not a general DAG now, so it's an easy algo). It might be handy for eventual common-across-workpad and common-across-page calculations.
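A minimal version of the ordering described could be Kahn's algorithm over a parent→children graph, so each group is evaluated before its members. The graph shape and helper names here are illustrative, not the actual Canvas structure:

```javascript
// Topological sort (Kahn's algorithm) over a parent→children grouping
// tree: nodes with no incoming edge (roots) go first, so parents always
// precede their children in the returned order.
function toposort(nodes, childrenOf) {
  const indegree = new Map(nodes.map((n) => [n, 0]));
  for (const n of nodes) {
    for (const c of childrenOf(n)) indegree.set(c, indegree.get(c) + 1);
  }
  const queue = nodes.filter((n) => indegree.get(n) === 0); // roots first
  const order = [];
  while (queue.length) {
    const n = queue.shift();
    order.push(n);
    for (const c of childrenOf(n)) {
      indegree.set(c, indegree.get(c) - 1);
      if (indegree.get(c) === 0) queue.push(c);
    }
  }
  return order; // parents precede their children
}

// Example: group 'g' containing elements 'a' and 'b'.
const edges = { g: ['a', 'b'], a: [], b: [] };
toposort(['a', 'g', 'b'], (n) => edges[n]); // → ['g', 'a', 'b']
```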

[Image to merely break up text; alludes to hierarchical element grouping trees, as that would be the first use case]

The hard part (for me) will be running through dependencies for asynchronous execution. We can't just loop synchronously through the nodes (groups + elements) like we do now, as the upstreams will have been scheduled but not necessarily completed.
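One common way to handle this is to memoize a promise per node, so each upstream is scheduled once and every downstream node awaits it. A sketch under that assumption, where `upstreamOf` and `run` are hypothetical stand-ins for the real scheduling pieces:

```javascript
// Sketch of asynchronous evaluation respecting dependencies: each node
// gets a single memoized promise, downstream nodes await the promises of
// their upstreams, and nothing runs before its inputs have completed.
function makeScheduler(upstreamOf, run) {
  const pending = new Map(); // node -> Promise of its result

  function evaluate(node) {
    if (!pending.has(node)) {
      pending.set(
        node,
        Promise.all(upstreamOf(node).map(evaluate)) // await all inputs
          .then((inputs) => run(node, inputs))       // then run this node once
      );
    }
    return pending.get(node);
  }
  return evaluate;
}
```

Because the promise is cached, calling `evaluate` on the same node twice (eg. from two different downstream consumers) never re-runs the upstream work.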

Another thing is where to take the value from. Even now, the transient prop in the Redux Canvas state receives the result of pipeline expressions (not sure of the reason; it does have a memory footprint). But I'd rather not rely on it, again to avoid coupling general pipeline stuff with Canvas.

On the other hand, I think there's not yet a mechanism for toposorting or other higher-level ordering of canvas expression executions (fibers? haha), ie. now they all execute in parallel - or has something been developed or designed for pipeline use outside Canvas? If not, is there a conceived need or constraint I should heed?

w33ble commented 5 years ago

Even now, the transient prop in the Redux Canvas state receives the result of pipeline expressions (not sure of the reason, it does have memory footprint).

That's used for caching values for the sidebar to use. We cache the value of every function when the sidebar is loaded so we can reuse the input of whatever function you are changing without having to re-run the more expensive parts of the expression (generally, the first 1 or 2 functions).
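The per-step caching described here could be sketched as memoizing each function's output along an expression chain, so editing step N reuses the cached input from step N-1 instead of re-running the earlier, more expensive functions. This is a deliberate simplification of the actual transient cache, with hypothetical names:

```javascript
// Sketch: run a chain of expression functions, memoizing each step's
// output in `cache` so a later re-run (eg. after tweaking a downstream
// function in the sidebar) skips the expensive upstream steps.
function runChain(fns, input, cache = []) {
  let value = input;
  fns.forEach((fn, i) => {
    if (cache[i] === undefined) cache[i] = fn(value); // compute once
    value = cache[i];                                  // reuse thereafter
  });
  return value;
}
```

A real implementation would also invalidate cache entries from the edited step onward; that bookkeeping is omitted here.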

On the other hand, I think there's not yet a mechanism for toposorting or other higher level ordering of canvas expression executions (fibers? haha), ie. now they all execute parallel

That's because every element is its own thing, the expression is functional and nothing is shared. That's probably not something we want to lose either. The two models we have for "sharing" information right now are filters and assets, which are both quite coupled to the Canvas app. That seems like the correct way to do it, even if that functionality is also shared across apps like the expression now is.

An in-between approach, which would require, I think, no change to the pipeline fundamentals, would be to 1. allow the user to add a pipeline expression to the group itself; and 2. channel its results into each of the child elements.

I like this in general, honestly. It allows elements in the group to use a single dataset/context without needing to rely on the saved context/expression idea. It seems like it should be optional though; grouping elements with their own data (and ignoring the container's context) still seems useful.

monfera commented 5 years ago

Thanks @w33ble! Re the 2nd quote+comment, I phrased my statement sloppily, I didn't mean canvas expressions as indeed their execution is currently totally independent, but pipeline expressions - outside Canvas where I have low visibility :-)

Yes, it'd be optional, ie. a member of a group can start from scratch. I believe we don't need any special mechanism for doing that: as long as it starts with a filters | ... or essql ..., it'll ignore the parent context. Which is nice because we don't need a toggle or other UI control.
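The convention could be checked with a trivial inspection of the expression's first function. The list of source functions below is illustrative, not exhaustive:

```javascript
// Sketch of the convention above: a child expression that begins with a
// data-source function ignores the parent context; any other first
// function means the child consumes the context fed by its parent.
const SOURCE_FNS = ['filters', 'essql', 'demodata']; // illustrative list

function startsWithOwnSource(expression) {
  // First token up to whitespace or the pipe separator.
  const first = expression.trim().split(/[\s|]/)[0];
  return SOURCE_FNS.includes(first);
}

startsWithOwnSource('filters | essql ...'); // true — ignores parent context
startsWithOwnSource('markdown "9%"');       // false — consumes parent context
```

This is why no toggle or other UI control is needed: the expression itself declares which behavior the user wants.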

In any event, this work thread is now on the back burner, the focus shifting to saving a node (an element group - or a single Canvas element!) as a savedObject, and the other way around.

w33ble commented 5 years ago

I didn't mean canvas expressions as indeed their execution is currently totally independent, but pipeline expressions - outside Canvas where I have low visibility :-)

Seems like there's some disagreement on terminology here. I've been using "Canvas expressions" and "pipelines" interchangeably (opting for the former pretty much every time). Are "pipeline expressions" something different from Canvas expressions?

ppisljar commented 5 years ago

Unless I am missing something, pipelines, canvas expressions, pipeline expressions... they are all the same thing. There is no difference in the expressions. I think the only difference between Visualize and Canvas currently is the executor. Ours is still super dumb: it only handles rendering visualizations and ignores everything else.

monfera commented 5 years ago

We agreed to circle back to the implementation of this feature; this is what Phase 1 would entail:

Subsequent phases are planned to include directly related or pertinent, more general functions, eg.

ryankeairns commented 5 years ago

This list looks good.

I would prefer the save be done via an icon button in the Settings panel, perhaps alongside the layer ordering controls. Doing a save from a keyboard shortcut feels too hidden and unintuitive.

nreese commented 1 year ago

Closing as not planned