Open drewdaemon opened 1 year ago
Pinging @elastic/kibana-visualizations @elastic/kibana-visualizations-external (Team:Visualizations)
Thank you @andrewctate for starting this. Those terms describe a mix of different operations that I agree can definitely be cleaned up and aligned.
- breakdown / breakdownBy: Lens defines one more "bucket", creating a new hierarchical level in your data. Then it describes the dimension with colors.
- split: define and compute an additional metric and ask the chart to render it on the same chart. It usually associates a color with each new metric added.
- split on small multiples by metrics: define and compute an additional metric and render it on a different panel.
- split on small multiples by dimension: add an additional dimension to your dataset (subdividing it) and render a metric into multiple panels, one for each dimension.
- bucket: is really tied to Elasticsearch naming, but it refers to two different functions:
  - binning is used to divide your data into sets by partitioning a numerical space (field) into ordered bins
  - group by is used to divide your data into sets with the same categorical field value. The field could also be a numerical or date field, but the operation is the same: your data is collected in sets relative to the same field value.

Kibana/Lens/TSVB etc mix a bit the data operations with how/where to assign a dimension/metric:
Starting by aligning this on code and moving it to the UI could be a good move. elastic/charts should also be realigned: since its inception, we have kept and ported most of the preexisting semantics from Kibana, and we should finally remove those wrong concepts and promote a better semantic structure.
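To make the difference between the two "bucket" operations concrete, here is a minimal TypeScript sketch. It is only an illustration: `Row`, `groupBy`, and `binBy` are hypothetical names, not existing Kibana, Lens, or elastic/charts APIs, and fixed-width bins are an assumption (binning in general only requires ordered bins).

```typescript
// Hypothetical row shape, for illustration only (not a Kibana or elastic/charts type).
interface Row {
  category: string;
  value: number;
}

// "group by": collect rows that share the same field value.
function groupBy<T, K>(rows: T[], key: (row: T) => K): Map<K, T[]> {
  const groups = new Map<K, T[]>();
  for (const row of rows) {
    const k = key(row);
    const group = groups.get(k);
    if (group) {
      group.push(row);
    } else {
      groups.set(k, [row]);
    }
  }
  return groups;
}

// "binning": partition a numerical space into ordered bins.
// Assumption: fixed-width bins, each keyed by its inclusive lower bound.
function binBy<T>(rows: T[], value: (row: T) => number, width: number): Map<number, T[]> {
  const bins = groupBy(rows, (row) => Math.floor(value(row) / width) * width);
  // Sort by lower bound so the bins are ordered.
  return new Map(Array.from(bins.entries()).sort(([a], [b]) => a - b));
}

const rows: Row[] = [
  { category: 'a', value: 3 },
  { category: 'b', value: 12 },
  { category: 'a', value: 17 },
];

const byCategory = groupBy(rows, (r) => r.category); // keys: 'a', 'b'
const byBin = binBy(rows, (r) => r.value, 10); // keys: 0, 10
```

In this sketch binning is just a group by on a derived key plus an ordering, which may be part of why the two get conflated under "bucket". Neither function says anything about where the resulting sets are rendered (colors, panels, and so on).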
Thanks for chipping in here @markov00. Really good analysis.
> Starting by aligning this on code and moving it to the UI could be a good move.
IMO, a great place to start would be getting some consensus from all stakeholders (developers, product people, docs team, and designers) with respect to these terms. I've noticed that we often have a "working term" we use as developers. That term gets used all over the code and gets ingrained into our minds. Then we get asked to change the name of whatever it is as part of the product/design review process. Then, one of two things happens
Edit: though I guess I can think of scenarios where it could make sense to have our own term as developers that doesn't exactly match what is in the UI. For example, even if we agree that "breakdown by" is what we'll use to describe bucketing a dimension and describing it with colors, there's an argument for continuing to use the term "slice by" in the pie chart UI.
Right now, "split," "breakdown," and "bucket" are used inconsistently. We could reduce cognitive load by following the domain-driven design (DDD) principle of agreeing on a common language. This will become especially helpful as we look at introducing small multiples in Lens, which will add yet another similar concept.
Today
"split"
"breakdown" / "breakdownBy" - dividing data and displaying on single chart
"bucket" - dividing data and displaying on a single chart. Maybe implies the underlying use of an Elasticsearch aggregation? Maybe implies time-series data?
Once we agree on these, we could update the code to match our agreed definitions and enforce this for new code, making things much more understandable. I don't want to ask myself what one of these means when I'm squinting at someone else's logic.