evidence-dev / evidence

Business intelligence as code: build fast, interactive data visualizations in pure SQL and markdown
https://evidence.dev
MIT License
4.24k stars 200 forks source link

Allow for "direct query" style queries #1725

Closed creativedutchmen closed 1 month ago

creativedutchmen commented 6 months ago

Feature Description

While the choice of pre-loading all the data together with DuckDB makes the experience of using evidence quite snappy, it also means that some types of dashboard are not viable - simply because they would require loading way too much data. It would be great if it were possible to allow some queries to hit the database directly.

While it might seems undesirable, the number of OLAP databases that could handle this type of workload has exploded, and it would make the deployment easier and solution space (much) larger.

Goal of Feature

Expand the solution space from pre-aggregated data that fits easily into memory to the full set of (ad-hoc) questions one could ask from the data. For instance, drill downs per product, country, membership status, ...

Current Solution / Workarounds

Currently we need to have a ton of queries that precompute the drill downs we think are interesting, and disallow more than a single variable in these drill downs (ie sales per brand, sales per country, sales per channel)

Alternatives

Use evidence as a nice looking (very basic) email report, and use another tool for the deep-dives. Super unsatisfying, because it would require multiple tools again (and those tools can usually also send emails)

creativedutchmen commented 6 months ago

Apart from the cases where the current flow is simply not feasible, it would also make reports on large tables more editor-friendly.

For instance, imagine that the orders table in the example project is billions of rows. You wouldn't be able to do select * from orders anymore, and instead would have to use a large rollup table (which might be too big, still) or flat tables per query (daily_orders_per_country.sql)

mcrascal commented 6 months ago

I like this. At the moment we're focused on building out what you can do in the client side run time. For example, more interactive components like pivot tables and the recent dimension grid, as well as increasing the interactive performance from supporting source tables with ~1 million records to 10-20 million, but at some point we will likely explore pushing queries down to the data warehouse.