prestodb / presto

The official home of the Presto distributed SQL query engine for big data
http://prestodb.io
Apache License 2.0
16.01k stars 5.36k forks source link

Support for aggregate pushdown #4839

Open buremba opened 8 years ago

buremba commented 8 years ago

Currently, Presto uses predicate pushdown technique which can dramatically affect the query performance by skipping shards which don't include any data that passes from the predicate. I think aggregate pushdown also would be a great feature for Presto since it's an interactive analytic query engine and AFAIK people usually execute aggregation queries on Presto.

Raptor connector already pre-calculates row count of shards, maximum and minimum of each column for each shard in order to support predicate pushdown. When we execute a query similar to SELECT count(*), sum(col1) FROM raptor.schema.table WHERE col2 between 100 and 200 the connector filters shards that contain only specified range of col2 column but still needs to process the filtered shards in order to find the sum of col1 and the count of rows. However we already have that data in metadata database so instead of processing the data every time when a query similar to this one is executed, we may just return the pre-calculated data.

The problem with aggregate pushdown is that the values must be pre-calculated so currently it will work only for simple aggregation queries; however it's also true for predicate pushdown so I don't think that it's a big issue. On the other hand, it may make it Presto much more interactive because it can greatly reduce the query execution time for specific queries.

The other connectors such as jdbc-connector can also take advantage of this feature because it may greatly reduce the network traffic. It can also make it possible to develop connectors that do pre-aggregation or lazy-aggregation. AFAIK, some low-latency OLAP databases such as Druid and Pinot uses pre-aggregation to optimize latency.

dain commented 8 years ago

There are two distinct approaches that can be taken here:

  1. Materialized query rewrites: there are known queries that have been precomputed and stored. The engine detects a queries that can be rewritten in terms of a materialized result. This is what you have described with Raptor, and what Presto already can do for Hive partition keys (all be it in a very limited way).
  2. Push aggregation functions down to connectors. This is a more generic form that could be used with connectors like JDBC.

We are working on a generic version of the first one, and you can see progress in #4449 and #3864 (cc @joy-yao). I doubt we will start on the second one until the new planner is built and integrated.

buremba commented 8 years ago

Materialized views are useful but you need to re-write full table every time you want to update the result AFAIK.

In practice, we usually need incremental computation so this makes pre-aggregation much more complex. Therefore I think it's best for aggregate pushdown to work in a way similar to predicate pushdown so 2 seems to be better choice for me since the connectors can have their own pre-aggregation implementation.

For example, Raptor uses immutable shards internally. We may implement pre-aggregation in shard level and generate pre-aggregated data when writing shards. We can even use intermediate aggregations to merge pre-aggregated data when shards are compacted.

Connectors for Druid and Pinot may use their implementation since they already support pre-aggregation so there's no need to pre-aggregate data in Presto.

Connectors like jdbc-connector can generate SQL and execute aggregation query on backend database instead of returning full table data.

I'm aware that this will take time but I wonder if there's an interest on making Presto more interactive by implementing features similar to this one.

dain commented 8 years ago

Um, I don't think I'm getting my point across.

In the case of Raptor it is calculating the min/max/count/is_nullable for every column per shard when writing a shard. You can think of this as materializing the query:

SELECT shard_uuid, count(), min(x), max(x), count(x)
FROM my_raptor_table
GROUP BY 1

This is what I mean by case 1. The connector would say "BTW, I have the following queries precomputed and always up to date". This same design would work for Druid since it knows which aggregations it has precomputed.

This is very different from case 2, where the connector says, "I can run the following aggregations with the exact same semantics as Presto". This option is more flexible, but also possibly more restrictive as many connectors can not guarantee the same semantic as Presto (I'm looking at you, MySQL). Also, FWIU, this option also requires a bunch of changes to Presto.

buremba commented 8 years ago

OK, now I get it. It seems like a clever solution which doesn't require much work compared to second option but still powerful as it is. Practically, it's similar to aggregate pushdown but it seems that connectors that use jdbc-connector can't take advantage of this feature in this case. Anyway, I didn't actually think of the consistency problem that you mentioned so that's fair.

I think the pre-calculated query table should be similar to this one in for Raptor connector SELECT count(), min(x), max(x) FROM my_raptor_table because the planner does not aware that $shard_uuid is a special column so it cannot re-write the plan since $shard_uuid is not referenced in query that the user executes.

However; FWIU the queries must somehow be dynamic otherwise the connector needs to change the query when a new column is added, removed etc. Also the query must be an aggregation query and some operations such as WINDOW, ORDER BY should be disabled since they're not relevant. Please correct me if I'm wrong.

Downchuck commented 7 years ago

Is there another issue report tracking work on this item, aggregate pushdown?

gitcloned commented 5 years ago

Can simple aggregate functions push down be supported in Presto?

This can improve performance of some of the simple queries, look at this query for reference: https://stackoverflow.com/questions/53397707/presto-running-slower-than-sql-server

dain commented 5 years ago

Can simple aggregate functions push down be supported in Presto?

Currently, there is no support for aggregation function (or any function) push down. It is something we would like to do in the future, but there are a few changes that will be needed before this can be implemented.

buremba commented 5 years ago

@gitcloned, one workaround is to create views in the source database (SQL Server in your case) and query those views from Presto.

gitcloned commented 5 years ago

Can simple aggregate functions push down be supported in Presto?

Currently, there is no support for aggregation function (or any function) push down. It is something we would like to do in the future, but there are a few changes that will be needed before this can be implemented.

Thanks for update, would definitely be helpful if aggregate push down can be supported

highker commented 5 years ago

@hellium01, given #12920 is merged, feel free to move project/sample/limit/union/... nodes to SPI. Aggregation and join nodes may need some extra thinking though the dependency is clean. It's just the nodes are too complicated as an interface.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had any activity in the last 2 years. If you feel that this issue is important, just comment and the stale tag will be removed; otherwise it will be closed in 7 days. This is an attempt to ensure that our open issues remain valuable and relevant so that we can keep track of what needs to be done and prioritize the right things.

gitcloned commented 3 years ago

Would be an amazing feature to have with presto db.. commenting to have discussion keep going ..