trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
https://trino.io
Apache License 2.0
10.48k stars 3.02k forks source link

Support remote exchanges in LocalQueryRunner #275

Open martint opened 5 years ago

martint commented 5 years ago

We currently have two mechanisms for running queries in unit tests: LocalQueryRunner and DistributedQueryRunner. LocalQueryRunner is faster to set up as it doesn't involve running a full server & exchange clients, but can only execute queries with a single fragment. The downside is that it's not suitable for testing behaviors that only manifest when the optimizer inserts remote exchanges into the plan. As a result, all the tests that rely on LocalQueryRunner are missing opportunities to catch correctness issues due to bad plans

We should:

martint commented 5 years ago

These are examples of queries that only fails when remote exchanges are involved, so it's not possible to test them with LocalQueryRunner (and therefore, in the style of io.prestosql.sql.query.TestXXX): https://github.com/prestosql/presto/pull/262#discussion_r258687795

findepi commented 5 years ago

Another unique feature of LocalQueryRunner is that it's single-threaded. Are you going to change this too?

(This has obvious downsides but also has an upside -- if you make one of the Join operators block on the other during execution, the LocalQueryRunner will (rightfully) deadlock.)

dain commented 5 years ago

Another unique feature of LocalQueryRunner is that it's single-threaded. Are you going to change this too?

I don't think we should change that. The distributed tests are there to catch those kinds of issues. BTW, the local query runner "works around" the deadlock problem by just ignoring all of the blocked signals.