Parallel query execution

pmenon commented 6 years ago

This is a PR for parallel query execution in codegen.

Supported parallel operators:
- [x] Sequential scans
- [x] Sorting
- [x] Hash joins
- [ ] Aggregations
- [ ] Inserts (purposely disabled until we have COPY)
- [ ] Updates
- [ ] Deletes

- Parallelism is done in a fork-join manner.
- Added settings to toggle parallel execution and to set the minimum table size threshold before parallel execution kicks in. This logic should be modified.
  - @chenboy Check plan_generator to see if the logic looks good.
- I've split up RuntimeState into QueryState and PipelineState.
  - QueryState exists for the lifetime of the query. It is initialized in the init() function and torn down in the tearDown() function.
  - PipelineState exists for only one pipeline. It is created on entry to the pipeline function and torn down upon exit.
- I've cleaned up proxies to simplify loading/storing struct member variables using names rather than index positions.
  - What used to be codegen.CreateGEP(..., 0, 2) to load struct elements is now codegen.Load(HashTableProxy::directory, ...).
  - Storing elements of a struct works similarly: codegen.Store(...)
- Removed CCHashTable and added a generic HashTable that will eventually be used for both joins and aggregations.
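
For reviewers who want a mental model of the two state lifetimes described above, here is a minimal sketch in plain C++. It is not the generated code or Peloton's actual classes: `EventLog`, `RunPipeline`, `RunQuery`, and the member layout are hypothetical names chosen for illustration. Only the lifetimes (query-wide state bracketed by init()/tearDown(), pipeline-local state bracketed by entry to and exit from the pipeline function) come from the description.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only, not Peloton's classes.
using EventLog = std::vector<std::string>;

// Query-wide state: set up once in init(), released once in tearDown().
struct QueryState {
  EventLog log;
  void init() { log.push_back("query:init"); }
  void tearDown() { log.push_back("query:tearDown"); }
};

// Pipeline-local state: constructed on entry to the pipeline function and
// destroyed automatically when that function returns.
class PipelineState {
 public:
  PipelineState(QueryState &query, std::string name)
      : query_(query), name_(std::move(name)) {
    query_.log.push_back(name_ + ":enter");
  }
  ~PipelineState() { query_.log.push_back(name_ + ":exit"); }

  PipelineState(const PipelineState &) = delete;
  PipelineState &operator=(const PipelineState &) = delete;

 private:
  QueryState &query_;
  std::string name_;
};

// One pipeline function: its PipelineState lives exactly as long as the call.
void RunPipeline(QueryState &query, const std::string &name) {
  PipelineState state(query, name);
  query.log.push_back(name + ":work");
}

// A query with two pipelines; returns the recorded lifetime events.
EventLog RunQuery() {
  QueryState query;
  query.init();
  RunPipeline(query, "pipeline0");
  RunPipeline(query, "pipeline1");
  query.tearDown();
  return query.log;
}
```

Calling `RunQuery()` records each pipeline's enter/work/exit events nested between the single query-level init and tearDown events, which is the ordering the split is meant to guarantee.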

Review notes:

- Most of the heavy lifting is done in pipeline.cpp. Pipelines can be run serially or in parallel.
  - This means all operators need to pass a std::function when launching pipelines.
- Many of the changed files are in the proxies, making it simpler to create them.
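
The launch pattern in the notes above can be sketched roughly as follows, as a hedged illustration rather than the code in pipeline.cpp. `PipelineBody`, `ParallelSettings`, `LaunchPipeline`, and `SumRows` are hypothetical names, and the callback signature, default threshold, and task count are assumptions. The sketch shows an operator handing its work to the launcher as a std::function, with the launcher either calling it once (serial) or forking one thread per partition and joining them all (fork-join), depending on a toggle and a minimum table size.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical callback signature: process rows in [begin, end) as task `task`.
using PipelineBody =
    std::function<void(std::size_t task, std::size_t begin, std::size_t end)>;

// Hypothetical settings mirroring the toggle and size threshold in the PR text.
struct ParallelSettings {
  bool enabled = true;          // toggle for parallel execution
  std::size_t min_rows = 1000;  // smaller tables run serially (assumed value)
  std::size_t num_tasks = 4;    // number of forked tasks (assumed value)
};

// Runs `body` serially or in fork-join fashion; returns the number of tasks.
std::size_t LaunchPipeline(std::size_t num_rows,
                           const ParallelSettings &settings,
                           const PipelineBody &body) {
  if (!settings.enabled || num_rows < settings.min_rows ||
      settings.num_tasks <= 1) {
    body(0, 0, num_rows);  // serial path: one task covers every row
    return 1;
  }
  const std::size_t tasks = settings.num_tasks;
  const std::size_t chunk = (num_rows + tasks - 1) / tasks;  // ceiling division
  std::vector<std::thread> workers;
  workers.reserve(tasks);
  // Fork: one thread per contiguous partition of the input.
  for (std::size_t t = 0; t < tasks; ++t) {
    const std::size_t begin = std::min(t * chunk, num_rows);
    const std::size_t end = std::min(begin + chunk, num_rows);
    workers.emplace_back([&body, t, begin, end] { body(t, begin, end); });
  }
  // Join: the pipeline is finished only when every task has finished.
  for (auto &worker : workers) worker.join();
  return tasks;
}

// Example operator: sums the row ids 0..num_rows-1. Each task writes only to
// its own slot, so the tasks need no locking.
long long SumRows(std::size_t num_rows, const ParallelSettings &settings) {
  std::vector<long long> partial(std::max<std::size_t>(settings.num_tasks, 1), 0);
  LaunchPipeline(num_rows, settings,
                 [&partial](std::size_t task, std::size_t begin, std::size_t end) {
                   for (std::size_t i = begin; i < end; ++i) {
                     partial[task] += static_cast<long long>(i);
                   }
                 });
  return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

Because the operator only supplies the callback, the same operator code serves both paths: `SumRows(10, settings)` falls below the assumed threshold and runs as a single task, while a larger input is split across `num_tasks` threads.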

TODO:

- ~~Remove std::mutex from buffer output~~ (This will be another PR)
- ~~Enable parallel aggregations~~ (This will be another PR when parallel agg team is done)
- ~~Add tests for memory leak~~ (Done)

coveralls commented 6 years ago

Coverage decreased (-0.2%) to 77.406% when pulling 1a70093a0eeccbe50edd2a92ad17c066c97a4a34 on pmenon:mt into 65915234edf5c33acda078f632012887291d22b7 on cmu-db:master.

pmenon commented 6 years ago

@tcm-marcel Want to get started on this review? It's quite big, so I don't want to leave it un-merged for too long.

pmenon commented 6 years ago

@tcm-marcel A small final change, then it's good to go.

cmu-db / peloton

Parallel query execution #1304
