koralium / flowtide

Streaming integration engine
https://koralium.github.io/flowtide/
Apache License 2.0

Investigate changing to column store #486

Status: Closed (Ulimo closed this issue 1 week ago)

Ulimo commented 2 months ago

We should investigate whether changing to a column store gives better performance.

The change would be a major refactor, but it would be worth it if the performance benefits are large enough.
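For readers unfamiliar with the term, here is a minimal sketch of the difference between a row layout and a column layout. It is not Flowtide's implementation. The schema (`id`, `name`, `amount`) and the helpers `rows_to_columns` and `matching_indices` are hypothetical and exist only for illustration.

```python
from typing import Dict, List, Tuple

# Row layout: each event is stored as one tuple.
# Hypothetical schema for illustration: (id, name, amount).
rows: List[Tuple[int, str, float]] = [
    (1, "a", 10.0),
    (2, "b", 20.0),
    (3, "c", 30.0),
]


def rows_to_columns(rows: List[Tuple[int, str, float]]) -> Dict[str, list]:
    """Transpose row tuples into one list per column."""
    if not rows:
        return {"id": [], "name": [], "amount": []}
    ids, names, amounts = zip(*rows)
    return {"id": list(ids), "name": list(names), "amount": list(amounts)}


def matching_indices(key_column: List[int], key: int) -> List[int]:
    """Scan only the key column, as a join probe might, and return row indices."""
    return [i for i, value in enumerate(key_column) if value == key]


columns = rows_to_columns(rows)

# The probe touches only the "id" column; other columns are read
# afterwards, and only for the rows that matched.
hits = matching_indices(columns["id"], 2)
matched_names = [columns["name"][i] for i in hits]
```

In the column layout, an operation that only needs the join key reads one list and leaves the other columns untouched, which is the usual argument for better cache behavior. Whether that carries over to this engine is what the investigation is meant to find out.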

Ulimo commented 2 months ago

The following benchmarks were produced while investigating the column store:

Before:

| Method   | Mean     | Error    | StdDev   |
|--------- |---------:|---------:|---------:|
| LeftJoin | 944.8 ms | 77.82 ms | 46.31 ms |

With column store, left join:

| Method   | Mean     | Error    | StdDev   |
|--------- |---------:|---------:|---------:|
| LeftJoin | 552.4 ms | 36.64 ms | 21.80 ms |

In this benchmark, only the join code has been replaced with the column store; the normalization operator and the read operator have not. It should therefore be possible to reduce the times further.
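To put the two tables side by side, the improvement can be worked out from the reported means. This is a small sketch using only the figures above. It assumes the `Error` column is the half-width of a confidence interval around the mean, as BenchmarkDotNet normally reports it. The variable names are illustrative.

```python
# Mean and Error values copied from the two LeftJoin benchmark tables.
before_mean_ms, before_error_ms = 944.8, 77.82
after_mean_ms, after_error_ms = 552.4, 36.64

# Relative improvement of the column store run over the original run.
speedup = before_mean_ms / after_mean_ms
reduction_pct = (before_mean_ms - after_mean_ms) / before_mean_ms * 100

# Assumption: Error is the half-width of the confidence interval,
# so mean +/- error gives the interval bounds.
before_low_ms = before_mean_ms - before_error_ms
after_high_ms = after_mean_ms + after_error_ms
intervals_overlap = after_high_ms >= before_low_ms

print(f"speedup: {speedup:.2f}x, reduction: {reduction_pct:.1f}%")
print(f"intervals overlap: {intervals_overlap}")
```

With these numbers the column store run is about 1.71x faster (a reduction of roughly 41.5% in mean time), and the two intervals do not overlap.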