risinglightdb / risinglight

An educational OLAP database system.
Apache License 2.0
1.61k stars 214 forks source link

refactor(storage): use dynamic trait for storage interfaces #846

Closed wangrunji0408 closed 4 months ago

wangrunji0408 commented 6 months ago

Signed-off-by: Runji Wang wangrunji0408@163.com

Dynamic async traits are not so bad for performance, as long as they are not called frequently. The advantage is that they are less invasive as they do not require generic parameters everywhere, and they allow for extensible plugins in the future.

This PR uses #[async_trait] to refactor storage traits into dyn traits. It removes <S: Storage> from executors. The only performance critical part is the TxnIterator::next_batch which is called frequently in scanning. We refactor it into a BoxStream to eliminate boxing the future on each call.

Performance comparison on TPCH |tpch|time| |----|------| |q1| +5.9%| |q17| +5.6%| |q18| +5.4%| |q13| +4.1%| |q11| +3.9%| |q14| +3.9%| |q7| +3.7%| |q4| +3.2%| |q3| +3.1%| |q10| +2.2%| |q9| +1.8%| |q5| +1.4%| |q15| +1.2%| |q16| +1.2%| |q19| +1.2%| |q6| +0.4%| |q12| -0.4%| |q20| -0.6%| |q22| -2.5%| |q2| -2.9%| |q8| -2.9%| |q21| -5.4%|
TennyZhuang commented 6 months ago

Weakly -1

The current dynamic traits have too many limitations, making them seem like unfinished products.

And pass generic parameters are not really very painful.

wangrunji0408 commented 6 months ago

also objected by @mrcroxx

MrCroxx commented 6 months ago

also objected by @MrCroxx

The #[async_trait] allocates a boxed future on each call. The overhead can be high for memory-only path. e.g. cache hits.

skyzh commented 4 months ago

Given that the storage part of the system runs in batches (i.e., we always fetch as many as 1024 rows if possible), I don't think using a dynamic trait would cause a large regression. However, I don't see the code is simplified significantly without static dispatch, except that the S: Storage thing gets removed. Therefore, I'm kind of in the middle of approving or turning down this pull request :(