apache / doris

Apache Doris is an easy-to-use, high performance and unified analytics database.
https://doris.apache.org
Apache License 2.0

[Refactor] Refactor IO stack #9122

Open platoneko opened 2 years ago

platoneko commented 2 years ago

Description

For more details see https://cwiki.apache.org/confluence/display/DORIS/DSIP-006%3A+Refactor+IO+stack

Use case

No response

Related issues

No response

platoneko commented 2 years ago

Currently, the IO-related code in Doris has the following dependencies:

Rowset -> BlockManager -> ReadableBlock/WritableBlock -> Env
Scanner -> FileReader
FileResultWriter -> FileWriter

Expected dependencies:

Rowset -> FileSystem -> ReadStream/WriteStream (different Rowsets may use different FileSystem backends)
Scanner -> ReadStream
FileResultWriter -> WriteStream

FileSystem provides APIs for directory and file management, and may also manage the file cache in the future. ReadStream/WriteStream may hold internal buffers and may prefetch data in parallel.
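
To make the expected layering a bit more concrete, here is a minimal C++ sketch. Only the names FileSystem, ReadStream and WriteStream come from the text above. Every method signature, the in-memory backend MemFileSystem, and the helpers write_all/read_all are assumptions made for illustration; they are not the actual Doris API or the DSIP-006 design.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stream interfaces: callers (Rowset, Scanner, FileResultWriter)
// would depend only on these, not on a concrete storage backend.
class ReadStream {
public:
    virtual ~ReadStream() = default;
    // Reads up to `len` bytes into `buf`; returns bytes read, 0 at end of file.
    virtual std::size_t read(char* buf, std::size_t len) = 0;
};

class WriteStream {
public:
    virtual ~WriteStream() = default;
    virtual void write(const char* data, std::size_t len) = 0;
    virtual void close() = 0;  // publishes buffered data to the backend
};

// Hypothetical FileSystem interface: file management plus stream creation.
class FileSystem {
public:
    virtual ~FileSystem() = default;
    virtual std::unique_ptr<ReadStream> open_read(const std::string& path) = 0;
    virtual std::unique_ptr<WriteStream> open_write(const std::string& path) = 0;
    virtual bool exists(const std::string& path) const = 0;
    virtual void remove(const std::string& path) = 0;
};

// Assumed in-memory backend, standing in for local disk or remote storage.
class MemReadStream : public ReadStream {
public:
    explicit MemReadStream(std::string data) : data_(std::move(data)) {}
    std::size_t read(char* buf, std::size_t len) override {
        assert(buf != nullptr);
        std::size_t n = std::min(len, data_.size() - pos_);
        std::memcpy(buf, data_.data() + pos_, n);
        pos_ += n;
        return n;
    }
private:
    std::string data_;  // snapshot of the file taken at open time
    std::size_t pos_ = 0;
};

class MemWriteStream : public WriteStream {
public:
    explicit MemWriteStream(std::shared_ptr<std::string> target)
        : target_(std::move(target)) {}
    void write(const char* data, std::size_t len) override {
        buffer_.append(data, len);  // buffered until close()
    }
    void close() override { *target_ = buffer_; }
private:
    std::shared_ptr<std::string> target_;
    std::string buffer_;
};

class MemFileSystem : public FileSystem {
public:
    std::unique_ptr<ReadStream> open_read(const std::string& path) override {
        auto it = files_.find(path);
        if (it == files_.end()) throw std::runtime_error("no such file: " + path);
        return std::make_unique<MemReadStream>(*it->second);
    }
    std::unique_ptr<WriteStream> open_write(const std::string& path) override {
        auto file = std::make_shared<std::string>();
        files_[path] = file;  // creates or truncates the file
        return std::make_unique<MemWriteStream>(file);
    }
    bool exists(const std::string& path) const override {
        return files_.count(path) > 0;
    }
    void remove(const std::string& path) override { files_.erase(path); }
private:
    std::map<std::string, std::shared_ptr<std::string>> files_;
};

// Caller-side helpers: they see only the abstract FileSystem.
void write_all(FileSystem& fs, const std::string& path, const std::string& data) {
    auto out = fs.open_write(path);
    out->write(data.data(), data.size());
    out->close();
}

std::string read_all(FileSystem& fs, const std::string& path) {
    auto in = fs.open_read(path);
    std::string result;
    char buf[4];  // deliberately small to exercise repeated reads
    std::size_t n;
    while ((n = in->read(buf, sizeof(buf))) > 0) result.append(buf, n);
    return result;
}
```

In this sketch a Rowset-like caller would hold a FileSystem reference and never name the backend, so swapping MemFileSystem for a local or remote implementation would not change the caller. Prefetching and a file cache are left out; under this layering they would presumably live inside the ReadStream and FileSystem implementations.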