databendlabs / databend

Data, Analytics & AI. Modern alternative to Snowflake. Cost-effective and simple for massive-scale analytics. https://databend.com
https://docs.databend.com

fix: split if block too big during append #16435

Closed zhyass closed 1 week ago

zhyass commented 2 weeks ago

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

  1. Refactor TransformCompact:

    • TransformCompact is split into two components to make compaction more modular and efficient:
      • BlockCompactBuilder: builds the compaction tasks.
      • TransformCompactBlock: executes the compaction tasks in parallel.
  2. Improve the compaction logic:

    • Compaction no longer writes excessively large data blocks.
    • Compaction now happens as early as possible in the data-writing phase.
  3. Control block size:

    • Block sizes are tuned during compaction so that output blocks are neither too small nor too large; either extreme can hurt storage and read efficiency.
  4. Replace HashMap with BTreeMap in the reclustering fetch_max_depth so that reclustering results are stable.

  5. Compact source data blocks before reclustering, for better performance and clustering.
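
To illustrate items 1 to 3, here is a minimal sketch of the builder/executor split: a planning step turns incoming block sizes into tasks, which a separate stage could then run in parallel. It is not the code in this PR. `Thresholds`, `CompactTask`, and `build_tasks` are hypothetical names, and the row-count-only thresholds are an assumption made for illustration.

```rust
/// Hypothetical size limits (assumption: rows only; real limits may also use bytes).
#[derive(Clone, Copy)]
struct Thresholds {
    min_rows: usize,
    max_rows: usize,
}

/// A unit of work planned by the builder and executed later, possibly in parallel.
#[derive(Debug, PartialEq)]
enum CompactTask {
    /// Merge these input blocks (by index) into one output block.
    Concat(Vec<usize>),
    /// Cut one oversized input block into `parts` pieces.
    Split { index: usize, parts: usize },
}

/// Plan tasks from the row counts of incoming blocks, in arrival order.
fn build_tasks(block_rows: &[usize], t: Thresholds) -> Vec<CompactTask> {
    let mut tasks = Vec::new();
    let mut pending: Vec<usize> = Vec::new();
    let mut pending_rows = 0usize;

    for (index, &rows) in block_rows.iter().enumerate() {
        if rows > t.max_rows {
            // Flush whatever was accumulating, then split the oversized block.
            if !pending.is_empty() {
                tasks.push(CompactTask::Concat(std::mem::take(&mut pending)));
                pending_rows = 0;
            }
            let parts = (rows + t.max_rows - 1) / t.max_rows; // ceiling division
            tasks.push(CompactTask::Split { index, parts });
            continue;
        }
        // Adding this block would overshoot the upper limit: flush first.
        if !pending.is_empty() && pending_rows + rows > t.max_rows {
            tasks.push(CompactTask::Concat(std::mem::take(&mut pending)));
            pending_rows = 0;
        }
        pending.push(index);
        pending_rows += rows;
        // Emit as early as possible once the merged block is large enough.
        if pending_rows >= t.min_rows {
            tasks.push(CompactTask::Concat(std::mem::take(&mut pending)));
            pending_rows = 0;
        }
    }
    // Leftover small blocks still have to be written.
    if !pending.is_empty() {
        tasks.push(CompactTask::Concat(pending));
    }
    tasks
}
```

With `min_rows = 800` and `max_rows = 1000`, inputs of 300, 300, 300, 2500, and 100 rows would be planned as one merge of the first three blocks, a three-way split of the fourth, and a final pass-through merge of the last.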

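On item 4, a small sketch of why an ordered map can give stable results. The body of `fetch_max_depth` is not shown here, so `pick_deepest` and the "level to depth" map are hypothetical stand-ins. The sketch relies only on a documented property: `BTreeMap` iterates in key order, while `HashMap` iteration order is unspecified and can differ between runs, so any tie-break that depends on iteration order is only reproducible with the former.

```rust
use std::collections::BTreeMap;

/// Return the key with the largest depth; on ties, the smallest key wins.
/// (Hypothetical helper, not the PR's `fetch_max_depth`.)
fn pick_deepest(depths: &BTreeMap<u64, usize>) -> Option<u64> {
    let mut best: Option<(u64, usize)> = None;
    // BTreeMap yields entries in ascending key order on every run.
    for (&level, &depth) in depths {
        let is_better = match best {
            Some((_, d)) => depth > d, // strict `>` keeps the first key seen on ties
            None => true,
        };
        if is_better {
            best = Some((level, depth));
        }
    }
    best.map(|(level, _)| level)
}
```

The same loop over a `HashMap` could return either of two tied keys depending on the run, which is one way reclustering could pick different targets for identical input.
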
Tests

Type of change



github-actions[bot] commented 2 weeks ago

At least one test kind must be checked in the PR description. @zhyass please update it 🙏.

github-actions[bot] commented 2 weeks ago

Docker Image for PR

note: this image tag is only available for internal use, please check the internal doc for more details.

github-actions[bot] commented 2 weeks ago

ClickBench Report
