GreptimeTeam / greptimedb

An open-source, cloud-native, unified time series database for metrics, logs and events with SQL/PromQL supported. Available on GreptimeCloud.
https://greptime.com/
Apache License 2.0

feat(fulltext_index): integrate puffin manager with inverted index applier #4266

Closed zhongzc closed 3 months ago

zhongzc commented 3 months ago

I hereby agree to the terms of the GreptimeDB CLA.

Refer to a related PR or issue link (optional)

#4246

What's changed and what's your intention?

Checklist

Summary by CodeRabbit

coderabbitai[bot] commented 3 months ago

[!NOTE]

Reviews paused

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

The changes extend index management in the Mito engine with new configuration options for the staging path, staging size, and intermediate path. They also deprecate the intermediate_path option of inverted_index, synchronize the related initialization processes, and introduce a PuffinManagerFactory for index handling and error logging.
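As a rough illustration of how a new index section and a deprecated option can coexist, here is a minimal sketch. The struct layouts, the `sanitize` method, the `index_intermediate` default directory, and the fallback order are all assumptions made for this example; only the option names `staging_path`, `staging_size`, and `intermediate_path` come from the walkthrough, and the PR's actual code may differ.

```rust
/// Hypothetical index settings. Field names follow the walkthrough;
/// types and defaults are assumptions for illustration only.
#[derive(Debug, Clone, Default, PartialEq)]
struct IndexConfig {
    staging_path: String,
    staging_size_bytes: u64,
    intermediate_path: String,
}

/// Hypothetical legacy section that still carries the deprecated option.
#[derive(Debug, Clone, Default)]
struct InvertedIndexConfig {
    /// Deprecated. An empty string means "not set".
    intermediate_path: String,
}

impl IndexConfig {
    /// Fill an unset intermediate path: reuse the deprecated value if one
    /// was given, otherwise fall back to a directory under `data_home`.
    /// This precedence is an assumption, not the documented behavior.
    fn sanitize(&mut self, data_home: &str, legacy: &InvertedIndexConfig) {
        if self.intermediate_path.is_empty() {
            self.intermediate_path = if legacy.intermediate_path.is_empty() {
                format!("{}/index_intermediate", data_home.trim_end_matches('/'))
            } else {
                legacy.intermediate_path.clone()
            };
        }
    }
}

fn main() {
    let legacy = InvertedIndexConfig {
        intermediate_path: "/old/intermediate".to_string(),
    };
    let mut cfg = IndexConfig::default();
    cfg.sanitize("/var/lib/db/", &legacy);
    // The deprecated value is honored because the new option was left empty.
    println!("{}", cfg.intermediate_path);
}
```

Keeping the fallback in one place means callers only ever read the new option, which is one common way to retire a deprecated setting gradually.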

Changes

| Files | Change Summaries |
| --- | --- |
| `config/datanode.example.toml` | Added options for index management in the Mito engine: `staging_path`, `staging_size`, and `intermediate_path`. Deprecated the `intermediate_path` option of `inverted_index`. |
| `config/standalone.example.toml` | Similar to the datanode configuration: introduced new index options and deprecated older paths. |
| `config/config.md` | Added documentation for new index configurations and marked certain paths as deprecated. |
| `src/mito2/src/access_layer.rs` | Added `PuffinManagerFactory` to manage puffin-related operations; updated struct definitions and logic for explicit error handling. |
| `src/mito2/src/cache/file_cache.rs` | Added `#[allow(unused)]` attribute to suppress potential unused-code warnings. |
| `src/mito2/src/compaction/compactor.rs` | Incorporated `PuffinManagerFactory`; adjusted configuration usages to align with new index paths. |
| `src/mito2/src/config.rs` | Introduced `IndexConfig` structure for managing index settings; refactored configurations to align with new paths. |
| `src/mito2/src/error.rs` | Removed `PuffinBlobTypeNotFound` error; added errors for `PuffinInitStager` and `PuffinBuildReader`. |
| `src/mito2/src/flush.rs` | Simplified how `index_write_buffer_size` is assigned. |
| `src/mito2/src/read/scan_region.rs` | Modified `ScanRegion` initialization to include `puffin_manager_factory`. |
| `src/mito2/src/region/opener.rs` | Added `PuffinManagerFactory` within `RegionOpener`; updated related methods and initialization processes. |
| `src/mito2/src/sst/file_purger.rs` | Updated file purger logic to incorporate changes with `PuffinManagerFactory`. |
| `src/mito2/src/sst/index.rs` | Added `puffin_manager` module for managing index operations. |
| `src/mito2/src/sst/index/applier.rs` | Modified to use `PuffinManager` for index handling; adjusted error and file reading logic accordingly. |
| `src/mito2/src/sst/index/applier/builder.rs` | Updated builder pattern to incorporate `PuffinManagerFactory`; added corresponding fields and methods. |
| `src/mito2/src/test_util/scheduler_util.rs` | Updated the scheduler environment setup to use `PuffinManagerFactory`. |
| `src/mito2/src/worker.rs` | Introduced `PuffinManagerFactory` for worker initialization; replaced old configurations accordingly. |
| `src/mito2/src/worker/handle_catchup.rs` | Updated `RegionWorkerLoop` to include `puffin_manager_factory`. |
| `src/mito2/src/worker/handle_create.rs` | Added `puffin_manager_factory` to `RegionWorkerLoop` constructor dependencies. |
| `src/mito2/src/worker/handle_open.rs` | Included `puffin_manager_factory` in the initialization process of `RegionWorkerLoop`. |
| `src/puffin/src/error.rs` | Added `External` variant to `Error` enum with corresponding `status_code` method handling. |
| `src/puffin/src/puffin_manager/stager.rs` | Added `FsBlobGuard` and `FsDirGuard` to public exports; introduced a new `BoundedStager` async function for directory creation. |
| `tests-integration/tests/http.rs` | Adjusted test configurations to incorporate new index settings for the Mito engine. |
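Many of the rows above describe the same move: a factory is created once and then handed to each component that needs it (worker, region opener, scan, compactor). The sketch below shows that general pattern only. `ManagerFactory`, `Manager`, `Opener`, `Scanner`, and every field and method on them are hypothetical stand-ins invented for this example; they are not the `PuffinManagerFactory` API from the PR.

```rust
use std::path::PathBuf;
use std::sync::Arc;

/// Shared settings behind the factory (hypothetical fields).
#[derive(Debug)]
struct FactoryInner {
    staging_dir: PathBuf,
    staging_capacity_bytes: u64,
}

/// Cheap-to-clone handle: cloning copies an `Arc`, not the settings.
#[derive(Debug, Clone)]
struct ManagerFactory {
    inner: Arc<FactoryInner>,
}

/// Per-region manager produced by the factory (hypothetical).
#[derive(Debug)]
struct Manager {
    region_dir: String,
    staging_dir: PathBuf,
}

impl ManagerFactory {
    fn new(staging_dir: impl Into<PathBuf>, staging_capacity_bytes: u64) -> Self {
        let inner = FactoryInner {
            staging_dir: staging_dir.into(),
            staging_capacity_bytes,
        };
        Self { inner: Arc::new(inner) }
    }

    fn staging_capacity_bytes(&self) -> u64 {
        self.inner.staging_capacity_bytes
    }

    /// Build a manager for one region, reusing the shared staging directory.
    fn build(&self, region_dir: &str) -> Manager {
        Manager {
            region_dir: region_dir.to_string(),
            staging_dir: self.inner.staging_dir.clone(),
        }
    }
}

/// Two hypothetical components that each hold their own factory handle.
struct Opener {
    factory: ManagerFactory,
}

struct Scanner {
    factory: ManagerFactory,
}

fn main() {
    let factory = ManagerFactory::new("/tmp/staging", 1 << 30);
    let opener = Opener { factory: factory.clone() };
    let scanner = Scanner { factory };

    let m1 = opener.factory.build("region_1");
    let m2 = scanner.factory.build("region_2");
    println!("{} and {} share {:?}", m1.region_dir, m2.region_dir, m2.staging_dir);
    println!("capacity: {} bytes", scanner.factory.staging_capacity_bytes());
}
```

Passing a clonable handle rather than raw paths keeps the staging settings in one place, so a later configuration change touches the factory constructor instead of every component.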

Poem

In a warren deep, our code took flight,
Indices anew, in Mito's light. 🌟
Paths and sizes staged with care,
Handling errors in software's lair.
Puffin guides our files to dance,
In this enhanced engine's advance. 🚀
CodeRabbit cheers with a joyful glance! 🐰


zhongzc commented 3 months ago

@coderabbitai pause

coderabbitai[bot] commented 3 months ago

Actions performed

Reviews paused.

codecov[bot] commented 3 months ago

Codecov Report

Attention: Patch coverage is 93.52332% with 25 lines in your changes missing coverage. Please review.

Project coverage is 84.68%. Comparing base (0f4b9e5) to head (0982167). Report is 7 commits behind head on main.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main    #4266      +/-   ##
==========================================
- Coverage   84.97%   84.68%   -0.29%
==========================================
  Files        1054     1058       +4
  Lines      187160   187758     +598
==========================================
- Hits       159031   158996      -35
- Misses      28129    28762     +633
```

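The percentages in the report follow from the hit and line counts. A small sketch of that arithmetic, assuming coverage is simply hits divided by lines (the helper name `coverage_percent` is made up for this example and is not part of Codecov):

```rust
/// Coverage as a percentage: hits / lines * 100.
/// Returns 0.0 for an empty line count to avoid dividing by zero.
fn coverage_percent(hits: u64, lines: u64) -> f64 {
    if lines == 0 {
        return 0.0;
    }
    hits as f64 * 100.0 / lines as f64
}

fn main() {
    // Counts copied from the coverage diff above.
    let base = coverage_percent(159_031, 187_160);
    let head = coverage_percent(158_996, 187_758);
    // Prints: base 84.97%  head 84.68%  delta -0.29%
    println!(
        "base {:.2}%  head {:.2}%  delta {:+.2}%",
        base,
        head,
        head - base
    );
}
```

The drop comes from the line count growing by 598 while hits fell by 35, so the ratio falls even though the patch itself is mostly covered.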
killme2008 commented 3 months ago

What about aux_path? @zhongzc auxiliary_path is so long!