Qbeast-io / qbeast-spark

Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary!
https://qbeast.io/qbeast-our-tech/
Apache License 2.0
202 stars, 18 forks

Issue 297: Reduce overhead for CubeDomainsBuilder instantiation #298

Closed: Jiaweihu08 closed this 2 months ago

Jiaweihu08 commented 3 months ago

This PR fixes #297 by:

  1. Passing the required objects directly instead of the whole IndexStatus, since the overhead comes from CubeStatus.
  2. Using Broadcast so that these objects persist on each worker and do not have to be sent with every task.
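
For readers unfamiliar with the pattern, the two points above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `SketchCubeStatus`, `SketchIndexStatus`, `requiredWeights`, and `countAccepted` are hypothetical stand-ins with simplified fields, and the filter is only a placeholder for whatever the tasks actually compute. The only real API used is Spark's `SparkContext.broadcast` and `Broadcast.value`/`destroy`.

```scala
import org.apache.spark.broadcast.Broadcast
import org.apache.spark.sql.SparkSession

// Hypothetical, simplified stand-ins; NOT the actual qbeast-spark classes.
final case class SketchCubeStatus(cubeId: String, maxWeight: Double, files: Vector[String])
final case class SketchIndexStatus(revisionId: Long, cubes: Map[String, SketchCubeStatus])

object BroadcastSketch {

  // Point 1: extract only what the tasks need (cube id -> max weight),
  // leaving the heavier per-cube payload (here, the file lists) on the driver.
  def requiredWeights(status: SketchIndexStatus): Map[String, Double] =
    status.cubes.map { case (id, cube) => id -> cube.maxWeight }

  // Point 2: broadcast the small map once so each executor caches it,
  // instead of serializing it into every task closure.
  def countAccepted(
      spark: SparkSession,
      status: SketchIndexStatus,
      rows: Seq[(String, Double)]): Long = {
    val weights: Broadcast[Map[String, Double]] =
      spark.sparkContext.broadcast(requiredWeights(status))
    try {
      spark.sparkContext
        .parallelize(rows)
        // Placeholder logic: keep rows whose weight fits under the cube's max weight.
        .filter { case (cubeId, w) => weights.value.get(cubeId).exists(w <= _) }
        .count()
    } finally weights.destroy() // release the broadcast once the job is done
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("broadcast-sketch")
      .getOrCreate()
    try {
      val status = SketchIndexStatus(
        1L,
        Map(
          "root" -> SketchCubeStatus("root", 0.5, Vector("f1", "f2")),
          "A" -> SketchCubeStatus("A", 0.9, Vector("f3"))))
      val rows = Seq(("root", 0.4), ("root", 0.7), ("A", 0.8), ("B", 0.1))
      println(countAccepted(spark, status, rows)) // prints 2 for this sample data
    } finally spark.stop()
  }
}
```

The design choice being illustrated is that the closure captures only the `Broadcast` handle, so the per-task payload stays small regardless of how large the original status object is.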

codecov[bot] commented 3 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 90.74%. Comparing base (065a6b2) to head (b0a65a8).

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #298      +/-   ##
==========================================
- Coverage   90.76%   90.74%   -0.03%
==========================================
  Files          98       98
  Lines        2599     2592       -7
  Branches      346      341       -5
==========================================
- Hits         2359     2352       -7
  Misses        240      240
```


osopardo1 commented 2 months ago

It seems that the code change does not affect the structure of the index in larger appends.

When @fpj adds his review, I will wrap up and prepare the merge.