vmware-archive / database-stream-processor

Streaming and Incremental Computation Framework
Other
222 stars 20 forks source link

Reversible cursors #387

Closed ryzhyk closed 1 year ago

ryzhyk commented 1 year ago

We generalize cursor traits to iterate and search in both directions.

This is a step toward introducing support for SQL operators that manipulate ordered groups of rows, including ORDER BY ASC/DESC, RANK, ROW_NUMBER, top-k, etc. These operators require the ability to enumerate rows in both ascending and descending orders. Our cursors did not support that as they only allowed iterating and seeking forward.

This commit extends cursor traits with methods to iterate and seek backward. A cursor is always created pointing to the first record. In order to iterate backward, the client must first move the cursor to the end of the range using the new fast_forward() method.

For batch and trace cursors we allow reverse iteration by both keys and values.

So far the new infrastructure is only used to implement the Max aggregate more efficiently. Upcoming commits will make a more extensive use of it.

codecov[bot] commented 1 year ago

Codecov Report

Merging #387 (da13020) into main (34ea1f3) will increase coverage by 0.31%. The diff coverage is 77.47%.

Additional details and impacted files [![Impacted file tree graph](https://codecov.io/gh/vmware/database-stream-processor/pull/387/graphs/tree.svg?width=650&height=150&src=pr&token=0wZcmD11gt&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=vmware)](https://codecov.io/gh/vmware/database-stream-processor/pull/387?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=vmware) ```diff @@ Coverage Diff @@ ## main #387 +/- ## ========================================== + Coverage 73.85% 74.16% +0.31% ========================================== Files 238 237 -1 Lines 51586 52295 +709 ========================================== + Hits 38097 38786 +689 - Misses 13489 13509 +20 ``` | [Impacted Files](https://codecov.io/gh/vmware/database-stream-processor/pull/387?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=vmware) | Coverage Δ | | |---|---|---| | [...rates/dbsp/src/operator/time\_series/partitioned.rs](https://codecov.io/gh/vmware/database-stream-processor/pull/387?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=vmware#diff-Y3JhdGVzL2Ric3Avc3JjL29wZXJhdG9yL3RpbWVfc2VyaWVzL3BhcnRpdGlvbmVkLnJz) | `49.50% <0.00%> (-20.92%)` | :arrow_down: | | [crates/dbsp/src/operator/time\_series/range.rs](https://codecov.io/gh/vmware/database-stream-processor/pull/387?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=vmware#diff-Y3JhdGVzL2Ric3Avc3JjL29wZXJhdG9yL3RpbWVfc2VyaWVzL3JhbmdlLnJz) | `70.83% <0.00%> (-7.63%)` | :arrow_down: | | [crates/dbsp/src/trace/cursor/mod.rs](https://codecov.io/gh/vmware/database-stream-processor/pull/387?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=vmware#diff-Y3JhdGVzL2Ric3Avc3JjL3RyYWNlL2N1cnNvci9tb2QucnM=) | `54.09% <ø> (+8.89%)` | :arrow_up: | | [crates/dbsp/src/trace/layers/erased/cursor.rs](https://codecov.io/gh/vmware/database-stream-processor/pull/387?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=vmware#diff-Y3JhdGVzL2Ric3Avc3JjL3RyYWNlL2xheWVycy9lcmFzZWQvY3Vyc29yLnJz) | `0.00% <0.00%> (ø)` | | | [crates/dbsp/src/trace/layers/mod.rs](https://codecov.io/gh/vmware/database-stream-processor/pull/387?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=vmware#diff-Y3JhdGVzL2Ric3Avc3JjL3RyYWNlL2xheWVycy9tb2QucnM=) | `34.88% <0.00%> (ø)` | | | [crates/dbsp/src/trace/layers/unordered.rs](https://codecov.io/gh/vmware/database-stream-processor/pull/387?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=vmware#diff-Y3JhdGVzL2Ric3Avc3JjL3RyYWNlL2xheWVycy91bm9yZGVyZWQucnM=) | `0.00% <0.00%> (ø)` | | | [crates/dbsp/src/trace/mod.rs](https://codecov.io/gh/vmware/database-stream-processor/pull/387?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=vmware#diff-Y3JhdGVzL2Ric3Avc3JjL3RyYWNlL21vZC5ycw==) | `81.81% <ø> (+6.81%)` | :arrow_up: | | [...me\_series/radix\_tree/partitioned\_tree\_aggregate.rs](https://codecov.io/gh/vmware/database-stream-processor/pull/387?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=vmware#diff-Y3JhdGVzL2Ric3Avc3JjL29wZXJhdG9yL3RpbWVfc2VyaWVzL3JhZGl4X3RyZWUvcGFydGl0aW9uZWRfdHJlZV9hZ2dyZWdhdGUucnM=) | `89.90% <13.33%> (-1.52%)` | :arrow_down: | | [crates/dbsp/src/trace/cursor/cursor\_group.rs](https://codecov.io/gh/vmware/database-stream-processor/pull/387?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=vmware#diff-Y3JhdGVzL2Ric3Avc3JjL3RyYWNlL2N1cnNvci9jdXJzb3JfZ3JvdXAucnM=) | `43.68% <32.00%> (-3.88%)` | :arrow_down: | | [crates/dbsp/src/trace/ord/key\_batch.rs](https://codecov.io/gh/vmware/database-stream-processor/pull/387?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=vmware#diff-Y3JhdGVzL2Ric3Avc3JjL3RyYWNlL29yZC9rZXlfYmF0Y2gucnM=) | `42.47% <53.84%> (+1.38%)` | :arrow_up: | | ... and [20 more](https://codecov.io/gh/vmware/database-stream-processor/pull/387?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=vmware) | | ... and [4 files with indirect coverage changes](https://codecov.io/gh/vmware/database-stream-processor/pull/387/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=vmware)
github-actions[bot] commented 1 year ago

Benchmark results

Nexmark

name main~57 [kOp/s] PR [kOp/s] Tput change [%] Assessment Peak RSS diff
q0 5262.63 5424.78 3 :heavy_check_mark: 100.2 MB
q1 5429.6 5315.27 -2 :heavy_check_mark: 60.7 MB
q2 5361.6 5410.62 1 :heavy_check_mark: 30.9 MB
q3 5321.94 5400.76 1 :heavy_check_mark: 27.2 MB
q4 4802.55 4679.08 -3 :heavy_check_mark: -184.3 MB
q5 5378.33 5382.24 0 :heavy_check_mark: -184.3 MB
q6 4880.13 4869.46 0 :heavy_check_mark: -441.5 MB
q7 2559.31 3590.18 40 :evergreen_tree: -5.0 GB
q8 5161.2 5071.04 -2 :heavy_check_mark: -5.0 GB
q9 875.448 886.936 1 :heavy_check_mark: 823.2 MB
q12 4910.28 5015.09 2 :heavy_check_mark: 823.2 MB
q13 3645.17 3568.85 -2 :heavy_check_mark: 823.2 MB
q14 5481.43 5183.27 -5 :small_red_triangle_down: 823.2 MB
q15 4895.11 5260.02 7 :evergreen_tree: 823.2 MB
q16 1045.16 1037.27 -1 :heavy_check_mark: 823.2 MB
q17 3134.56 3238.35 3 :heavy_check_mark: -174.2 MB
q18 1440.13 1422.96 -1 :heavy_check_mark: 448.5 MB
q19 1435.15 1419.73 -1 :heavy_check_mark: 448.5 MB
q20 1500.83 1602.24 7 :evergreen_tree: 448.5 MB
q21 5117.36 5199.05 2 :heavy_check_mark: 448.5 MB
q22 5356.59 5206.13 -3 :heavy_check_mark: 448.5 MB

Galen

name main~57 [s] PR [s] Runtime change [%] Assessment
galen 28.6952 28.9934 1 :heavy_check_mark:

LDBC

algorithm dataset threads main~57 [kEVPS] PR [kEVPS] Tput change [%] Assessment Peak RSS diff
bfs graph500-22 1 1835.7 1655.34 -10 :small_red_triangle_down: 139.3 kB
bfs datagen-8_4-fb 6 8173.61 7431.98 -9 :small_red_triangle_down: 19.4 MB
pagerank graph500-22 1 681.655 685.537 1 :heavy_check_mark: -139.3 kB
pagerank datagen-8_4-fb 6 1994.79 1924.24 -4 :heavy_check_mark: -139.8 MB

Nexmark (with Persistence)

name main~57 [kOp/s] PR [kOp/s] Tput change [%] PR DRAM [kOp/s] DRAM diff [%] Assessment
q0 2399.23 2438.81 2 2347.31 4 :heavy_check_mark:
q1 1713.93 1684.98 -2 1696.11 -1 :heavy_check_mark:
q2 2407.67 2442.19 1 2340.39 4 :heavy_check_mark:
q3 2036.93 2070.01 2 2229.78 -7 :heavy_check_mark:
q4 360.256 367.827 2 1400.79 -74 :heavy_check_mark:
q5 2015.3 2059.29 2 2236.53 -8 :heavy_check_mark:
q6 332.421 338.097 2 1350.92 -75 :heavy_check_mark:
q7 639.584 642.884 1 1300.07 -51 :heavy_check_mark:
q8 2171.19 2280.55 5 2212.41 3 :evergreen_tree:
q9 79.0764 80.3211 2 379.851 -79 :heavy_check_mark:
q12 838.481 859.302 2 1761.01 -51 :heavy_check_mark:
q13 443.491 439.595 -1 984.5 -55 :heavy_check_mark:
q14 1697.52 1701.71 0 1680.33 1 :heavy_check_mark:
q15 198.112 198.451 0 1190.13 -83 :heavy_check_mark:
q16 25.5771 25.6591 0 283.355 -91 :heavy_check_mark:
q17 81.4521 80.8681 -1 801.533 -90 :heavy_check_mark:
q18 124.763 124.664 0 805.042 -85 :heavy_check_mark:
q19 182.489 181.938 0 645.582 -72 :heavy_check_mark:
q20 505.761 502.819 -1 935.298 -46 :heavy_check_mark:
q21 1489.8 1517.74 2 1485.02 2 :heavy_check_mark:
q22 2102.89 2107.33 0 2042.01 3 :heavy_check_mark: