dolthub / dolt

Dolt – Git for Data
Apache License 2.0

[prolly] Float keyRange increment bug #8025

Closed max-hoffman closed 3 months ago

max-hoffman commented 3 months ago

Building the stop key by incrementing, so that the lookup range is [n, n+1), is a lot faster than a binary search with a tuple comparison callback. But it is subject to at least two edge cases where n+1 is not a valid stop bound: (1) n+1 == n, because of precision loss, and (2) n+1 < n, because of overflow.

I added a series of GMS tests here: https://github.com/dolthub/go-mysql-server/pull/2554. I couldn't find a DECIMAL failure case; as far as I can tell, DECIMAL always encodes a valid ones place and is not subject to overflow.

coffeegoddd commented 3 months ago

@max-hoffman DOLT

| comparing_percentages |
| --- |
| 100.000000 to 100.000000 |

| version | result | total |
| --- | --- | --- |
| 7fa23ad | ok | 5937457 |

| version | total_tests |
| --- | --- |
| 7fa23ad | 5937457 |

| correctness_percentage |
| --- |
| 100.0 |

github-actions[bot] commented 3 months ago

Additional work is required for integration with DoltgreSQL.

coffeegoddd commented 3 months ago

@max-hoffman DOLT

| comparing_percentages |
| --- |
| 100.000000 to 100.000000 |

| version | result | total |
| --- | --- | --- |
| de5000c | ok | 5937457 |

| version | total_tests |
| --- | --- |
| de5000c | 5937457 |

| correctness_percentage |
| --- |
| 100.0 |

coffeegoddd commented 3 months ago

@max-hoffman DOLT

| comparing_percentages |
| --- |
| 100.000000 to 100.000000 |

| version | result | total |
| --- | --- | --- |
| 2a8d25b | ok | 5937457 |

| version | total_tests |
| --- | --- |
| 2a8d25b | 5937457 |

| correctness_percentage |
| --- |
| 100.0 |

coffeegoddd commented 3 months ago

@coffeegoddd DOLT

| comparing_percentages |
| --- |
| 100.000000 to 100.000000 |

| version | result | total |
| --- | --- | --- |
| 8b3f6fc | ok | 5937457 |

| version | total_tests |
| --- | --- |
| 8b3f6fc | 5937457 |

| correctness_percentage |
| --- |
| 100.0 |

coffeegoddd commented 3 months ago

@max-hoffman DOLT

| comparing_percentages |
| --- |
| 100.000000 to 100.000000 |

| version | result | total |
| --- | --- | --- |
| 731b74b | ok | 5937457 |

| version | total_tests |
| --- | --- |
| 731b74b | 5937457 |

| correctness_percentage |
| --- |
| 100.0 |

coffeegoddd commented 3 months ago

@max-hoffman DOLT

| comparing_percentages |
| --- |
| 100.000000 to 100.000000 |

| version | result | total |
| --- | --- | --- |
| 3124cf1 | ok | 5937457 |

| version | total_tests |
| --- | --- |
| 3124cf1 | 5937457 |

| correctness_percentage |
| --- |
| 100.0 |

github-actions[bot] commented 3 months ago

@coffeegoddd DOLT

| test_name | detail | row_cnt | sorted | mysql_time | sql_mult | cli_mult |
| --- | --- | --- | --- | --- | --- | --- |
| batching | LOAD DATA | 10000 | 1 | 0.05 | 1.6 | |
| batching | batch sql | 10000 | 1 | 0.07 | 2 | |
| batching | by line sql | 10000 | 1 | 0.08 | 1.75 | |
| blob | 1 blob | 200000 | 1 | 0.86 | 4.03 | 3.81 |
| blob | 2 blobs | 200000 | 1 | 0.88 | 4.47 | 4.44 |
| blob | no blob | 200000 | 1 | 0.85 | 2.55 | 2.18 |
| col type | datetime | 200000 | 1 | 0.8 | 3.03 | 2.85 |
| col type | varchar | 200000 | 1 | 0.68 | 3.41 | 2.81 |
| config width | 2 cols | 200000 | 1 | 0.78 | 2.9 | 2.17 |
| config width | 32 cols | 200000 | 1 | 1.83 | 1.97 | 2.66 |
| config width | 8 cols | 200000 | 1 | 0.93 | 2.47 | 2.34 |
| pk type | float | 200000 | 1 | 0.85 | 2.36 | 2.04 |
| pk type | int | 200000 | 1 | 0.77 | 2.55 | 2.21 |
| pk type | varchar | 200000 | 1 | 1.52 | 1.66 | 1.44 |
| row count | 1.6mm | 1600000 | 1 | 5.56 | 2.95 | 2.55 |
| row count | 400k | 400000 | 1 | 1.4 | 2.9 | 2.44 |
| row count | 800k | 800000 | 1 | 2.81 | 2.9 | 2.48 |
| secondary index | four index | 200000 | 1 | 3.56 | 1.35 | 1.05 |
| secondary index | no secondary | 200000 | 1 | 0.88 | 2.48 | 2.1 |
| secondary index | one index | 200000 | 1 | 1.07 | 2.51 | 2.21 |
| secondary index | two index | 200000 | 1 | 1.91 | 1.81 | 1.49 |
| sorting | shuffled 1mm | 1000000 | 0 | 4.93 | 2.89 | 2.63 |
| sorting | sorted 1mm | 1000000 | 1 | 4.92 | 2.94 | 2.57 |

github-actions[bot] commented 3 months ago

@coffeegoddd DOLT

| name | detail | mean_mult |
| --- | --- | --- |
| dolt_blame_basic | system table | 1.27 |
| dolt_blame_commit_filter | system table | 3.36 |
| dolt_commit_ancestors_commit_filter | system table | 0.83 |
| dolt_commits_commit_filter | system table | 0.91 |
| dolt_diff_log_join_from_commit | system table | 2.07 |
| dolt_diff_log_join_to_commit | system table | 2.11 |
| dolt_diff_table_from_commit_filter | system table | 1.12 |
| dolt_diff_table_to_commit_filter | system table | 1.12 |
| dolt_diffs_commit_filter | system table | 1 |
| dolt_history_commit_filter | system table | 1.22 |
| dolt_log_commit_filter | system table | 0.94 |

github-actions[bot] commented 3 months ago

@coffeegoddd DOLT

| name | add_cnt | delete_cnt | update_cnt | latency |
| --- | --- | --- | --- | --- |
| adds_only | 60000 | 0 | 0 | 0.71 |
| adds_updates_deletes | 60000 | 60000 | 60000 | 3.72 |
| deletes_only | 0 | 60000 | 0 | 1.84 |
| updates_only | 0 | 0 | 60000 | 2.41 |

github-actions[bot] commented 3 months ago

@coffeegoddd DOLT

| test_name | detail | row_cnt | sorted | mysql_time | sql_mult | cli_mult |
| --- | --- | --- | --- | --- | --- | --- |
| batching | LOAD DATA | 10000 | 1 | 0.05 | 1.8 | |
| batching | batch sql | 10000 | 1 | 0.07 | 1.86 | |
| batching | by line sql | 10000 | 1 | 0.07 | 1.86 | |
| blob | 1 blob | 200000 | 1 | 0.88 | 3.9 | 3.9 |
| blob | 2 blobs | 200000 | 1 | 0.84 | 4.6 | 4.62 |
| blob | no blob | 200000 | 1 | 0.86 | 2.49 | 2.16 |
| col type | datetime | 200000 | 1 | 0.8 | 3.13 | 2.9 |
| col type | varchar | 200000 | 1 | 0.67 | 3.6 | 3.3 |
| config width | 2 cols | 200000 | 1 | 0.8 | 2.66 | 2.15 |
| config width | 32 cols | 200000 | 1 | 1.83 | 2.01 | 2.56 |
| config width | 8 cols | 200000 | 1 | 0.93 | 2.52 | 2.46 |
| pk type | float | 200000 | 1 | 0.83 | 2.42 | 2.08 |
| pk type | int | 200000 | 1 | 0.75 | 2.63 | 2.29 |
| pk type | varchar | 200000 | 1 | 1.55 | 1.67 | 1.43 |
| row count | 1.6mm | 1600000 | 1 | 5.55 | 2.95 | 2.56 |
| row count | 400k | 400000 | 1 | 1.43 | 2.81 | 2.38 |
| row count | 800k | 800000 | 1 | 2.8 | 2.92 | 2.51 |
| secondary index | four index | 200000 | 1 | 3.51 | 1.38 | 1.09 |
| secondary index | no secondary | 200000 | 1 | 0.88 | 2.48 | 2.14 |
| secondary index | one index | 200000 | 1 | 1.09 | 2.45 | 2.18 |
| secondary index | two index | 200000 | 1 | 1.9 | 1.79 | 1.49 |
| sorting | shuffled 1mm | 1000000 | 0 | 4.83 | 2.96 | 2.64 |
| sorting | sorted 1mm | 1000000 | 1 | 4.85 | 2.96 | 2.67 |

github-actions[bot] commented 3 months ago

@coffeegoddd DOLT

| name | detail | mean_mult |
| --- | --- | --- |
| dolt_blame_basic | system table | 1.25 |
| dolt_blame_commit_filter | system table | 3.34 |
| dolt_commit_ancestors_commit_filter | system table | 0.86 |
| dolt_commits_commit_filter | system table | 0.94 |
| dolt_diff_log_join_from_commit | system table | 2.05 |
| dolt_diff_log_join_to_commit | system table | 2.06 |
| dolt_diff_table_from_commit_filter | system table | 1.1 |
| dolt_diff_table_to_commit_filter | system table | 1.12 |
| dolt_diffs_commit_filter | system table | 1 |
| dolt_history_commit_filter | system table | 1.22 |
| dolt_log_commit_filter | system table | 0.97 |

github-actions[bot] commented 3 months ago

@coffeegoddd DOLT

| name | add_cnt | delete_cnt | update_cnt | latency |
| --- | --- | --- | --- | --- |
| adds_only | 60000 | 0 | 0 | 0.71 |
| adds_updates_deletes | 60000 | 60000 | 60000 | 3.75 |
| deletes_only | 0 | 60000 | 0 | 1.86 |
| updates_only | 0 | 0 | 60000 | 2.47 |

github-actions[bot] commented 3 months ago

@coffeegoddd DOLT

| test_name | detail | row_cnt | sorted | mysql_time | sql_mult | cli_mult |
| --- | --- | --- | --- | --- | --- | --- |
| batching | LOAD DATA | 10000 | 1 | 0.05 | 1.8 | |
| batching | batch sql | 10000 | 1 | 0.07 | 2.14 | |
| batching | by line sql | 10000 | 1 | 0.08 | 1.88 | |
| blob | 1 blob | 200000 | 1 | 0.87 | 4.11 | 4.03 |
| blob | 2 blobs | 200000 | 1 | 0.88 | 4.67 | 4.55 |
| blob | no blob | 200000 | 1 | 0.89 | 2.55 | 2.11 |
| col type | datetime | 200000 | 1 | 0.8 | 3.15 | 2.9 |
| col type | varchar | 200000 | 1 | 0.74 | 3.36 | 3.03 |
| config width | 2 cols | 200000 | 1 | 0.75 | 3.16 | 2.29 |
| config width | 32 cols | 200000 | 1 | 1.86 | 2.01 | 2.54 |
| config width | 8 cols | 200000 | 1 | 1.05 | 2.26 | 2.7 |
| pk type | float | 200000 | 1 | 0.82 | 2.46 | 2.15 |
| pk type | int | 200000 | 1 | 0.78 | 2.56 | 2.23 |
| pk type | varchar | 200000 | 1 | 1.6 | 1.63 | 1.38 |
| row count | 1.6mm | 1600000 | 1 | 5.64 | 2.95 | 2.54 |
| row count | 400k | 400000 | 1 | 1.4 | 2.93 | 2.49 |
| row count | 800k | 800000 | 1 | 2.82 | 2.96 | 2.5 |
| secondary index | four index | 200000 | 1 | 3.49 | 1.4 | 1.1 |
| secondary index | no secondary | 200000 | 1 | 0.88 | 2.51 | 2.14 |
| secondary index | one index | 200000 | 1 | 1.1 | 2.45 | 2.17 |
| secondary index | two index | 200000 | 1 | 1.91 | 1.81 | 1.49 |
| sorting | shuffled 1mm | 1000000 | 0 | 5.46 | 2.83 | 2.44 |
| sorting | sorted 1mm | 1000000 | 1 | 5.49 | 2.8 | 2.45 |

github-actions[bot] commented 3 months ago

@coffeegoddd DOLT

| name | detail | mean_mult |
| --- | --- | --- |
| dolt_blame_basic | system table | 1.26 |
| dolt_blame_commit_filter | system table | 3.36 |
| dolt_commit_ancestors_commit_filter | system table | 0.81 |
| dolt_commits_commit_filter | system table | 0.89 |
| dolt_diff_log_join_from_commit | system table | 2.09 |
| dolt_diff_log_join_to_commit | system table | 2.06 |
| dolt_diff_table_from_commit_filter | system table | 1.08 |
| dolt_diff_table_to_commit_filter | system table | 1.1 |
| dolt_diffs_commit_filter | system table | 0.97 |
| dolt_history_commit_filter | system table | 1.2 |
| dolt_log_commit_filter | system table | 0.94 |

github-actions[bot] commented 3 months ago

@coffeegoddd DOLT

| name | add_cnt | delete_cnt | update_cnt | latency |
| --- | --- | --- | --- | --- |
| adds_only | 60000 | 0 | 0 | 0.7 |
| adds_updates_deletes | 60000 | 60000 | 60000 | 3.73 |
| deletes_only | 0 | 60000 | 0 | 1.86 |
| updates_only | 0 | 0 | 60000 | 2.46 |