apache / doris

Apache Doris is an easy-to-use, high performance and unified analytics database.
https://doris.apache.org
Apache License 2.0

[fix](merge-on-write) when full clone failed, duplicate key might occur #37001

Open zhannngchen opened 3 days ago

zhannngchen commented 3 days ago

Proposed changes

Issue Number: close #xxx

introduced by #31268

A failed full clone may produce duplicate keys in a merge-on-write (MoW) table. The bug is triggered under the following conditions:

  1. replica 0 misses a version
  2. replica 0 tries to do a full clone from other replicas
  3. the full clone fails, and the delete bitmap is incorrectly overwritten
  4. replica 0 then tries an incremental clone, and this time the clone succeeds
  5. the incremental clone can't repair the delete bitmap that the failed full clone overwrote
  6. duplicate keys occur

Solution: for a full clone, don't overwrite the delete bitmap; use the merge() method instead.
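To illustrate why merging is safer than overwriting, here is a minimal toy model in Python. It is not Doris code: the key layout `(rowset_id, segment_id, version)`, the use of plain sets instead of real bitmaps, and the function names `override` and `merge` are all assumptions made for the sketch. It only shows the general idea that replacing the local delete bitmap with a partial one drops delete marks, while a union keeps them.

```python
from typing import Dict, Set, Tuple

# Hypothetical key layout for this sketch: (rowset_id, segment_id, version).
# The value is the set of row ids marked as deleted (superseded) for that key.
BitmapKey = Tuple[str, int, int]
DeleteBitmap = Dict[BitmapKey, Set[int]]


def override(local: DeleteBitmap, incoming: DeleteBitmap) -> DeleteBitmap:
    """Buggy pattern: discard the local bitmap and keep only the incoming one."""
    return {key: set(rows) for key, rows in incoming.items()}


def merge(local: DeleteBitmap, incoming: DeleteBitmap) -> DeleteBitmap:
    """Safer pattern: union the incoming delete marks into the local bitmap."""
    merged = {key: set(rows) for key, rows in local.items()}
    for key, rows in incoming.items():
        merged.setdefault(key, set()).update(rows)
    return merged


# Assumed scenario: replica 0 already knows that row 7 of rowset "r1"
# was superseded at version 5.
local = {("r1", 0, 5): {7}}
# A full clone that later fails has delivered only part of the bitmap.
partial = {("r2", 0, 6): {3}}

after_override = override(local, partial)
after_merge = merge(local, partial)

# With override, the mark on ("r1", 0, 5) is gone, so row 7 would be
# visible again next to its newer version: a duplicate key.
print(("r1", 0, 5) in after_override)
# With merge, both sets of delete marks survive.
print(sorted(after_merge))
```

In this model, a later incremental clone only adds marks for the versions it copies, so it cannot restore the mark that `override` discarded. That mirrors steps 3 to 6 above.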

doris-robot commented 3 days ago

Thank you for your contribution to Apache Doris. Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website. See Doris Document.

github-actions[bot] commented 3 days ago

clang-tidy review says "All clean, LGTM! :+1:"

zhannngchen commented 3 days ago

run buildall

doris-robot commented 3 days ago

TPC-H: Total hot run time: 39816 ms

```
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit b3ce80c177b3584ac509c8c60af876485a4ac8a0, data reload: false

------ Round 1 ----------------------------------
q1	17603	4632	4287	4287
q2	2017	197	199	197
q3	10471	1177	1115	1115
q4	10202	819	870	819
q5	7464	2693	2604	2604
q6	216	138	136	136
q7	955	602	602	602
q8	9218	2072	2077	2072
q9	8780	6491	6469	6469
q10	8990	3735	3725	3725
q11	461	247	236	236
q12	507	238	228	228
q13	17777	3028	3001	3001
q14	271	236	221	221
q15	520	466	483	466
q16	494	391	381	381
q17	962	745	675	675
q18	7966	7415	7284	7284
q19	5528	1506	1541	1506
q20	669	323	331	323
q21	4889	3131	3319	3131
q22	399	338	351	338
Total cold run time: 116359 ms
Total hot run time: 39816 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4350	4260	4213	4213
q2	382	263	275	263
q3	2993	2833	2926	2833
q4	1942	1739	1648	1648
q5	5681	5483	5488	5483
q6	224	134	131	131
q7	2191	1905	1854	1854
q8	3243	3421	3423	3421
q9	8703	8666	8866	8666
q10	4113	3903	3736	3736
q11	603	505	531	505
q12	823	649	631	631
q13	17198	3197	3176	3176
q14	300	267	295	267
q15	529	475	485	475
q16	496	440	428	428
q17	1815	1526	1517	1517
q18	8144	7970	7754	7754
q19	1849	1716	1613	1613
q20	2081	1853	1832	1832
q21	5294	4963	4738	4738
q22	633	590	542	542
Total cold run time: 73587 ms
Total hot run time: 55726 ms
```

doris-robot commented 3 days ago

TPC-DS: Total hot run time: 174671 ms

```
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit b3ce80c177b3584ac509c8c60af876485a4ac8a0, data reload: false

query1	920	387	370	370
query2	6450	2510	2348	2348
query3	6629	209	212	209
query4	18958	17470	17288	17288
query5	3693	486	502	486
query6	264	184	176	176
query7	4590	293	300	293
query8	327	308	299	299
query9	8771	2459	2459	2459
query10	554	294	280	280
query11	10637	10093	10144	10093
query12	121	88	84	84
query13	1655	373	380	373
query14	10300	7694	7732	7694
query15	251	186	194	186
query16	7943	275	270	270
query17	1880	567	537	537
query18	2073	283	290	283
query19	201	168	157	157
query20	90	82	88	82
query21	261	135	124	124
query22	4410	4085	3973	3973
query23	33912	33577	33580	33577
query24	10556	2988	2770	2770
query25	582	390	373	373
query26	712	154	155	154
query27	2260	321	333	321
query28	6081	2208	2204	2204
query29	890	628	622	622
query30	267	160	157	157
query31	978	755	754	754
query32	100	60	58	58
query33	687	298	286	286
query34	878	478	492	478
query35	734	640	634	634
query36	1102	1016	996	996
query37	155	75	73	73
query38	2949	2898	2863	2863
query39	892	836	825	825
query40	214	128	123	123
query41	58	52	54	52
query42	119	102	104	102
query43	604	561	547	547
query44	1114	762	743	743
query45	193	168	168	168
query46	1082	708	737	708
query47	1857	1771	1771	1771
query48	379	303	299	299
query49	839	408	415	408
query50	766	380	385	380
query51	7033	6676	6720	6676
query52	98	110	89	89
query53	360	289	295	289
query54	872	451	440	440
query55	73	73	74	73
query56	290	257	273	257
query57	1119	1031	1046	1031
query58	259	245	239	239
query59	3304	3218	3186	3186
query60	327	275	280	275
query61	94	94	91	91
query62	584	435	453	435
query63	322	293	294	293
query64	8505	2265	1749	1749
query65	3194	3105	3161	3105
query66	746	336	386	336
query67	15351	14932	14947	14932
query68	4608	531	535	531
query69	476	326	320	320
query70	1123	1074	1182	1074
query71	392	271	285	271
query72	8024	5279	6096	5279
query73	751	329	320	320
query74	5976	5518	5519	5518
query75	3445	2682	2631	2631
query76	2462	979	924	924
query77	665	313	309	309
query78	10532	9828	9770	9770
query79	2332	514	536	514
query80	1134	475	475	475
query81	587	223	222	222
query82	855	109	111	109
query83	340	173	177	173
query84	263	90	87	87
query85	1890	293	282	282
query86	487	327	318	318
query87	3288	3116	3086	3086
query88	4213	2378	2338	2338
query89	477	400	394	394
query90	1759	191	191	191
query91	135	169	101	101
query92	61	49	55	49
query93	2398	517	505	505
query94	1177	189	187	187
query95	410	323	323	323
query96	598	266	271	266
query97	3221	3070	3074	3070
query98	228	202	195	195
query99	1114	866	833	833
Total cold run time: 269890 ms
Total hot run time: 174671 ms
```

doris-robot commented 3 days ago

ClickBench: Total hot run time: 29.97 s

```
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit b3ce80c177b3584ac509c8c60af876485a4ac8a0, data reload: false

query1	0.04	0.03	0.03
query2	0.08	0.04	0.04
query3	0.22	0.04	0.05
query4	1.67	0.07	0.07
query5	0.53	0.50	0.50
query6	1.13	0.72	0.72
query7	0.02	0.01	0.02
query8	0.06	0.04	0.04
query9	0.54	0.51	0.50
query10	0.54	0.54	0.55
query11	0.15	0.12	0.12
query12	0.15	0.12	0.12
query13	0.59	0.59	0.60
query14	0.75	0.77	0.78
query15	0.84	0.81	0.82
query16	0.35	0.36	0.37
query17	0.96	0.94	1.01
query18	0.25	0.22	0.27
query19	1.77	1.70	1.70
query20	0.02	0.01	0.01
query21	15.43	0.75	0.66
query22	3.56	8.42	1.56
query23	18.24	1.35	1.18
query24	2.09	0.22	0.22
query25	0.15	0.09	0.09
query26	0.27	0.18	0.18
query27	0.08	0.08	0.08
query28	13.21	1.03	1.00
query29	12.62	3.26	3.30
query30	0.26	0.06	0.06
query31	2.89	0.38	0.39
query32	3.25	0.47	0.48
query33	2.88	2.81	2.93
query34	16.95	4.41	4.41
query35	4.47	4.44	4.48
query36	0.65	0.46	0.49
query37	0.17	0.15	0.16
query38	0.15	0.15	0.14
query39	0.04	0.03	0.03
query40	0.18	0.15	0.14
query41	0.09	0.05	0.04
query42	0.05	0.05	0.04
query43	0.04	0.04	0.04
Total cold run time: 108.38 s
Total hot run time: 29.97 s
```

github-actions[bot] commented 10 hours ago

PR approved by at least one committer and no changes requested.

github-actions[bot] commented 10 hours ago

PR approved by anyone and no changes requested.