dolthub / dolt

Dolt – Git for Data
Apache License 2.0

Archive index rework to make loading faster #8078

Closed macneale4 closed 1 week ago

macneale4 commented 1 week ago

The initial implementation of archive indexes over-optimized for space. As a result, loading the index of an archive was 10x slower than loading the index of a noms table file. To address this:

Testing indicates that on a 41 GB archive file, this returned load performance to match classic table files, and the size of the index increased by about 350 MB (to ~1 GB total).
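To put the space cost in perspective, here is a rough back-of-the-envelope sketch in Go. It assumes the "~1 GB total" figure refers to the index size after the change, and it treats 1 GB as 1000 MB. The function name `indexStats` is hypothetical and is not part of Dolt.

```go
package main

import "fmt"

// indexStats is a hypothetical helper (not part of Dolt). From the approximate
// figures quoted above it derives:
//   - the index size before the change,
//   - the relative growth of the index,
//   - the share of the archive taken up by the new index.
// All sizes are in MB, with 1 GB taken as 1000 MB for a rough estimate.
func indexStats(newIndexMB, addedMB, archiveMB float64) (beforeMB, growthPct, sharePct float64) {
	beforeMB = newIndexMB - addedMB
	growthPct = addedMB / beforeMB * 100
	sharePct = newIndexMB / archiveMB * 100
	return
}

func main() {
	// Assumed inputs: ~1 GB index after the change, ~350 MB added, 41 GB archive.
	before, growth, share := indexStats(1000, 350, 41000)
	fmt.Printf("index before: ~%.0f MB\n", before)
	fmt.Printf("index growth: ~%.0f%%\n", growth)
	fmt.Printf("index share of archive: ~%.1f%%\n", share)
}
```

Under these assumptions the index was roughly 650 MB before the change and grew by a little over half, yet it still accounts for only about 2.4% of the 41 GB archive.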

coffeegoddd commented 1 week ago

@macneale4 DOLT

comparing_percentages
100.000000 to 100.000000
version result total
5ef8ca5 ok 5937457
version total_tests
5ef8ca5 5937457
correctness_percentage
100.0

coffeegoddd commented 1 week ago

@coffeegoddd DOLT

comparing_percentages
100.000000 to 100.000000
version result total
0b83495 ok 5937457
version total_tests
0b83495 5937457
correctness_percentage
100.0

coffeegoddd commented 1 week ago

@macneale4 DOLT

comparing_percentages
100.000000 to 100.000000
version result total
24185cc ok 5937457
version total_tests
24185cc 5937457
correctness_percentage
100.0

github-actions[bot] commented 1 week ago

Additional work is required for integration with DoltgreSQL.

coffeegoddd commented 1 week ago

@macneale4 DOLT

comparing_percentages
100.000000 to 100.000000
version result total
b4f995a ok 5937457
version total_tests
b4f995a 5937457
correctness_percentage
100.0