chrsmithdemos / leveldb

Automatically exported from code.google.com/p/leveldb
BSD 3-Clause "New" or "Revised" License

LevelDB keeps generating small sst files #169

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Here is the leveldb.stats output:

                                   Compactions
    Level  Files Size(MB) Time(sec) Read(MB) Write(MB)
    --------------------------------------------------
      0        0        0         0        0        36
      2        0        0         9        0       519
      3       30        4        12      594       580
      4      530    10070      1187   101893    101892
      5     1750    52946      7101   534959    534716

Level-3 has 30 files but only 4 MB of data in total. These 30 files will then be 
merged into level-4, but the newly created level-4 sst files are small too, as I 
can see with the ls command.

This leads to frequent compactions after only 4 MB of data has been written.

What is the expected output? What do you see instead?

Small sst files should be merged.

What version of the product are you using? On what operating system?

Linux

Please provide any additional information below.

kTargetFileSize = 32 * 1048576
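
To put the numbers above side by side, here is a minimal sketch that computes the average file size for one row of the stats table and compares it with the configured target. The helper names `AverageFileBytes` and `TargetToAverageRatio` are made up for illustration and are not part of leveldb. The Size(MB) column is rounded, so the results are approximate.

```cpp
#include <cstdint>

// Value quoted in the report: kTargetFileSize = 32 * 1048576.
constexpr std::int64_t kTargetFileSize = 32 * 1048576;

// Average sst file size in bytes for one row of the stats table.
// `size_mb` is the rounded Size(MB) column, so this is only an estimate.
// (Hypothetical helper, not a leveldb function.)
double AverageFileBytes(int files, double size_mb) {
  if (files <= 0) return 0.0;
  return size_mb * 1048576.0 / files;
}

// How many average-sized files would fit into one target-sized file.
// (Hypothetical helper, not a leveldb function.)
double TargetToAverageRatio(int files, double size_mb) {
  const double avg = AverageFileBytes(files, size_mb);
  if (avg <= 0.0) return 0.0;
  return static_cast<double>(kTargetFileSize) / avg;
}
```

For the Level-3 row (30 files, 4 MB) this gives an average of roughly 140,000 bytes per file, about 1/240 of the 32 MB target. The Level-4 row (530 files, 10070 MB) averages about 19 MB per file.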

Original issue reported on code.google.com by wuzuy...@gmail.com on 15 May 2013 at 2:08

GoogleCodeExporter commented 9 years ago
This shows how it happens:

                                  Compactions
    Level  Files Size(MB) Time(sec) Read(MB) Write(MB)
    --------------------------------------------------
      0        0        0         0        0        11
      2        1       39         2        0       118
      3        0        0         2       90        88
      4      386    10062       354    30228     30227
      5     1798    53047      1897   158227    158183

                                   Compactions
    Level  Files Size(MB) Time(sec) Read(MB) Write(MB)
    --------------------------------------------------
      0        0        0         0        0        11
      2        0        0         2        0       118
      3       32       39         2      130       127
      4      386    10062       354    30228     30227
      5     1798    53047      1897   158227    158183

                                   Compactions
    Level  Files Size(MB) Time(sec) Read(MB) Write(MB)
    --------------------------------------------------
      0        0        0         0        0        11
      2        0        0         2        0       118
      3       29        3         2      130       127
      4      542    10092       361    30850     30849
      5     1798    53049      1919   159486    159440

Original comment by wuzuy...@gmail.com on 15 May 2013 at 5:08

GoogleCodeExporter commented 9 years ago
I see this line in the leveldb doc:

```We also switch to a new output file when the key range of the current output 
file has grown enough to overlap more than ten level-(L+2) files.```

I assume that the single file in level-2 overlaps many files in level-4, so its 
output is split into 32 level-3 files. I don't think this mechanism is good, 
because later Get() operations will force seek_compactions on all of these 
level-3 files, each of which then has to be merged with level-4 files. 
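
The rule quoted from the doc can be modelled with a small sketch. This is a simplified, count-based illustration of that one sentence, not leveldb's actual code, and the real implementation may measure overlap differently (for example in bytes rather than in number of files). `FileRange`, `OutputSplitter`, `MakeKey` and `CountOutputFiles` are hypothetical names, and keys are plain strings here.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

// Key range of one level-(L+2) ("grandparent") file. Hypothetical type.
struct FileRange {
  std::string smallest;
  std::string largest;
};

// Simplified, count-based model of the quoted rule. Illustrative only.
class OutputSplitter {
 public:
  OutputSplitter(std::vector<FileRange> grandparents, int max_overlaps)
      : grandparents_(std::move(grandparents)), max_overlaps_(max_overlaps) {}

  // Call once per output key, in ascending key order, before writing it.
  // Returns true when a new output file should be started before `key`.
  bool ShouldStopBefore(const std::string& key) {
    // Advance past every grandparent file that ends before `key`. Each file
    // passed after the first key counts toward the current output's overlap.
    while (index_ < grandparents_.size() &&
           key > grandparents_[index_].largest) {
      if (seen_key_) ++overlaps_;
      ++index_;
    }
    seen_key_ = true;
    if (overlaps_ > max_overlaps_) {
      overlaps_ = 0;  // the next output file starts with a fresh count
      return true;
    }
    return false;
  }

 private:
  std::vector<FileRange> grandparents_;  // assumed sorted and non-overlapping
  int max_overlaps_;
  std::size_t index_ = 0;
  int overlaps_ = 0;
  bool seen_key_ = false;
};

// Fixed-width key so that string order matches numeric order.
std::string MakeKey(int n) {
  char buf[32];
  std::snprintf(buf, sizeof(buf), "k%06d", n);
  return buf;
}

// Number of output files produced for `keys` (sorted ascending).
int CountOutputFiles(const std::vector<std::string>& keys,
                     const std::vector<FileRange>& grandparents,
                     int max_overlaps) {
  if (keys.empty()) return 0;
  OutputSplitter splitter(grandparents, max_overlaps);
  int files = 1;
  for (const std::string& key : keys) {
    if (splitter.ShouldStopBefore(key)) ++files;
  }
  return files;
}
```

In this model, 25 keys that each fall into a different level-(L+2) file are cut into 3 output files with a limit of ten, no matter how few bytes those keys hold. That matches the assumption above: sparse keys spread over a wide range give many small output files.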

Original comment by wuzuy...@gmail.com on 15 May 2013 at 5:39

GoogleCodeExporter commented 9 years ago
I believe I'm seeing this too.

... 270 small files created
03:13:21.412464 Generated table #1073824: 13 keys, 7209 bytes   
03:13:21.504346 Generated table #1073825: 15 keys, 8345 bytes 
03:13:21.597557 Generated table #1073826: 4 keys, 2437 bytes  
03:13:21.679429 Generated table #1073827: 9 keys, 4581 bytes  
03:13:21.771315 Generated table #1073828: 6 keys, 2784 bytes  
03:13:21.853079 Generated table #1073829: 12 keys, 5856 bytes 
03:13:21.934807 Generated table #1073830: 10 keys, 13350 bytes
03:13:22.016480 Generated table #1073831: 11 keys, 6459 bytes 
03:13:22.108478 Generated table #1073832: 10 keys, 5504 bytes 
03:13:22.200398 Generated table #1073833: 17 keys, 17318 bytes
03:13:22.282151 Generated table #1073834: 13 keys, 23513 bytes
03:13:22.384259 Generated table #1073835: 9 keys, 4937 bytes  
03:13:22.466041 Generated table #1073836: 7 keys, 3846 bytes  
03:13:22.547809 Generated table #1073837: 18 keys, 25082 bytes
03:13:22.547833 Compacted 1@2 + 16@3 files => 16725003 bytes  
03:13:22.629348 compacted to: files[ 0 0 36 682 5304 359 0 ]  
03:13:22.640214 Delete type=2 #1073495
03:13:22.640276 Delete type=2 #1073500
03:13:22.640396 Delete type=2 #1073503
03:13:22.640439 Delete type=2 #1073501
03:13:22.640615 Delete type=2 #1073505
03:13:22.640873 Delete type=2 #1073511
03:13:22.640966 Delete type=2 #1073502
03:13:22.641269 Delete type=2 #1073506
03:13:22.641372 Delete type=2 #1073508
03:13:22.641434 Delete type=2 #1073493
03:13:22.641488 Delete type=2 #1073512
03:13:22.641619 Delete type=2 #1073528
03:13:22.641658 Delete type=2 #1073494
...
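
To quantify how small these tables are, a log like the one above can be summarised with a short sketch. It assumes the "Generated table #N: K keys, B bytes" line format shown in this excerpt. `TableStats`, `ParseGeneratedTable` and `Summarize` are hypothetical helpers, not part of leveldb.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Totals over all "Generated table" lines. Hypothetical type.
struct TableStats {
  unsigned long long tables = 0;
  unsigned long long keys = 0;
  unsigned long long bytes = 0;
};

// Parses a line such as
//   "03:13:21.412464 Generated table #1073824: 13 keys, 7209 bytes"
// Returns false for any other kind of line (Compacted, Delete, blank, ...).
bool ParseGeneratedTable(const std::string& line,
                         unsigned long long* keys,
                         unsigned long long* bytes) {
  unsigned long long number = 0, k = 0, b = 0;
  // %*s skips the timestamp; the literal text must match for 3 conversions.
  const int matched = std::sscanf(
      line.c_str(), "%*s Generated table #%llu: %llu keys, %llu",
      &number, &k, &b);
  if (matched != 3) return false;
  *keys = k;
  *bytes = b;
  return true;
}

// Sums key and byte counts over all matching lines.
TableStats Summarize(const std::vector<std::string>& lines) {
  TableStats stats;
  for (const std::string& line : lines) {
    unsigned long long keys = 0, bytes = 0;
    if (ParseGeneratedTable(line, &keys, &bytes)) {
      ++stats.tables;
      stats.keys += keys;
      stats.bytes += bytes;
    }
  }
  return stats;
}
```

Applied to the 14 "Generated table" lines shown in this excerpt, the totals are 154 keys and 131221 bytes, or about 9.4 KB per table.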

Original comment by DavidJoelSchwartz@gmail.com on 17 May 2013 at 1:23