madecoste / swarming

Automatically exported from code.google.com/p/swarming
Apache License 2.0

Partition DiskCache into subdirectories to handle >100k local items (à la git object store) #141

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Most filesystems (ext4, HFS, NTFS) do not handle very large directories well; 
add and remove operations slow down as the number of files in the directory 
increases. This is why there's a default limit of 100000 files kept in the 
local cache.

Note that currently, the 20GB limit is usually hit before the 100k items limit, 
so it's not yet a problem; as such, filing as a P3.

See http://git-scm.com/docs/gitrepository-layout.html for background on the 
default "objects/[0-9a-f][0-9a-f]" layout. Note that I had used a similar 
design in dumbcas, with a 3-letter layout by default: 
https://github.com/maruel/dumbcas/blob/master/cas_local.go.

I'm not sure if it would speed up or slow down enumeration. A 2-letter (256 
directories) layout would likely be the best. A 3-letter layout gives 4096 
directories to enumerate, which will likely become the bottleneck at startup.
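
A minimal sketch of the git-style layout discussed above, assuming cache items 
are named by a hex digest. The names shard_path and shard_count are 
hypothetical and are not part of the existing DiskCache code.

```python
import os


def shard_path(cache_dir, digest, prefix_len=2):
    """Returns the sharded path for a hex digest, git object store style.

    Hypothetical helper: the first prefix_len hex characters name the
    subdirectory and the remaining characters name the file.
    """
    if len(digest) <= prefix_len:
        raise ValueError('digest too short to shard: %r' % digest)
    return os.path.join(cache_dir, digest[:prefix_len], digest[prefix_len:])


def shard_count(prefix_len):
    """Number of subdirectories produced by a hex prefix of this length."""
    return 16 ** prefix_len
```

With prefix_len=2 there are 256 subdirectories, so 100k items average roughly 
390 files per directory; with prefix_len=3 there are 4096 subdirectories.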

Original issue reported on code.google.com by maruel@chromium.org on 15 Aug 2014 at 2:13

GoogleCodeExporter commented 9 years ago
Raising priority

Original comment by maruel@chromium.org on 30 Sep 2014 at 5:25

GoogleCodeExporter commented 9 years ago
Why would you need to enumerate the directories?

For what it's worth, at Yahoo! long ago we had a very early DHT design that 
worked around this with a three-level split, \h\h/\h\h/\h\h/object, which 
scales up to billions of objects. We could probably get away with just two 
levels pretty easily ...
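
A rough sketch of such a multi-level split, assuming each level uses two hex 
characters of the object's digest and the object keeps its full digest as file 
name. Both are assumptions; the comment above does not say how the object 
itself was named. multi_level_path is a hypothetical name.

```python
import os


def multi_level_path(root, digest, levels=3, width=2):
    """Returns root/hh/hh/hh/<digest> for the default three-level split.

    Hypothetical helper: each level consumes `width` hex characters of the
    digest, and the file keeps the full digest as its name (an assumption).
    """
    needed = levels * width
    if len(digest) < needed:
        raise ValueError('digest too short for %d levels: %r' % (levels, digest))
    parts = [digest[i:i + width] for i in range(0, needed, width)]
    return os.path.join(root, *(parts + [digest]))
```

Three levels of two hex characters give 256**3 = 16777216 leaf directories; 
levels=2 gives 65536.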

Original comment by dpranke@chromium.org on 8 Oct 2014 at 5:16

GoogleCodeExporter commented 9 years ago
When run_isolated starts up, it loads its local cache via a state.json but also 
ensures the files are actually present by enumerating the cache directory. 
Since there's a fair churn rate on the directory, the directory itself will 
likely become fairly fragmented. As the cache gets larger, this issue will 
likely have more effect.
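
To illustrate what that startup check could look like with a two-letter 
sharded layout, here is a sketch. It assumes the saved state boils down to a 
list of hex digests, which is a simplification and not the actual state.json 
format. enumerate_sharded and reconcile are hypothetical names.

```python
import os


def enumerate_sharded(cache_dir):
    """Returns the set of digests found in a two-letter sharded cache dir.

    Costs one listdir() on the root plus one per shard subdirectory, which
    is the enumeration overhead discussed in this issue.
    """
    found = set()
    for shard in os.listdir(cache_dir):
        shard_dir = os.path.join(cache_dir, shard)
        if not os.path.isdir(shard_dir):
            # Skips plain files at the root, e.g. the state file itself.
            continue
        for name in os.listdir(shard_dir):
            found.add(shard + name)
    return found


def reconcile(state_items, cache_dir):
    """Returns (missing, untracked) digests.

    missing: listed in the saved state but absent on disk.
    untracked: present on disk but not listed in the saved state.
    """
    on_disk = enumerate_sharded(cache_dir)
    known = set(state_items)
    return known - on_disk, on_disk - known
```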

Original comment by maruel@chromium.org on 8 Oct 2014 at 5:21