burkemw3 / syncthingfuse

Mozilla Public License 2.0

Opening files is slow on hard disk drives #1

Open burkemw3 opened 8 years ago

burkemw3 commented 8 years ago

On a Linux box running SyncthingFUSE, opening files is drastically slower than on OS X, even after the files are cached locally. Linux takes tens of seconds, while OS X takes single-digit seconds.

The Linux box does have spinning platters, while the OS X box has an SSD, which might make a difference.

burkemw3 commented 8 years ago

I don't have access to a Linux box for a little bit. When I regain access, I am planning to try mounting the cache storage folder as a tmpfs to help determine if it's the hard disk or something else.

burkemw3 commented 8 years ago

Also, possibly try avoiding copies with os.File.ReadAt and/or bufio.Reader.WriteTo

burkemw3 commented 8 years ago

Not looking good.

Average time of opening a 2.6 MB file in different situations, 5 trials each:

So, something is going haywire with the disk access. Reading everything off the disk sequentially (`time find . -type f -exec dd if={} of=/dev/null \;`) takes no time at all. Looking at strace on the slow version, nothing jumps out at me (but I'm also not experienced in such matters).

I looked at putting the block data into boltdb directly, hoping it would be smart about handling file access. Unfortunately, bolt can't return free pages to the disk when the amount of stored data decreases. This means that reducing the configured cache size would not reduce the amount of space used on the disk.

I'm not sure what to try next.

I hope to get access to a Linux box with an SSD soon, and may check performance there, to see if I get more clues.

burkemw3 commented 8 years ago

The FileBlockCache performs disk operations in a read-write boltdb transaction. boltdb only allows 1 read-write transaction at a time.

I was expecting all the disk requests to go out at about the same time. The first couple of requests might involve a lot of seeking, but I expected the remaining requests to be queued long enough that the disk could schedule them intelligently.

With only 1 read-write transaction, the requests are probably going out sequentially, and the disk is seeking everywhere it has to.

My next experiment may be testing performance if the disk requests happen more in parallel.

burkemw3 commented 8 years ago

I tried desequentializing the disk requests with this patch (against 099bc95). Read performance went to 11.9 seconds, which is not nearly enough of an improvement to be worth it.

I've started reading about mmap as another possible solution. I think that'd be a big change to the code, so I'm hesitant to make that change right now.

I've also been thinking about making a small app for testing different access patterns, instead of trying to work everything into SyncthingFUSE. Hopefully, it'd be easier to test things including mmap.