akiradeveloper / dm-writeboost

Log-structured Caching for Linux
GNU General Public License v2.0

question about write durability #177

Closed: aaronknister closed this issue 7 years ago

aaronknister commented 7 years ago

Hi, I have a question about the durability of writes to a dm-writeboost device (apologies if this has been covered elsewhere. I couldn't find if it had).

One of the readme docs says this:

Because it creates a log that contains 127 writes before it actually writes the log to the caching device, writing to the caching device happens only once in 127 writes, while other caching drivers write more often.

What does that mean for a write issued to the dm-writeboost device? If I issue the write using direct I/O (in theory bypassing any volatile caching mechanisms), once the write returns, does dm-writeboost guarantee the data is committed to the cache device? Or, as the documentation suggests, does it hold onto the write until 126 others accumulate and then issue the write to the caching device?
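For concreteness, here is a minimal sketch of the kind of write I mean (the device path is hypothetical, and O_DIRECT requires the buffer, offset, and length to be aligned to the logical block size):

```c
#define _GNU_SOURCE   /* for O_DIRECT on Linux/glibc */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    void *buf;

    /* O_DIRECT needs an aligned buffer; 512 bytes matches a
     * typical logical block size. */
    if (posix_memalign(&buf, 512, 512) != 0) {
        fprintf(stderr, "posix_memalign failed\n");
        return 1;
    }
    memset(buf, 0xab, 512);

    /* /dev/mapper/wbdev is a hypothetical writeboost device. */
    int fd = open("/dev/mapper/wbdev", O_WRONLY | O_DIRECT);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* The question: when this pwrite() returns, is the block
     * guaranteed to be on the caching device? */
    if (pwrite(fd, buf, 512, 0) != 512)
        perror("pwrite");

    close(fd);
    free(buf);
    return 0;
}
```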

Thanks!

-Aaron

akiradeveloper commented 7 years ago

@aaronknister

Short answer:

If I issue the write using direct I/O (in theory to bypass any volatile caching mechanisms) once the write returns does dm-writeboost guarantee the data is committed to the cache device?

No

does it hold onto the write until 126 others accumulate and then issue the write to the caching device?

Yes

Detailed answer:

To buffer writes and create a log, Writeboost has a layer called the RAM buffer. It is not the filesystem's page cache; it is Writeboost's own internal buffer.
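A minimal sketch of the idea (not the actual dm-writeboost code; the sizes and names are illustrative): writes accumulate in the RAM buffer, and only a full buffer is written to the caching device as one log.

```c
/* Illustrative sketch of log-structured buffering, not the real
 * dm-writeboost implementation. Writes land in a RAM buffer
 * first; the caching device sees one large write per 127
 * buffered writes. */
#include <string.h>

#define BLOCK_SIZE   4096
#define LOG_CAPACITY 127   /* writes per log segment */

struct ram_buffer {
    unsigned char data[LOG_CAPACITY][BLOCK_SIZE];
    int count;
};

/* Stand-in for the single large write to the caching device. */
static void write_log_to_cache_device(struct ram_buffer *rb)
{
    /* ... persist rb->data[0..count-1] as one log segment ... */
    (void)rb;
}

void buffered_write(struct ram_buffer *rb, const void *block)
{
    memcpy(rb->data[rb->count++], block, BLOCK_SIZE);

    /* The write is acknowledged while the data is still in
     * volatile RAM; it reaches the caching device only once
     * the log fills up. */
    if (rb->count == LOG_CAPACITY) {
        write_log_to_cache_device(rb);
        rb->count = 0;
    }
}
```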

The real semantics of direct I/O is only to bypass the page cache; it says nothing about the layers beneath it in the block storage stack. That's why I use dd oflag=direct in my test suite, to exercise Writeboost's logic purely (https://github.com/akiradeveloper/writeboost-test-suite/blob/develop/src/main/scala/dmtest/BlockDevice.scala).

FYI, to force the write data onto the caching device you can use O_SYNC at the filesystem level (which is translated into REQ_FLUSH at the block layer). This is not recommended, but overly cautious applications still do it often, so the case is somewhat optimized inside Writeboost by a deferred barrier mechanism. If you'd like to know more, you can read the code.
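A sketch of that path, assuming a file on a filesystem mounted on top of the Writeboost device (the path is hypothetical):

```c
/* Sketch: O_SYNC makes each write durable before it returns;
 * the filesystem issues the corresponding flush, which the
 * block layer of this era sees as REQ_FLUSH. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* /mnt/wb is a hypothetical mount backed by Writeboost. */
    int fd = open("/mnt/wb/file", O_WRONLY | O_CREAT | O_SYNC, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* Returns only after the data (and the metadata needed to
     * retrieve it) has reached stable storage. */
    if (write(fd, "hello", 5) != 5)
        perror("write");

    close(fd);
    return 0;
}
```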

I don't know why you want every write to hit the caching device immediately, but it's basically not recommended, because the block layer's semantics aren't designed in such a ridiculous way.

aaronknister commented 7 years ago

@akiradeveloper Thank you for your reply. I'm interested in using dm-writeboost underneath GPFS. GPFS assumes that once a bio_put() returns, the data is committed to stable storage. It's an awful assumption, but it is one that holds true for the majority of enterprise storage devices with non-volatile write caches. I don't agree with your assessment that the desire for all writes to immediately hit the caching device is ridiculous, but I recognize it's not efficient.

akiradeveloper commented 7 years ago

@aaronknister I don't know which bio_put you are talking about, but there should be no such implicit assumption; I'd guess GPFS just explicitly submits its writes with flush flags.
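For reference, "writes with flush flags" at the block layer looks roughly like the sketch below. It is written against the bio API from around Linux 4.8 (older kernels spell the flag REQ_FLUSH and pass the flags to submit_bio() separately; newer ones use bio_set_dev()). This is illustrative, not GPFS's actual code:

```c
/* Hedged sketch: a write is only guaranteed durable on
 * completion if the bio carries explicit flush/FUA flags;
 * a plain write is not. */
#include <linux/bio.h>
#include <linux/blkdev.h>

static void submit_durable_write(struct block_device *bdev,
                                 struct page *page, sector_t sector)
{
    struct bio *bio = bio_alloc(GFP_NOIO, 1);

    bio->bi_bdev = bdev;   /* bio_set_dev(bio, bdev) on newer kernels */
    bio->bi_iter.bi_sector = sector;
    bio_add_page(bio, page, PAGE_SIZE, 0);

    /* Flush the device cache before this write, and force unit
     * access for the write itself; without these flags,
     * completion does not imply durability. */
    bio->bi_opf = REQ_OP_WRITE | REQ_PREFLUSH | REQ_FUA;

    submit_bio(bio);
}
```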

aaronknister commented 7 years ago

@akiradeveloper I was thinking of this bio_put (https://www.kernel.org/doc/htmldocs/filesystems/API-bio-put.html), but I think what I meant was submit_bio(). I can't find any reference to GPFS submitting writes with flush flags :( That doesn't mean it's not there, though.

aaronknister commented 7 years ago

Thanks again for your help!