imzhenyu / rDSN

Robust Distributed System Nucleus (rDSN) is an open framework for quickly building and managing high performance and robust distributed systems.
MIT License

shared log not flushed to disk/SSD before returning the set result to the client #472

Open qinzuoyan opened 8 years ago

qinzuoyan commented 8 years ago

In mutation_log_shared::append(), the shared log is not flushed (fsync on Linux) before commit, which may cause data loss when the machine restarts under strong consistency semantics.
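For illustration, here is a minimal sketch of "flush before commit" using plain POSIX calls. It is not rDSN code: `append_durably` is a hypothetical helper, and it assumes the log is written through an ordinary file descriptor.

```cpp
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper (not an rDSN API): append `len` bytes to `fd` and
// return true only after fsync reports that the data reached stable storage.
bool append_durably(int fd, const char* data, size_t len) {
    size_t done = 0;
    while (done < len) {
        ssize_t n = ::write(fd, data + done, len - done);
        if (n < 0) {
            if (errno == EINTR) continue;  // retry a write interrupted by a signal
            return false;
        }
        done += static_cast<size_t>(n);
    }
    // Without this call the bytes may still sit only in the OS page cache.
    return ::fsync(fd) == 0;
}
```

Under this sketch, the reply to the client would be sent only after `append_durably` returns true.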

shengofsun commented 8 years ago

In addition, fsync is a blocking I/O operation, which is unfriendly to rDSN, so should we consider "aio_fsync" and wrap it as an API of disk_aio?
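As a rough sketch of what such a wrapper could look like, the POSIX `aio_fsync` call queues the sync and returns immediately, and completion is checked later with `aio_error` and `aio_return`. The names `start_async_sync` and `poll_async_sync` are hypothetical and are not part of disk_aio. On older glibc versions this may need linking with `-lrt`.

```cpp
#include <aio.h>
#include <cerrno>
#include <csignal>
#include <cstring>
#include <fcntl.h>

// Hypothetical wrapper (not an rDSN API): queue an asynchronous sync for `fd`.
// `cb` must stay alive and untouched until the request completes.
bool start_async_sync(int fd, aiocb& cb) {
    std::memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_sigevent.sigev_notify = SIGEV_NONE;  // no signal; the caller polls
    // O_SYNC requests fsync semantics; O_DSYNC would request fdatasync semantics.
    return ::aio_fsync(O_SYNC, &cb) == 0;
}

// Returns 1 once the sync succeeded, 0 while it is still in progress,
// and -1 on failure. Do not call again after a non-zero result, because
// aio_return may be called only once per request.
int poll_async_sync(aiocb& cb) {
    int err = ::aio_error(&cb);
    if (err == EINPROGRESS) return 0;
    if (err != 0) return -1;
    return ::aio_return(&cb) == 0 ? 1 : -1;
}
```

A real integration would more likely be event-driven than polled; polling is used here only to keep the sketch short.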

imzhenyu commented 8 years ago

There is currently a flush method in aio_provider. Can you guys do a survey to see whether flush and fsync are the same thing or not? aio_fsync is great - I didn't realize there is an async version on Linux :)

shengofsun commented 8 years ago

@imzhenyu, in the Linux aio_provider, the flush method calls fsync, but we don't see any place that calls aio_provider's flush during the prepare of a mutation log.

imzhenyu commented 8 years ago

Thanks, @shengofsun. There are two ways to ensure the data is really pushed to disk: calling fsync and using direct I/O. I guess in our case using direct I/O is easier (with a direct I/O flag when opening the log)? It is more complicated on Windows, though, as it requires aligned memory.

shengofsun commented 8 years ago

@imzhenyu, are you referring to O_DIRECT? The behavior of O_DIRECT on Linux is not hardware/filesystem-independent, so O_SYNC/O_DSYNC should be the proper choice for us. See this. Three issues: (1) Is there a corresponding option on Windows? (2) There is noticeable write amplification when syncing with hardware. I guess we would need a page cache in our disk engine. (3) I'm not sure whether Linux AIO works fine with the O_SYNC/O_DSYNC flags.

imzhenyu commented 8 years ago

It seems O_DIRECT + O_SYNC is good enough. Unfortunately, as on Windows, the buffer then needs to be aligned. For your questions:
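To make the alignment requirement concrete, here is a minimal sketch of writing through O_DIRECT + O_SYNC on Linux. The 4096-byte block size is an assumption, since the real requirement depends on the device and filesystem. `padded_size`, `make_aligned_copy`, and `write_direct` are hypothetical helpers, not rDSN functions.

```cpp
#include <cstddef>
#include <cstring>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

constexpr size_t kBlock = 4096;  // assumed alignment; the real value is device-dependent

// Round `len` up to the next multiple of kBlock.
size_t padded_size(size_t len) { return (len + kBlock - 1) / kBlock * kBlock; }

// Copy `data` into a new kBlock-aligned, zero-padded buffer.
// The caller releases it with free(). Returns nullptr on allocation failure.
void* make_aligned_copy(const void* data, size_t len) {
    size_t total = padded_size(len);
    if (total == 0) total = kBlock;
    void* buf = nullptr;
    if (::posix_memalign(&buf, kBlock, total) != 0) return nullptr;
    std::memset(buf, 0, total);
    std::memcpy(buf, data, len);
    return buf;
}

// Write `data` to `path`, bypassing the page cache. Fails (for example with
// EINVAL at open) on filesystems that do not support O_DIRECT.
bool write_direct(const char* path, const void* data, size_t len) {
    if (len == 0) return true;
    int fd = ::open(path, O_WRONLY | O_CREAT | O_DIRECT | O_SYNC, 0644);
    if (fd < 0) return false;
    void* buf = make_aligned_copy(data, len);
    bool ok = false;
    if (buf != nullptr) {
        size_t total = padded_size(len);
        ok = ::write(fd, buf, total) == static_cast<ssize_t>(total);
        ::free(buf);
    }
    ::close(fd);
    return ok;
}
```

Note that the file ends up padded to a block boundary, so a log format using this approach would have to record the real payload length itself.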

shengofsun commented 8 years ago

O_SYNC is enough for a hard sync, and as far as I can see, O_DIRECT is not necessary. If we write the file with a page-sized buffer, the write amplification issue should be reduced. Of course, this also needs to be tested.
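A minimal sketch of the O_SYNC approach, assuming a plain POSIX open: every `write` on the returned descriptor completes only after the data is on stable storage, with no buffer alignment required. `open_log_for_sync_write` is a hypothetical name and does not reflect how dsn_file_open() is actually implemented.

```cpp
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper (not an rDSN API): open a log file so that each write
// is synchronous. With `data_only` set, O_DSYNC is used instead of O_SYNC,
// which skips flushing metadata that is not needed to read the data back.
int open_log_for_sync_write(const char* path, bool data_only) {
    int flags = O_WRONLY | O_CREAT | O_APPEND;
    flags |= data_only ? O_DSYNC : O_SYNC;
    return ::open(path, flags, 0644);
}
```

Because each write then pays the full sync cost, batching several mutations into one larger write is the usual way to keep the overhead down; whether that is enough here would still need measuring.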

qinzuoyan commented 8 years ago

So the resolution is to set the O_SYNC flag when calling dsn_file_open() in log_file::create_write()?

imzhenyu commented 8 years ago

Let's try and see.