Closed yuyang0423 closed 4 years ago
when sync=true , it will call FileDescriptor#sync
in ProtoBufFile#save
method:
if (sync) {
fOut.getFD().sync();
}
The problem is that it forgot to call output.flush
before calling fd.sync
, the written data is in BufferedOutputStream
but not wrote into underlying output stream(file system), so in critical situation such as you describe, it may lost the data even setting syncMeta=true
.
The Utils.atomicMoveFile
use Files.move
API, it will keep the consistent between source and target on linux platform, so it's fine.
when sync=true , it will call
FileDescriptor#sync
inProtoBufFile#save
method:if (sync) { fOut.getFD().sync(); }
The problem is that it forgot to call
output.flush
before callingfd.sync
, the written data is inBufferedOutputStream
but not wrote into underlying output stream(file system), so in critical situation such as you describe, it may lost the data even settingsyncMeta=true
.The
Utils.atomicMoveFile
useFiles.move
API, it will keep the consistent between source and target on linux platform, so it's fine.
Thanks your quick reply, you can try with my steps. Utils.atomicMoveFile
may doesn't works as expects.
@yuyang0423 We will try it. You can try my fix if you can https://github.com/sofastack/sofa-jraft/pull/481
Thank you very much. Java will take care of Files.move
in theory.
@yuyang0423 We will try it. You can try my fix if you can #481
Thank you very much. Java will take care of
Files.move
in theory.
I do agree with output.flush()
fix, but really need fully test the move code in this scenario.
flush & sync a temp file is useless, we need guarantee new metadata is saved & sync to disk.
That is why I report this issue immediately.
@yuyang0423 If you want a atomic write to a file (all or nothing), write a temp file and do a atomic move is a common pattern.
We will test it with more cases and check the Files.move
source code, thank you.
@yuyang0423 If you want a atomic write to a file (all or nothing), write a temp file and do a atomic move is a common pattern.
We will test it with more cases and check the
Files.move
source code, thank you.
Yes, I do know what you means. Anyway, hope your news.
@killme2008 Reproduced many times even add you fix, I still don't think atomic move is a proper way to metadata file.
Files#move(Path source, Path target, CopyOption... options)
"atomic move" guaranteed "atomic" operation between source file and target file.
What if the final target file still in OS disk cache??
From test results, the target file obviously in OS cache which let snapshot metadata in dangerous situation.
Utils.atomicMoveFile
should complete two things:
syncMeta=true
is set;@yuyang0423 hi, sorry , the fix is not enough, we should call fsync on the containing directory after the atomic move operation. The correct steps should be:
I forgot the step 5. I updated the PR with the new fix, thank you.
@yuyang0423 hi, sorry , the fix is not enough, we should call fsync on the containing directory after the atomic move operation. The correct steps should be:
- create a new temp file (on the same file system!)
- write data to the temp file
- fsync() the temp file
- rename the temp file to the appropriate name
- fsync() the containing directory
I forgot the step 5. I updated the PR with the new fix, thank you.
Yes, you got me. I also do a simple fix. Can you help review this??
diff --git a/jraft-core/src/main/java/com/alipay/sofa/jraft/storage/io/ProtoBufFile.java b/jraft-core/src/main/java/com/alipay/sofa/jraft/storage/io/ProtoBufF
index 14a6fe2..0ffb2c0 100644
--- a/jraft-core/src/main/java/com/alipay/sofa/jraft/storage/io/ProtoBufFile.java
+++ b/jraft-core/src/main/java/com/alipay/sofa/jraft/storage/io/ProtoBufFile.java
@@ -100,6 +100,7 @@ public class ProtoBufFile {
public boolean save(final Message msg, final boolean sync) throws IOException {
// Write message into temp file
final File file = new File(this.path + ".tmp");
+ boolean moved = false;
try (final FileOutputStream fOut = new FileOutputStream(file);
final BufferedOutputStream output = new BufferedOutputStream(fOut)) {
final byte[] lenBytes = new byte[4];
@@ -115,11 +116,13 @@ public class ProtoBufFile {
Bits.putInt(lenBytes, 0, msgLen);
output.write(lenBytes);
msg.writeTo(output);
- if (sync) {
+ output.flush();
+ moved = Utils.atomicMoveFile(file, new File(this.path));
+ if (moved && sync) {
fOut.getFD().sync();
}
}
- return Utils.atomicMoveFile(file, new File(this.path));
+ return moved;
}
}
@yuyang0423 I think it's not a proper fix, it depends on jdk implementation in move
and sync
. We can't ensure that fOut.getFD().sync()
will always work after moving the file on different platform or file system(ext3, ext4, btrfs etc.)
@yuyang0423 I think it's not a proper fix, it depends on jdk implementation in
move
andsync
. We can't ensure thatfOut.getFD().sync()
will always work after moving the file on different platform or file system(ext3, ext4, btrfs etc.)
ok, I'll try you latest fix.
@killme2008 problem solved with your new fix 👍 . You should consider to bump a version merge this fix and let everyone avoid this. 😀
@yuyang0423 Yep, it's a fatal issue, we will release a new version ASAP.
Describe the bug
Utils.atomicMoveFile() can't guaranteed data consistency and sync to disk.
sofa-jraft/jraft-core/src/main/java/com/alipay/sofa/jraft/storage/io/ProtoBufFile.java::save() method has issues which can case snapshot metadata lost.
which let syncMeta = true doesn't works as expect;
Risk 1: no output flush after output write. some data in buffer cache;
Risk 2: Like 1, move tmp file to final metadata file looks like a atomic operation but the target file isn't sync to disk.
Expected behavior
If enduser do concerned about data safety, current code cant guaranteed this.
Actual behavior
In our regular test, we see this symptoms: (sorry, hide some path infos)
total 16K drwxr-xr-x. 4 root root 33 Jul 3 08:29 kv -rw-r--r--. 1 root root 16K Jul 2 20:14 kv.zip _-rw-r--r--. 1 root root 0 Jul 2 20:14 __raft_snapshotmeta
_raft_snapshot_meta file size 0, which cause node init failed;
2020-07-02 20:26:15.834 [main] INFO FSMCallerImpl:201 - Starts FSMCaller successfully. 2020-07-02 20:26:15.845 [main] ERROR LocalSnapshotReader:91 - Fail to load snapshot meta $PATH_TO_SNAPSHOT/snapshot/snapshot_1065/__raft_snapshot_meta. 2020-07-02 20:26:15.846 [main] ERROR LocalSnapshotStorage:313 - Fail to init reader for path $PATH_TO_SNAPSHOT/snapshot/snapshot_1065. 2020-07-02 20:26:15.846 [main] ERROR NodeImpl:977 - Node $NODE_INFO is initialized with inconsistent log, status=Status[EIO<1014>: Missing logs in (0, 778)].
Steps to reproduce
There is a unit test can reproduce this issue, if you do some modifications.
sofa-jraft/jraft-core/src/test/java/com/alipay/sofa/jraft/storage/io/ProtobufFileTest.java::testSaveLoad()
assertTrue(file.save(msg, true));
, after panic server back, you will see this issue with that file.but if you
sync
before panic, this won't happen. That is why I say, syncMeta = true doesn't works as expect;Minimal yet complete reproducer code (or GitHub URL to code)
Environment
SOFAJRaft version: v1.3.0
JVM version (e.g.
java -version
): openjdk version "1.8.0_212" OpenJDK Runtime Environment (build 1.8.0_212-b04) OpenJDK 64-Bit Server VM (build 25.212-b04, mixed mode)OS version (e.g.
uname -a
): Kernel: 5.4.22 OS Version: Centos &Maven version:
IDE version: