basho / machi

Machi file store
Apache License 2.0
122 stars 23 forks source link

{error, written} handling in cr client when writing #39

Open shino opened 8 years ago

shino commented 8 years ago

Spin off from #33 .

For the first one bad_return_value, quick fix is at 903e939, just reply tag missing for handle_call return. I will create another PR with small eunit unless too difficult

Superficial reason of bad_return_value is reply tag as explained by quick fix [1]. But it seems like the root cause is double write as other two errors found.

bad_return_value was occurred in call from do_append_midtail2 to do_repair_chunk in machi_cr_client. The current code triggers repair by {error, written} [2] but FLU returns ok when "correct" chunk is written [3].

The quick fix [1] will hide conflict in data and probably returns false ok.

[1] https://github.com/basho/machi/commit/903e9395c8a32af8ae7440ab8d0fb4d170c93cfd [2] https://github.com/basho/machi/blob/master/src/machi_cr_client.erl#L434 [3] https://github.com/basho/machi/blob/master/src/machi_file_proxy.erl#L683-L696

kuenishi commented 8 years ago

I think do_repair_chunks/10 should be used instead of do_repair_chunk at 903e939. And it looks like right fix regardless of the exact issue (though it's a kind of type mismatch that must be detected by dialyzer).

shino commented 8 years ago

type mismatch that must be detected by dialyzer).

Yes, I tried -Wspecdiffs, dialyzer ouput many lines ;)

kuenishi commented 8 years ago

But {error, written} indicates many subtle situation, which you could find in machi_file_proxy:handle_write() so maybe we should take deeper insight on this. I look like a patch for this issue (which is being tested now).

For debugging purpose, we'd better have context information passed from cr_client to file_proxy that a write is whether in an append (writes are not supposed to overlap in midtail, and it warns that something is wrong except trim) or in a repair (it's like a force-overwrite).

kuenishi commented 8 years ago
diff --git a/src/machi_cr_client.erl b/src/machi_cr_client.erl
index 0d74f61..0728e11 100644
--- a/src/machi_cr_client.erl
+++ b/src/machi_cr_client.erl
@@ -439,8 +439,10 @@ do_append_midtail2([FLU|RestFLUs]=FLUs, Prefix, File, Offset, Chunk,
             %% We know what the chunk ought to be, so jump to the
             %% middle of read-repair.
             Resume = {append, Offset, iolist_size(Chunk), File},
-            do_repair_chunk(FLUs, Resume, Chunk, [], File, Offset,
-                            iolist_size(Chunk), Depth, STime, S);
+            Chunks = [{File, Offset, iolist_size(Chunk), machi_util:checksum_chunk(Chunk)}],
+            {Reply, S1} = do_repair_chunks(Chunks, RestFLUs, Resume, [FLU], File, 
+                                           Depth, STime, S, {ok, Chunks}),
+            {reply, Reply, S1};
         {error, trimmed} = Err ->
             %% TODO: nothing can be done
             {reply, Err, S};

Although we have to use checksum given by client here, not generating by ourselves.

shino commented 8 years ago

What I want to confirm/settle down first is what {error, written} means. Current FLU server side implementation, it is used in response to append/write requests to indicate "already different bytes are written". CR client code seems to consider it as "write attempts to midtails are already dome with the bytes I want to write by someone else, e.g. repair". Is it better to use different error atom to indicate two cases?

shino commented 8 years ago

For debugging purpose, we'd better have context information passed from cr_client to file_proxy that a write is whether in an append (writes are not supposed to overlap in midtail, and it warns that something is wrong except trim) or in a repair (it's like a force-overwrite).

:+1:

It may be the same for reverse direction: FLU -> client. For exapmle, {error, {bad_epoch, FLUEpoch, RequestedEpoch}}.

kuenishi commented 8 years ago

Currently the file_proxy returns ok when a write request indicates exactly same data that a flu already has - that said, {error, written} indicates something is wrong (a requested bytes to write is different from what I have in disk), where we shouldn't trigger repair. If {error, trimmed} is returned then a repair should be triggered.