Open bcho opened 2 years ago
Thanks for your report. I haven’t used this project with changing files, so some rough edges are to be expected.
My guess is, this issue is due to in-flight file change + wrong read size settings (which fixed in this commit: f19f793). Could you confirm the fix commit f19f793 is for this behavior?
That commit has nothing to do with changing files, it was just a wrong calculation that manifested itself as files with certain file sizes never transferring correctly.
- record the state for the
mapStruct
during theptr
call to capture the value ofoffset
, lastreadSize
/readOffset
etc
Not quite sure what you mean by this?
mapStruct.ptr
only exits when its caller violates the invariants (that’s why the lines are all prefixed with BUG:
). This shouldn’t be possible to trigger from user input. Failing loudly is my preferred way of dealing with these sorts of issues in the current stage of development in this project.
- panic instead of
os.Exit(1)
for these unexpected cases- setup panic handler in the rsyncd server, then we can protect the server from panics like the one from this issue or from the
mapStruct.ptr
call
No, for the case of changing files, the buffer should be zeroed and the os.Exit should be removed. We don’t need to work around this issue, we need to fix the issue itself.
In general, I’m not a fan of panic handlers in server programs. Instead of recovering panics, the only safe move upon a panic is to terminate the program.
Hi @stapelberg thanks for your reply, fully understood. I am ok for not adding a panic handler. But how about changing the os.Exit(1) (which introduced by my previous change to the logger, and used by std log's fatal call) to panic? At least we can enable unit tests by using a panic.
I will maintain a fork for my application for now. I will report back if I can see new reproduce for the issue I mentioned in the issue.
But how about changing the os.Exit(1) (which introduced by my previous change to the logger, and used by std log's fatal call) to panic? At least we can enable unit tests by using a panic.
We could write unit tests by using panic/recover, but again, I would prefer to return errors (which also enables testing, and in a more convenient way), or — in this particular case — just fix the issue itself, in which case there is no need to panic/recover :)
Yep, but I am still finding it's hard to reproduce it yet. I managed to capture a reproduce last week. In my case, I can see: https://github.com/gokrazy/rsync/blob/f3bbd0836728c03a03b1c26b86c040f277282ac7/rsyncd/fileio.go#L105-L106
is returning io.EOF
. Anything else we should inspect / check?
It’s possible that you run into io.EOF if the file shrunk during transfer, but we don’t need to check for io.EOF explicitly — the existing conditional handles all error types, as it should. Should be relatively easy to reproduce using truncate(1)
, I think.
Hey,
Background
I am building an application that makes use of rsync as a file server to sync files to remote. In my implementation, it would potentially that the file contents changed during the file sync.
Issue & Symptom
I am seeing some random panics from the rsync call. Stack trace looks like this:
I can reproduce this issue from the commit: https://github.com/gokrazy/rsync/commit/b567d9d1a3029ed39da65e3641471f058efa5d23 . But I am not yet reproduce this issue in commits after this.
Unfortunately, I don't have a stable way to reproduce this issue (I am trying to automate the process, but no luck yet).
My guess
My guess is, this issue is due to in-flight file change + wrong read size settings (which fixed in this commit: https://github.com/gokrazy/rsync/commit/f19f7934a18784e347bc91b5811db187d8383f3d). Could you confirm the fix commit f19f793 is for this behavior?
Further Asks
I checked the code, the
fileio.go
is doing some log trackings (which are commented out), and it also exit for unexpected data state: https://github.com/gokrazy/rsync/blob/f3bbd0836728c03a03b1c26b86c040f277282ac7/rsyncd/fileio.go#L108-L111Is it possible to:
mapStruct
during theptr
call to capture the value ofoffset
, lastreadSize
/readOffset
etcos.Exit(1)
for these unexpected casesmapStruct.ptr
callWhat do you think?