Open ikmak opened 1 year ago
@willscott
I've tried to mount with options locallocks
, and it looks better.
But there are still some problems.
Word docx(doc) files could be saved with some wired temporary directories like abc.docx.sb-xxxxxx-xxxxx
Excel xlsx(xls) files could be saved without any sb
temporary directory
PowerPoint pptx(ppt) files could be created but couldn't be saved.
sudo ./example/osnfs /path/to/files/ 45678
sudo mount_nfs -vvv -o port=45678,mountport=45678,proto=tcp,nosuid,sync,locallocks "localhost:/nfs" /path/to/mount
I wonder if https://github.com/willscott/go-nfs/pull/72 was the underlying issue behind this
can you try again with the current version?
@willscott Same...
Does anyone have a clear idea what is happening here? This is affecting rclone users in https://github.com/rclone/rclone/issues/7973 in exactly the same way.
Putting nolock
or locallocks
doesn't fix the problem though - the symptoms are as described above https://github.com/willscott/go-nfs/issues/54#issuecomment-1345932088
Is this a problem because the nfs server doesn't implement locking? If so could we implement a skeleton locking system?
I notice that memfs has a stubbed out Lock and Unlock which don't appear to be called anywhere
@ncw - the reason memfs does not implement lock / unlock is because the accompanying 'Network Lock Manager (NLM) protocol' is not implemented here. The protocol is documented in RFC 1813 - but it will be a bigger effort than the method you point to.
Thanks @willscott
What do you think the best way forward with this issue is? Rclone can't support locking so how can we tell the clients not to try?
If we can get logging of the NFS calls made while opening a word document, I'm not totally convinced that the underlying issue is lack of the locking protocol (especially since it has been observed in cases where the mount is done with local / no locks). If there's some other NFS call that is giving an error or otherwise appears to be acting in a way that might be suspicious I don't know if that's been fully ruled out at this point.
Otherwise, the next step may be to try to scaffold enough of NLMv4 to accept lock/test calls and see if that changes behavior. https://pubs.opengroup.org/onlinepubs/9629799/chap14.htm
- If we can get logging of the NFS calls made while opening a word document, I'm not totally convinced that the underlying issue is lack of the locking protocol (especially since it has been observed in cases where the mount is done with local / no locks). If there's some other NFS call that is giving an error or otherwise appears to be acting in a way that might be suspicious I don't know if that's been fully ruled out at this point.
What is the best way of doing this? The logging in the library doesn't seem to do a full rpc in/out trace or at least I haven't figured out how to make it do that if it can.
I'm not 100% convinced that this problem is to do with locking so tracing sound like a great idea
- Otherwise, the next step may be to try to scaffold enough of NLMv4 to accept lock/test calls and see if that changes behavior. https://pubs.opengroup.org/onlinepubs/9629799/chap14.htm
It doesn't look too complicated...
There's a log trace on each request: https://github.com/willscott/go-nfs/blob/master/conn.go#L62
There's a major issue with NLM in this context - which is that from what i can tell mount'ing of NFS (the clients in Mac and Linux at least) don't allow specification of an explicit NLM port. As such, the server needs to be running in the standard privileged context where portmap is running and used to discover the port the lock manager is running on. I don't see a way to allow interaction with the locking sideband protocol with the same explicit mount-time behavior as with mountd and nfs itself
There's a log trace on each request: https://github.com/willscott/go-nfs/blob/master/conn.go#L62
Here is a trace of those log statements
I extracted these from the rclone log which gives a bit more context
I did not notice this line before
2024/07/30 17:58:02 DEBUG : [NFS DEBUG] failing create to indicate lack of support for 'exclusive' mode.
I wonder if that is the problem? Can we just ignore the exclusive mode?
That comes from here
I'm attempting an experiment using this patch. I've made an rclone binary with this patch and sent it to the user with the mac (not me!).
diff --git a/nfs_oncreate.go b/nfs_oncreate.go
index 63d7901..262eb8f 100644
--- a/nfs_oncreate.go
+++ b/nfs_oncreate.go
@@ -39,9 +39,8 @@ func onCreate(ctx context.Context, w *response, userHandle Handler) error {
if err := xdr.Read(w.req.Body, &verf); err != nil {
return &NFSStatusError{NFSStatusInval, err}
}
- Log.Errorf("failing create to indicate lack of support for 'exclusive' mode.")
- // TODO: support 'exclusive' mode.
- return &NFSStatusError{NFSStatusNotSupp, os.ErrPermission}
+ // We ignore the createverf3 which is the key to the lock
+ Log.Errorf("ignoring 'exclusive' mode on create file.")
} else {
// invalid
return &NFSStatusError{NFSStatusNotSupp, os.ErrInvalid}
@@ -88,9 +87,11 @@ func onCreate(ctx context.Context, w *response, userHandle Handler) error {
fp := userHandle.ToHandle(fs, newFile)
changer := userHandle.Change(fs)
- if err := attrs.Apply(changer, fs, newFilePath); err != nil {
- Log.Errorf("Error applying attributes: %v\n", err)
- return &NFSStatusError{NFSStatusIO, err}
+ if attrs != nil {
+ if err := attrs.Apply(changer, fs, newFilePath); err != nil {
+ Log.Errorf("Error applying attributes: %v\n", err)
+ return &NFSStatusError{NFSStatusIO, err}
+ }
}
writer := bytes.NewBuffer([]byte{})
Which I think should work (cross fingers!)
There was one thing I didn't understand in that code though - the code appears to assume that the file already exists (ie has a handle and can have Stat called on it without an error) whereas the RFC doesn't say anything about the file existing or not. I've probably mis-understood the code or the RFC as I'm not very familiar with either but I just thought I'd mention it.
@ncw perhaps also try with my branch at https://github.com/willscott/go-nfs/tree/feat/exclusive-create and see if it makes a difference
Alas this does not compile on Linux
github.com/willscott/go-nfs/file
# github.com/willscott/go-nfs/file
../../../../pkg/mod/github.com/willscott/go-nfs@v0.0.3-0.20240826094117-6dc01bc0f44d/file/file_unix.go:22:26: s.Atimespec undefined (type *"syscall".Stat_t has no field or method Atimespec)
../../../../pkg/mod/github.com/willscott/go-nfs@v0.0.3-0.20240826094117-6dc01bc0f44d/file/file_unix.go:23:26: s.Ctimespec undefined (type *"syscall".Stat_t has no field or method Ctimespec)
Annoyingly the names of the Atimespec and Ctimespec change on a per OS basis
$ GOOS=linux go doc syscall.Stat_t | grep Timespec
Atim Timespec
Mtim Timespec
Ctim Timespec
$ GOOS=darwin go doc syscall.Stat_t | grep Timespec
Atimespec Timespec
Mtimespec Timespec
Ctimespec Timespec
Birthtimespec Timespec
As for rclone, this locking scheme is unlikely to work as-is as backends don't support atime/ctime. Something based on mtime alone would work, but bear in mind that the precision might be truncated depending on the backend.
Probably my preference would be to have a way to tell the library just to treat exclusive creates the same as non-exclusive ones.
I think the approach i have should work in a degraded way - if there isn't an exact match, a temporary file is made to save and recover the verifier. In cases where atime/ctime aren't supported, or things like dos with lower time precision, the intention is to degrade.
Thanks for flagging the build issue. I need some more os-specific casing
Sounds good :-)
As an aside while we are talking about how annoyingsyscall.Stat_t
, it is very annoying to work with because it is very OS dependent and even if the names of the fields are the same, they quite often have different types eg uint32
vs uint64
so using this as an extension struct as the output of FileInfo.Sys()
isn't ideal other than making the reading from os.File
based files easy.
I'd prefer you used a struct you defined specifically (probably your file.Fileinfo struct). You could make this change in a backwards compatible way by making file.GetInfo look for a file.Fileinfo
struct as the output of Sys()
too.
I think that's what that method does, right? if the Sys()
returns a FileInfo or *FileInfo it is directly returned, rather than being parsed from the unstable OS struct.
I think that's what that method does, right? if the
Sys()
returns a FileInfo or *FileInfo it is directly returned, rather than being parsed from the unstable OS struct.
It does, you are quite right :blush:
It should probably say that here - I will send a PR to fix.
Does the addition of exclusive create change office behavior?
Sorry, didn't see you updated your branch. Will test tomorrow
Just a FYI - you can't go get
your branch because apparently go tags can't have /
in! It is no big deal to use the commit hash though.
The branch compiles much better now, but it didn't compile for these GOOS/GOARCH pairs
linux/386
linux/arm
linux/mips
linux/mipsle
freebsd/386
freebsd/amd64
freebsd/arm
netbsd/386
netbsd/amd64
netbsd/arm
openbsd/386
With a selection of errors like
Error: ../../../../go/pkg/mod/github.com/willscott/go-nfs@v0.0.3-0.20240826115235-e7154b306671/file/file_unix.go:22:36: cannot use s.Atim.Nsec (variable of type int32) as int64 value in argument to time.Unix
Error: ../../../../go/pkg/mod/github.com/willscott/go-nfs@v0.0.3-0.20240826115235-e7154b306671/file/file_unix.go:22:26: s.Atim undefined (type *syscall.Stat_t has no field or method Atim)
You can check out the full selection in rclone's build logs 1 2
It built on darwin though so I've sent that off to my testers
I went through this pain in rclone so if you want to see what I came up with I needed 3 build files for bsd, linux and other unix.
I've had a report back from the user doing the testing. Alas, it did not work.
Here is the nfs logs from go-nfs set to Trace level debug
fix-7973-nfs-server-ids-ff8ce1bef.nfs.log
Which were grepped out of the rclone logs here which have more context as to what the backend is doing
fix-7973-nfs-server-ids-ff8ce1bef.log
I annotated some of the rclone log with the nfs transactions below. I got bored of annotating the stat calls to the root and I stopped after the create.
The description of what the user did
I attempt to open this Word document. It is opened but with message that it is read-only and I should create duplicate in order to edit so I try to save it as "test1.docx". It fails.
There is just one file available.
I think if we we could figure out why word thinks it is read only then we could probably fix the problem.
Your new code is visible setting atime/mtime below, but the OS overwrites them immediately so I don't think this strategy is going to work. I think just ignoring the exclusive lock is probably the best strategy.
Everything works fine except editing Microsoft Office files(e.g. Word, Excel, PowerPoint) from mountpoint. It can be edited if the file open directly from source folder, but when opening from mountpoint the Microsoft Office shows "readonly".
./example/osnfs
mount