Open ncw opened 2 years ago
I can confirm that this is a problem. I am investigating why this is happening.
The relevant source in WinFsp is from file fuse_intf.c
:
if ('C' == f->env->environment) /* Cygwin */
fi.flags = 0x0200 | 0x0800 | 2 /*O_CREAT|O_EXCL|O_RDWR*/;
else
fi.flags = 0x0100 | 0x0400 | 2 /*O_CREAT|O_EXCL|O_RDWR*/;
This agrees with the contents of file fcntl.h
from the Windows SDK (C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\ucrt\fcntl.h
):
#define _O_CREAT 0x0100 // create and open file
#define _O_TRUNC 0x0200 // open and truncate
#define _O_EXCL 0x0400 // open only if file doesn't already exist
It also agrees with file fcntl.h
from MinGW (C:\Program Files\mingw-w64\x86_64-8.1.0-win32-seh-rt_v6-rev0\mingw64\x86_64-w64-mingw32\include\fcntl.h
):
#define _O_CREAT 0x0100
#define _O_TRUNC 0x0200
#define _O_EXCL 0x0400
It looks to me that the no-cgo version is incorrect. It should read:
O_TRUNC = 0x0200
O_EXCL = 0x0400
FYI, WinFsp will not send the O_TRUNC flag.
From one of the posts on the rclone forum, this is what rclone received from WinFSP/cgofuse which agrees with the code above.
2022/04/06 15:11:09 DEBUG : /test/testfile: CreateEx: flags=0x502, mode=0770
And this is how rclone interprets it using the cgo flags
2022/04/06 15:11:09 DEBUG : test/testfile: Open: flags=O_RDWR|O_CREATE|O_EXCL
So I think you are saying that is correct and O_EXCL
is meant not O_TRUNC
?
I guess that brings me on to the question of why is WinFSP sending O_EXCL
for a file that might already exist? If using a VFS cache then rclone passes that O_EXCL straight on to the underlying Open which then blows up with file already exists
.
This was the issue I was chasing down when I discovered the mismatch between the constants!
The WinFsp native API has an operation Open
for opening an existing file and an operation Create
for creating a new file. Open
is supposed to return an error if the file does not exist and Create
is supposed to return an error if the file already exists.
The FUSE layer is implemented on top of the native WinFsp API. The FUSE implementation of Create
uses O_EXCL
because the file system is supposed to return -EEXIST
if the file already exists.
That makes perfect sense... Except for the issue reported in the forum thread...
Here is the rclone log rclone azure mount copy error (from explorer with fuse debug).txt
What appears to happen is that Create
is called
rclone[TID=24b4]: FFFF9E86CC94E060: >>Create [UT---C] "\test\testfile", FILE_CREATE, CreateOptions=44, FileAttributes=20, Security=NULL, AllocationSize=0:1825000, AccessToken=000000000000029
8[PID=2350], DesiredAccess=17019f, GrantedAccess=0, ShareAccess=0
2022/04/12 15:46:22 DEBUG : /test: Getattr: fh=0xFFFFFFFFFFFFFFFF
2022/04/12 15:46:22 DEBUG : /test: >Getattr: errc=0
2022/04/12 15:46:22 DEBUG : /test/testfile: CreateEx: flags=0x502, mode=0770
2022/04/12 15:46:22 DEBUG : test/testfile: Open: flags=O_RDWR|O_CREATE|O_EXCL
2022/04/12 15:46:22 DEBUG : test/testfile: newRWFileHandle:
2022/04/12 15:46:22 DEBUG : test/testfile(0xc000c1a080): openPending:
2022/04/12 15:46:22 DEBUG : test/testfile: vfs cache: truncate to size=0
2022/04/12 15:46:22 DEBUG : test: Added virtual directory entry vAddFile: "testfile"
2022/04/12 15:46:22 DEBUG : test/testfile(0xc000c1a080): >openPending: err=<nil>
2022/04/12 15:46:22 DEBUG : test/testfile: >newRWFileHandle: err=<nil>
2022/04/12 15:46:22 DEBUG : test: Added virtual directory entry vAddFile: "testfile"
2022/04/12 15:46:22 DEBUG : test/testfile: >Open: fd=test/testfile (rw), err=<nil>
2022/04/12 15:46:22 DEBUG : /test/testfile: >CreateEx: errc=0, fh=0x0
2022/04/12 15:46:22 DEBUG : /test/testfile: Release: fh=0x0
2022/04/12 15:46:22 DEBUG : test/testfile(0xc000c1a080): RWFileHandle.Release
2022/04/12 15:46:22 DEBUG : test/testfile(0xc000c1a080): close:
2022/04/12 15:46:22 DEBUG : test/testfile: vfs cache: setting modification time to 2022-04-12 15:46:22.8018157 +0000 GMT m=+18.086083401
2022/04/12 15:46:22 INFO : test/testfile: vfs cache: queuing for upload in 5s
2022/04/12 15:46:22 DEBUG : test/testfile(0xc000c1a080): >close: err=<nil>
2022/04/12 15:46:22 DEBUG : /test/testfile: >Release: errc=0
rclone[TID=24b4]: FFFF9E86CC94E060: <<Create IoStatus=c0000010[0]
Then immediately called again which fails because the file already exists.
rclone[TID=24b4]: FFFF9E86CC94E060: >>Create [UT---C] "\test\testfile", FILE_CREATE, CreateOptions=44, FileAttributes=20, Security=NULL, AllocationSize=0:1825000, AccessToken=0000000000000298[PID=2350], DesiredAccess=170197, GrantedAccess=0, ShareAccess=0
2022/04/12 15:46:22 DEBUG : /test: Getattr: fh=0xFFFFFFFFFFFFFFFF
2022/04/12 15:46:22 DEBUG : /test: >Getattr: errc=0
2022/04/12 15:46:22 DEBUG : /test/testfile: CreateEx: flags=0x502, mode=0770
2022/04/12 15:46:22 DEBUG : test/testfile: Open: flags=O_RDWR|O_CREATE|O_EXCL
2022/04/12 15:46:22 DEBUG : test/testfile: newRWFileHandle:
2022/04/12 15:46:22 DEBUG : test/testfile: >newRWFileHandle: err=file already exists
2022/04/12 15:46:22 DEBUG : test/testfile: File.openRW failed: file already exists
2022/04/12 15:46:22 DEBUG : test/testfile: >Open: fd=<nil *RWFileHandle>, err=file already exists
2022/04/12 15:46:22 DEBUG : /test/testfile: >CreateEx: errc=-17, fh=0xFFFFFFFFFFFFFFFF
2022/04/12 15:46:22 DEBUG : /test/testfile: Getattr: fh=0xFFFFFFFFFFFFFFFF
2022/04/12 15:46:22 DEBUG : /test/testfile: >Getattr: errc=0
rclone[TID=24b4]: FFFF9E86CC94E060: <<Create IoStatus=c0000035[0]
I haven't been able to replicate this but I suspect it might be Windows 11 / WIndows 2022 datacenter behaviour.
So I don't know exactly what is going on here - probably a misunderstanding by me, but I'd appreciate your thoughts!
I notice that the first log ends with:
rclone[TID=24b4]: FFFF9E86CC94E060: <<Create IoStatus=c0000010[0]
The status code c0000010
stands for STATUS_INVALID_DEVICE_REQUEST
. Think of it as the equivalent of UNIX ENOSYS
. I am not sure why this status code is returned but it has something to do with this failure.
Possible scenario:
Create
is called.
Create
implementation calls FUSE create
, which results in the cgofuse CreateEx
call. This likely succeeds.Create
implementation makes additional calls, one of which fails with -ENOSYS
. The file creation is aborted and the file is closed.Create
returns with with the ENOSYS
equivalent STATUS_INVALID_DEVICE_REQUEST
.Possibilities for things which WinFsp calls into rclone during the create
call:
I don't think this is being called as there is no trace from rclone
I don't think this is being called as there is no trace from rclone either
This part of the log
2022/04/12 15:46:22 DEBUG : /test/testfile: Release: fh=0x0
2022/04/12 15:46:22 DEBUG : test/testfile(0xc000c1a080): RWFileHandle.Release
2022/04/12 15:46:22 DEBUG : test/testfile(0xc000c1a080): close:
2022/04/12 15:46:22 DEBUG : test/testfile: vfs cache: setting modification time to 2022-04-12 15:46:22.8018157 +0000 GMT m=+18.086083401
2022/04/12 15:46:22 INFO : test/testfile: vfs cache: queuing for upload in 5s
2022/04/12 15:46:22 DEBUG : test/testfile(0xc000c1a080): >close: err=<nil>
2022/04/12 15:46:22 DEBUG : /test/testfile: >Release: errc=0
rclone[TID=24b4]: FFFF9E86CC94E060: <<Create IoStatus=c0000010[0]
Implies that release was called, so create was taking the error path here as you suggested.
Given that ops.create
was called (we see that from rclone log) and that ops.chflags
and ops.chown
weren't (because there is no trace for them in the log,
So either CreateEx caused Result to be set to failure (unlikely since we see it returned OK in the rclone log)
Or this went wrong
Or this went wrong
What about the last two options - can they fail with STATUS_INVALID_DEVICE_REQUEST
?
I agree that cgflags
and chown
cannot be the problem. Also note that fsp_fuse_intf_Create
properly handles a STATUS_INVALID_DEVICE_REQUEST
from them.
This could be the problem.
The Windows file creation API is the kitchen sink and then some: the file creation "system call" allows a file to be created together with extended attributes. So this checks if there are extended attributes passed during file creation and calls setxattr
if the file system supports it.
The problem is that in practice this Windows file system capability is almost never used, because it is not exposed by the regular Windows API (CreateFileW
). So it is unclear to me what regular file system scenario would cause extended attributes to be sent in the Create
call. Perhaps I should read the original thread from the rclone forum more closely.
In any case, it might be prudent to add a check for STATUS_INVALID_DEVICE_REQUEST
here and treat it as success as we do for chown
and chflags
.
Re: fsp_fuse_intf_GetFileInfoEx
This issues an fgetattr
to the file system to get the final stat
data of the created file. So it is not likely that this would generate a STATUS_INVALID_DEVICE_REQUEST
return.
Re-reading the code of Create
I see that if the file system is passed extended attributes during file creation and setxattr
is not implemented, then we are supposed to return STATUS_EAS_NOT_SUPPORTED
.
This suggests that if the failure to set extended attributes is indeed the problem, we may not be able to ignore a -ENOSYS
error and treat it as success. So unfortunately I do not have a good fix for this if it turns out to be the problem.
This could be the problem.
The Windows file creation API is the kitchen sink and then some: the file creation "system call" allows a file to be created together with extended attributes. So this checks if there are extended attributes passed during file creation and calls
setxattr
if the file system supports it.
Rclone does indeed return ENOSYS for any of the *xattr
calls. If any of those got called they wouldn't get logged as there are no Trace statements.
// Setxattr sets extended attributes.
func (fsys *FS) Setxattr(path string, name string, value []byte, flags int) (errc int) {
return -fuse.ENOSYS
}
// Getxattr gets extended attributes.
func (fsys *FS) Getxattr(path string, name string) (errc int, value []byte) {
return -fuse.ENOSYS, nil
}
// Removexattr removes extended attributes.
func (fsys *FS) Removexattr(path string, name string) (errc int) {
return -fuse.ENOSYS
}
// Listxattr lists extended attributes.
func (fsys *FS) Listxattr(path string, fill func(name string) bool) (errc int) {
return -fuse.ENOSYS
}
I'll add Trace statements and get the user to test again and we can see if your theory is correct.
Please let me know.
In the meantime I will fix the mixed up values for the O_EXCL
and O_TRUNC
values in the no-cgo build.
I got a log back from the user. It shows quite clearly that Setxattr
was called to set the extended attribute $KERNEL.PURGE.SEC.FILEHASH
as part of the creating the file and that rclone returned ENOSYS from it.
So it looks like your theory is correct.
rclone[TID=3750]: FFFFD389126F0DC0: >>Create [UT---C] "\test\testfile", FILE_CREATE, CreateOptions=44, FileAttributes=20, Security=NULL, AllocationSize=0:1825000, AccessToken=0000000000000690[PID=19a8], DesiredAccess=17019f, GrantedAccess=0, ShareAccess=0
2022/04/21 12:18:59 DEBUG : /test: Getattr: fh=0xFFFFFFFFFFFFFFFF
2022/04/21 12:18:59 DEBUG : /test: >Getattr: errc=0
2022/04/21 12:18:59 DEBUG : /test/testfile: CreateEx: flags=0x502, mode=0770
2022/04/21 12:18:59 DEBUG : test/testfile: Open: flags=O_RDWR|O_CREATE|O_EXCL
2022/04/21 12:18:59 DEBUG : test/testfile: newRWFileHandle:
2022/04/21 12:18:59 DEBUG : test/testfile(0xc00005d0c0): openPending:
2022/04/21 12:18:59 DEBUG : test/testfile: vfs cache: truncate to size=0
2022/04/21 12:18:59 DEBUG : test: Added virtual directory entry vAddFile: "testfile"
2022/04/21 12:18:59 DEBUG : test/testfile(0xc00005d0c0): >openPending: err=<nil>
2022/04/21 12:18:59 DEBUG : test/testfile: >newRWFileHandle: err=<nil>
2022/04/21 12:18:59 DEBUG : test: Added virtual directory entry vAddFile: "testfile"
2022/04/21 12:18:59 DEBUG : test/testfile: >Open: fd=test/testfile (rw), err=<nil>
2022/04/21 12:18:59 DEBUG : /test/testfile: >CreateEx: errc=0, fh=0x0
2022/04/21 12:18:59 DEBUG : /test/testfile: Setxattr: name="$KERNEL.PURGE.SEC.FILEHASH", value="\x037\xedY\xa6f\x99\xc0\xe5\xa7\xeb\xee\xf75-|\x1c.\xd5\xed\xe7!)P\xa1\xb0\xa8\xee(\x9a\xf4\xa9[wA\xa5\xca\xd28\xce>L\xa7u`X\xf2\xa6zW\xfe\xe9\xd1J\x85\xbf\xd4O\t\xefFg\x9f\xaf\xcb\x1b\xabbz\xfb\xea\x9a_\xabZ\x97k\xb7\xa7鯟ݹ^\xfe\x97f_\xee\x99u\xadW\xfa{\xba\x99v}\xadw\xfd\xeb]vYj\u05f7\xad\xd9\xd5\xd9mw\xdd\xf9oZ\xb6n\xafo\xa7\xe5\xaf_潪zw\x00\x00\x00\x11)\xd6ա\x1dy\xb1W\xab\x81\x8a", flags=0
2022/04/21 12:18:59 DEBUG : /test/testfile: >Setxattr: errc=-40
2022/04/21 12:18:59 DEBUG : /test/testfile: Release: fh=0x0
2022/04/21 12:18:59 DEBUG : test/testfile(0xc00005d0c0): RWFileHandle.Release
2022/04/21 12:18:59 DEBUG : test/testfile(0xc00005d0c0): close:
2022/04/21 12:18:59 DEBUG : test/testfile: vfs cache: setting modification time to 2022-04-21 12:18:59.7972879 +0000 GMT m=+12.082671701
2022/04/21 12:18:59 INFO : test/testfile: vfs cache: queuing for upload in 5s
2022/04/21 12:18:59 DEBUG : test/testfile(0xc00005d0c0): >close: err=<nil>
2022/04/21 12:18:59 DEBUG : /test/testfile: >Release: errc=0
rclone[TID=3750]: FFFFD389126F0DC0: <<Create IoStatus=c0000010[0]
Re-reading the code of
Create
I see that if the file system is passed extended attributes during file creation andsetxattr
is not implemented, then we are supposed to returnSTATUS_EAS_NOT_SUPPORTED
.This suggests that if the failure to set extended attributes is indeed the problem, we may not be able to ignore a
-ENOSYS
error and treat it as success. So unfortunately I do not have a good fix for this if it turns out to be the problem.
Here are some ideas...
ENOSYS
into a STATUS_EAS_NOT_SUPPORTED
at the point the *xattr is called. There don't appear to be too many of these. A variant of fsp_fuse_ntstatus_from_errno maybe fsp_fuse_ntstatus_from_xattr_errno
?fuse_operations
structure for unsupported methods in cgofuse. For example we could make a call which takes a []string
of unsupported fuse_operations
and these could get zeroed out in the ops structure. So rather like the host.SetCapCaseInsensitive
method, we could make a host.SetUnsupported(names []string)
method.What do you think? Any of those sound like a good idea?
I got a log back from the user. It shows quite clearly that Setxattr was called to set the extended attribute $KERNEL.PURGE.SEC.FILEHASH as part of the creating the file and that rclone returned ENOSYS from it.
Thanks for confirming.
I find it rather unusual that anyone would try to set Extended Attributes (EAs in Windows parlance) during the Create
call. I certainly have not observed it in the wild outside of specialized software or test apps. I note that this feature is not exposed to applications via the regular Win32 API. The EA name of $KERNEL.PURGE.SEC.FILEHASH
seems suggestive of something although I am unsure what.
The sensible fix here is to assume success when setxattr
fails with -ENOSYS
during Create
. This is what happens for other calls such as chown
and chflags
. The reason that I said earlier that "I do not have a good fix for this" is because the FILE_NO_EA_KNOWLEDGE
option to the NtCreateFile
system call complicates handling of EAs (when this flag is set we are supposed to fail Create
calls that do not understand EAs -- however this may not be important in practice).
An additional fix would be to avoid enabling EA support during initialization of the file system. Currently this is done with the following code:
if (0 != f->ops.listxattr && 0 != f->ops.getxattr &&
0 != f->ops.setxattr && 0 != f->ops.removexattr)
f->VolumeParams.ExtendedAttributes = 1;
Obviously this does not work if the file system has empty definitions of these operations that return -ENOSYS
, like cgofuse does. A potential workaround is to issue a getxattr
call to the root directory (/
) and treat an -ENOSYS
response as an indicator that the file system does not support EAs.
I will look look into implementing either the first one or both of these fixes in WinFsp-FUSE, which should resolve this issue.
rclone could just return success for the xattr calls and null results. Probably not ideal and I'm not sure what Windows would do in this case if rclone lost the extended attributes it was supposed to keep.
FUSE file systems already "approximate" the Windows file system. For example, FUSE file systems do not support full ACLs, file attributes or reparse points. I think there are fewer Windows applications that care about EAs rather than other features that FUSE does not support.
we could figure out a way to get NULLs into the fuse_operations structure for unsupported methods in cgofuse. For example we could make a call which takes a []string of unsupported fuse_operations and these could get zeroed out in the ops structure. So rather like the host.SetCapCaseInsensitive method, we could make a host.SetUnsupported(names []string) method.
That is an interesting idea and worth keeping in mind if we fail to provide a simpler fix.
If you compile and run this program
It prints this
However if you disable CGO and compile it you get a different answer with O_EXCL and O_TRUNC swapped over.
I believe the non CGO version is correct, at least that agrees with what I receive from WinFSP. It is defined here
https://github.com/winfsp/cgofuse/blob/f0ad03169e0f07409feb7e3db95fba57fab5cfc6/fuse/fsop_nocgo_windows.go#L107-L108
whereas the CGO version is defined here - this also looks correct
https://github.com/winfsp/cgofuse/blob/f0ad03169e0f07409feb7e3db95fba57fab5cfc6/fuse/fsop_cgo.go#L249-L250
So something a little wacky is going on!
This is compiling on Windows 10 with go1.18.1 and gcc 8.1.0 provided by MinGW using latest released cgofuse v1.5.0 and latest stable WinFSP (2022 installed from
1.10.220006.msi
).This was originally reported in the rclone forum here: https://forum.rclone.org/t/cannot-copy-files-to-mounted-azure-storage-windows/30092 if you want some background.
I suspect that the problem might be with the WinFSP include files, but I couldn't figure out where the constants were defined.
You said before in https://github.com/winfsp/cgofuse/issues/19#issuecomment-344405666 where we discussed a very similar issue
So I wonder if I'm including the Cygwin version of the fuse header files or something like that? This is the include I use
Any help gratefully received!