winfsp / cgofuse

Cross-platform FUSE library for Go - Works on Windows, macOS, Linux, FreeBSD, NetBSD, OpenBSD
https://winfsp.dev
MIT License
511 stars 82 forks source link

Can't set direct_io for open files #38

Closed ncw closed 4 years ago

ncw commented 4 years ago

This is very related to #5 but a different need!

I'd like to be able to set direct_io on certain files. In particular I'd like to be able to set it on files where rclone doesn't know the length in advance (eg Google docs files).

Rclone reports the length of these files as 0 and without this flag 0 bytes is all that can be read. However setting this flag allows the files to be read even though their size reads as 0.

I imaging this is how the files in /proc are implemented which all have 0 size but you can still cat them.

This works very well with linux allowing these strange 0 size files to function a bit better.

It looks like OSXFUSE supports this too (well that is the mount option - I imagine the Open option is supported too).

Is this something which can be done in Windows file systems?

To implement this it would need the interface changing to the Open and Create calls. If you wanted to be 100% compatible then this could return a specific errc meaning everything OK but set the direct io flag. Or we could make an extended Open call which is used in preference over Open.

I'd be happy to help with this if you think it is a good idea.

billziss-gh commented 4 years ago

direct_io on cgofuse

This should be easy to fix. We should keep it backwards compatible or bump up the version to 2.0. (Bumping it to 2.0 would also allow us to fix ugliness such as FileSystemChflags, etc.)

direct_io on WinFsp

Things are complicated:

ncw commented 4 years ago

Do you think just doing the first part direct_io on cgofuse would be an improvement to the status quo?

That would help users of cgofuse on non Windows platforms.

billziss-gh commented 4 years ago

I am happy to add direct_io support as long as it is understood that it will not work as intended on WinFsp.

What do you propose the interface for it to be? I would think a new OpenEx or Open2 call that allows the file system to return the full fuse_file_info structure somehow.

ncw commented 4 years ago

I am happy to add direct_io support as long as it is understood that it will not work as intended on WinFsp.

Great!

What do you propose the interface for it to be? I would think a new OpenEx or Open2 call that allows the file system to return the full fuse_file_info structure somehow.

The current prototype is

Open(path string, flags int) (errc int, fh uint64)

I'd propose something like

OpenEx(path string, flags int) (errc int, fh uint64, openFlags uint32)

Where openFlags is a sanitized version of the flags from fuse_file_info

These are the only ones bazil fuse supports. Not sure what the OS X ones do!

    OpenDirectIO    OpenResponseFlags = 1 << 0 // bypass page cache for this open file
    OpenKeepCache   OpenResponseFlags = 1 << 1 // don't invalidate the data cache on open
    OpenNonSeekable OpenResponseFlags = 1 << 2 // mark the file as non-seekable (not supported on OS X)

    OpenPurgeAttr OpenResponseFlags = 1 << 30 // OS X
    OpenPurgeUBC  OpenResponseFlags = 1 << 31 // OS X
billziss-gh commented 4 years ago

IMO if we introduce OpenEx we should support the full native fuse_file_info to anticipate future requirements.

So I propose that the OpenEx method has a prototype as below (assuming a suitably defined FileInfo struct). The file system implementation is expected to fill the FileInfo struct (similar to what is being done in GetAttr for Stat_t).

/*
 * Go
 */

type FileInfo struct {
    // ...
}

// ...

OpenEx(path string, flags int, fi *FileInfo) int

For reference here are the C definitions for FUSE2 and FUSE3:

/*
 * C
 */

// FUSE2 fuse_file_info
struct fuse_file_info
{
    int flags;
    unsigned int fh_old;
    int writepage;
    unsigned int direct_io:1;
    unsigned int keep_cache:1;
    unsigned int flush:1;
    unsigned int nonseekable:1;
    unsigned int padding:28;
    uint64_t fh;
    uint64_t lock_owner;
};

// FUSE3 fuse_file_info
struct fuse3_file_info
{
    int flags;
    unsigned int writepage:1;
    unsigned int direct_io:1;
    unsigned int keep_cache:1;
    unsigned int flush:1;
    unsigned int nonseekable:1;
    unsigned int flock_release:1;
    unsigned int padding:27;
    uint64_t fh;
    uint64_t lock_owner;
    uint32_t poll_events;
};

What timeframe would you need this or a similar change in?

ncw commented 4 years ago

IMO if we introduce OpenEx we should support the full native fuse_file_info to anticipate future requirements.

That looks like it is future proof :-)

What timeframe would you need this or a similar change in?

I haven't got a pressing need for this if Windows can't support direct_io, it would just enable the two different mounting systems to achieve parity in rclone.

This should be easy to fix. We should keep it backwards compatible or bump up the version to 2.0. (Bumping it to 2.0 would also allow us to fix ugliness such as FileSystemChflags, etc.)

Note that there be dragons with go modules and going from v1 to v2 if you haven't seen that already.

billziss-gh commented 4 years ago

I have a couple of fires to fight right now, plus my regular development of winfuse. If I have not done this in a week or 2, please ping me!

billziss-gh commented 4 years ago

My fires ended up fighting themselves for now :) So I found some time to get this done. See commit: 682213d345c0dfa666bdb18fe641cc3a586f9c82. Feedback welcome.

Of course this raises the question of whether we should support FileInfo_t on Read, Write, etc. as is done in native FUSE with struct fuse_file_info. I do not think so, but wanted to raise this nevertheless.

EDIT 1: the OpendirEx method may be superfluous.

EDIT 2: here is an alternative commit without OpendirEx: 8e41a0ce969b7c07d55751a9ad4e1941f285f620

billziss-gh commented 4 years ago

Ping @ncw

ncw commented 4 years ago

My fires ended up fighting themselves for now :)

The best kind of fires!

Apologies for my own delay responding.

So I found some time to get this done. See commit: 682213d. Feedback welcome.

Of course this raises the question of whether we should support FileInfo_t on Read, Write, etc. as is done in native FUSE with struct fuse_file_info. I do not think so, but wanted to raise this nevertheless.

Looking at the fuse docs almost none of those flags apply to read and write. Those that do, do not look useful to me so I'd agree that not supporting FileInfo_t on anything other than Open is probably sufficient.

EDIT 1: the OpendirEx method may be superfluous.

The only flag that opendir uses is cache_readdir which

signals the kernel to enable caching of entries returned by readdir().

I think that for fuse file systems the global control over direntry caching is surely enough.

EDIT 2: here is an alternative commit without OpendirEx: 8e41a0c

I think I prefer that. Narrowing the interface is good.

In fact we might consider narrowing the interface further to this which would have the advantage that the user need only implement one or the other. This appears more go style to me.

type FileSystemCreateEx interface {
   CreateEx(path string, mode uint32, fi *FileInfo_t) int
 }

type FileSystemOpenEx interface {
   OpenEx(path string, fi *FileInfo_t) int
 }
billziss-gh commented 4 years ago

In fact we might consider narrowing the interface further to this which would have the advantage that the user need only implement one or the other. This appears more go style to me.

type FileSystemCreateEx interface {
  CreateEx(path string, mode uint32, fi *FileInfo_t) int
 }

type FileSystemOpenEx interface {
  OpenEx(path string, fi *FileInfo_t) int
 }

That was my first attempt, but then I run into the implementation of hostCreate:

func hostCreate(path0 *c_char, mode0 c_fuse_mode_t, fi0 *c_struct_fuse_file_info) (errc0 c_int) {
    defer recoverAsErrno(&errc0)
    fsop := hostHandleGet(c_fuse_get_context().private_data).fsop
    path := c_GoString(path0)
    intf, ok := fsop.(FileSystemOpenEx)
    if ok {
        fi := FileInfo_t{Flags: int(fi0.flags)}
        errc := intf.CreateEx(path, uint32(mode0), &fi)
        if -ENOSYS == errc {
            errc = fsop.Mknod(path, S_IFREG|uint32(mode0), 0)
            if 0 == errc {
                errc = intf.OpenEx(path, &fi)
            }
        }
        c_hostAsgnCfileinfo(fi0,
            c_bool(fi.DirectIo),
            c_bool(fi.KeepCache),
            c_bool(fi.NonSeekable),
            c_uint64_t(fi.Fh))
        return c_int(errc)
    } else {
        errc, rslt := fsop.Create(path, int(fi0.flags), uint32(mode0))
        if -ENOSYS == errc {
            errc = fsop.Mknod(path, S_IFREG|uint32(mode0), 0)
            if 0 == errc {
                errc, rslt = fsop.Open(path, int(fi0.flags))
            }
        }
        fi0.fh = c_uint64_t(rslt)
        return c_int(errc)
    }
}

If we have 2 separate interfaces then the implementation becomes more complicated. Of course this is not the real problem, but it made me realize that a file system that supports OpenEx should probably also support CreateEx. The semantics of Create is to create a new file and Open it. Likewise the semantics of CreateEx are to create a new file and OpenEx it.

ncw commented 4 years ago

That was my first attempt, but then I run into the implementation of hostCreate:

If we have 2 separate interfaces then the implementation becomes more complicated. Of course this is not the real problem, but it made me realize that a file system that supports OpenEx should probably also support CreateEx. The semantics of Create is to create a new file and Open it. Likewise the semantics of CreateEx are to create a new file and OpenEx it.

That is a good argument that they should be in the same interface.

OK so EDIT 2 for the win?

billziss-gh commented 4 years ago

I have merged commit 8e41a0ce969b7c07d55751a9ad4e1941f285f620 ("EDIT 2") into master. Please pull and see if it works for you.

billziss-gh commented 4 years ago

@ncw let me know if you are happy with these changes so we can close this.

s3rj1k commented 4 years ago

@billziss-gh Can we have a simple example in how to use direct_io ?

ncw commented 4 years ago

@ncw let me know if you are happy with these changes so we can close this.

Sorry for the long delay!

I've implemented this now and it works very well thank you @billziss-gh

To answer @s3rj1k 's question - you'll need to implement the optional OpenEx and CreateEx methods. If you do that you can set direct io in them. Here is a snippet from rclone code

func (fsys *FS) OpenEx(path string, fi *fuse.FileInfo_t) (errc int) {
    defer log.Trace(path, "flags=0x%X", fi.Flags)("errc=%d, fh=0x%X", &errc, &fi.Fh)
    fi.Fh = fhUnset

    // translate the fuse flags to os flags
    flags := translateOpenFlags(fi.Flags)
    handle, err := fsys.VFS.OpenFile(path, flags, 0777)
    if err != nil {
        return translateError(err)
    }

    // If size unknown then use direct io to read
    if entry := handle.Node().DirEntry(); entry != nil && entry.Size() < 0 {
        fi.DirectIo = true
    }

    fi.Fh = fsys.openHandle(handle)
    return 0
}
billziss-gh commented 4 years ago

@ncw thanks. I am closing this and will follow up with a new release shortly.