pkg / sftp

SFTP support for the go.crypto/ssh package
BSD 2-Clause "Simplified" License
1.5k stars 379 forks source link

Provide some FS API for Server #95

Closed an2deg closed 7 years ago

an2deg commented 8 years ago

In my project I would like to use github.com/pkg/sftp as an embedded SFTP server. Would be nice to make Server more extendable in terms of working with FS or having the following features:

  1. Restrict user to use only specific directory (e.g. chroot)
  2. Provide some kind of notifications when file upload is completed or user session is closed

In my pull request #94 I've moved all file operations into FileStorageBackend and made it replaceable. Maybe this solution is little bit dirty so let's workshop here the better solution

davecheney commented 8 years ago

Trying to scope the connection to a specific director is #92. I'm very cautious about doing anything here because I don't want to over promise. The second we say that the sftp server cannot break out of this chroot, we're writing a cheque that someone else is trying to cash.

With respect to the rest of this issue, a file upload notification should not require adding an entire virtual fs layer, and closing a user session has nothing to do with a virtual fs layer.

I'm really not keen on adding a virutal fs layer because there are always so many around for Go (there are two in the x/tools repo) and none have attracted any sort of following, adding yet another won't help the situation. Is there is another way to acomplish these goals without adding a vfs layer ?

an2deg commented 8 years ago

What do you think about using callbacks?

For example let's use the signature for callback function: type SftpCallback(pktType, serverRespondablePacket) (serverRespondablePacket, error)

And Server structure will have a field with that type which by default could be null or a function with does nothing .

With those changes we can use it in sftpServerWorker() function right after decodePacket():

    dPkt, err := svr.decodePacket(pkt.pktType, pkt.pktBytes)
    if err != nil {
        fmt.Fprintf(svr.debugStream, "decodePacket error: %v\n", err)
        doneChan <- err
        return
    }

            dPkt, err = srv.callback(pkt.pktType, dPkt)
            if err != nil {
        fmt.Fprintf(svr.debugStream, "callback error: %v\n", err)
        doneChan <- err
        return
    }

It will allow a user to react to some types of packets. For example we can do some emulation of chroot here and send some notifications.

My usecase is relatively simple: I want to allow user to upload some files by SFTP and when upload finished - start processing those files. Maybe you could come up with better solution.

eikenb commented 8 years ago

I am also interested in more flexibility for server side storage. In my case it is to implement an SFTP to S3 gateway service. I was considering a similar abstraction/interface for either the files or the filesystem as well.

Issue #66 and #84 are related to this and have similar proposals and in the discussion of issue #66 another use case was to provide access to "database/compressed archive". In the discussion there was also a comment by @marksheahan about how it might make more sense to wrap the packets instead of the filesystem. This leads to an idea similar to #84 except the routing/handlers would be based on the incoming packet types. Handlers based on packets would look a lot more like a set of callbacks, but modeling them on the http handlers as #84 suggests would be a nice way to present the API.

I'm working on this project now and am trying to avoid a SFTP/S3FS hack with a proper gateway service so am motivated to start working on something as soon as there is a consensus.

davecheney commented 8 years ago

@eikenb I'm sorry but I don't have time to drive @mdlayher 's #66. It's up to you two to work on that. If you can do it in a way that is decoupled from this package, even better, then I won't be a blocker on your work.

I would be happy to change this package to work with some kind of vfs shaped thing with the following requirements:

Thanks

Dave

eikenb commented 8 years ago

@davecheney .. As far as VFS things go, what about @an2deg's pull request #94? I'm not 100% that a filesystem abstraction is the way to go, but if so then that one seems decent at first glance.

davecheney commented 8 years ago

94 is sadly what I do not want to see. It's a huge interface, and will

force this package into the business of competing with all the other huge vfs implementations already proposed. None of these existing implementations have achieved any traction, so it don't see why adding yet another would do any better.

Sorry about this, but I don't want to see this package being used as a beachhead for fighting a war about a vfs implementation in Go.

On Tue, 10 May 2016, 06:37 John Eikenberry, notifications@github.com wrote:

@davecheney https://github.com/davecheney .. As far as VFS things go, what about @an2deg https://github.com/an2deg's pull request #94 https://github.com/pkg/sftp/pull/94? I'm not 100% that a filesystem abstraction is the way to go, but if so then that one seems decent at first glance.

— You are receiving this because you were mentioned.

Reply to this email directly or view it on GitHub https://github.com/pkg/sftp/issues/95#issuecomment-217981875

eikenb commented 8 years ago

The size of that interface is off-putting, but given SFTP's protocol I don't think there is a way around having a large number of calls on the interface if you try to abstract the backend out at the file/filesystem level.

I'm thinking a system based on packets instead of the filesystem would work much better. For each packet-type you would register a packet/struct that meets the serverRespondablePacket{} interface and the packet's respond() method could do whatever you needed. The packet structs and a few other things would need to be made public and the big case statement in decodePacket() would be replaced by a function that would register the 'default' file based handlers.

eikenb commented 8 years ago

I have a basic version working that allows for redefining the packet handling. It works well and doesn't require big changes aside from exporting a lot of formerly private stuff.

I plan to clean it up and submit as 2 pull requests. The first will only do the renaming to export what is needed for the external handlers. The second will add the handler registration functions, change decodePacket() to use them and register the existing default (filesystem) handlers. Having the pull requests to reference seems like the easiest way to talk about it but if you would prefer something else please just let me know. I'll get the pull requests done in the next few days.

davecheney commented 8 years ago

Sgtm.

On Fri, 13 May 2016, 08:31 John Eikenberry, notifications@github.com wrote:

I have a basic version working that allows for redefining the packet handling. It works well and doesn't require big changes aside from exporting a lot of formerly private stuff.

I plan to clean it up and submit as 2 pull requests. The first will only do the renaming to export what is needed for the external handlers. The second will add the handler registration functions, change decodePacket() to use them and register the existing default (filesystem) handlers. Having the pull requests to reference seems like the easiest way to talk about it but if you would prefer something else please just let me know. I'll get the pull requests done in the next few days.

— You are receiving this because you were mentioned.

Reply to this email directly or view it on GitHub https://github.com/pkg/sftp/issues/95#issuecomment-218904580

oleksandr commented 8 years ago

@eikenb is there any chance to see your pull requests? I'm struggling with exactly the same feature: SFTP->S3 and S3FS is rather a compromise 👎

eikenb commented 8 years ago

The pull requests are up. Pull request #96 contains the name changes to export what is needed for externalizing packet handling. Pull request #97 contains an abstracted out Responder interface that lets you specify new packet handlers. PR #97 probably needs some additional work for the final version.

Please let me know what you think. Thanks.

davecheney commented 8 years ago

Thank you for sending #97, but this is not what I thought was going to happen. I thought that you were proposing something like the http api,

sftp.Serve(path, handlerFunc())

Where handler would get some kind of request structure that had the operation, READ, OPEN, READDIR, DELETE, etc as well as a few bits of metadata like the path to operate on.

This is the API I was expecting to see, I'm sorry if I did not make myself clear.

eikenb commented 8 years ago

That "some kind of request structure"... in order to be extended wouldn't that need to be a large VFS-like interface that you wanted to avoid? I guess it could use a callback system to specify the behavior of each operation instead of an interface. Is that what you had in mind?

davecheney commented 8 years ago

I was thinking of something like http.Request, an operation "OPEN", "DELETE", "READDIR", and a path, possibly two. Have a look at the various inotify fswatch packages out there, they have a structure like this, an operation, what happened, and a few pieces of metadata that give extra detail, usually one path, but sometimes two in the case of rename.

On Wed, May 18, 2016 at 4:03 AM, John Eikenberry notifications@github.com wrote:

That "some kind of request structure"... in order to be extended wouldn't that need to be a large VFS-like interface that you wanted to avoid? I guess it could use a callback system to specify the behavior of each operation instead of an interface. Is that what you had in mind?

— You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub https://github.com/pkg/sftp/issues/95#issuecomment-219802655

eikenb commented 8 years ago

Thanks for the feedback here an on the PRs. I agree that the packet-based setup in the PRs was not that great, but I thought it better to propose something and get feedback than discuss things sans code. I'll be back after I look into the inotify packages.

After a quick googling I found a good number of them, maybe you could chime in on which ones you were thinking of..

https://github.com/fsnotify/fsnotify https://github.com/rjeczalik/notify https://github.com/fsnotify/fsevents https://github.com/codeskyblue/fswatch https://github.com/cortesi/modd https://github.com/emcrisostomo/fswatch https://github.com/Unknwon/bra

davecheney commented 8 years ago

I don't have a preferred one. The inspiration comes from the underlying kevent struct that Linux returns which is just an operation and one or two paths (other inotify methods in other os's work similarly but lack metadata which makes it hard to detect operations like rename)

On Thu, 19 May 2016, 06:28 John Eikenberry, notifications@github.com wrote:

Thanks for the feedback here an on the PRs. I agree that the packet-based setup in the PRs was not that great, but I thought it better to propose something and get feedback than discuss things sans code. I'll be back after I look into the inotify packages.

After a quick googling I found a good number of them, maybe you could chime in on which ones you were thinking of..

https://github.com/fsnotify/fsnotify https://github.com/rjeczalik/notify https://github.com/fsnotify/fsevents https://github.com/codeskyblue/fswatch https://github.com/cortesi/modd https://github.com/emcrisostomo/fswatch https://github.com/Unknwon/bra

— You are receiving this because you were mentioned.

Reply to this email directly or view it on GitHub https://github.com/pkg/sftp/issues/95#issuecomment-220147982

eikenb commented 8 years ago

The inotify packages basically have an event handling system where the event has a type and some data. This works well for what they are intended (event hooks) but lacks the abstraction for communicating back to the client. That requires a means of responding that would leverage the existing marshalling and packet handling code.

I propose a simple Handle method that would take one of a half dozen-ish Interfaces as callbacks to abstract the actual data handling. I'm thinking it would keep things at a higher level than the packet handling of the existing server. Like it would get a single Read event for the whole file instead of per packet/offset. I think all the functionality can be captured using these interfaces (I'm not sure about all the names).

Reader (give data until done, then EOF) Writer (take data until done, then EOF) Opener (take filename, pflags, attrs and return string handle) FileActor (close, delete, etc. that take filename and return status) Renamer (special case, 2 names) Stater (take name return attributes) ReadDirer (take dirname and return file list of path and attrs)

I know this is fairly abstract but hopefully it will give you an idea of what I'm thinking so you can voice any concerns. I think I should be able to dedicate more time to it now, having finished working on a different project I was pulled onto for the last few weeks.

eikenb commented 8 years ago

Reading it back again I'm not sure it was clear.. the API would still have the event based system like the inotify packages. It would additionally have the Handle interface as the means to communicate back to the client. The API this presents for people wanting to write custom backends would be that they'd receive the event, build the object with the appropriate interface and pass that to the Handle method.

eikenb commented 8 years ago

I'm making progress on this and some of the above things are not quite right. So just ignore them for now and I'll submit a pull request when I have something ready for comment.

eikenb commented 8 years ago

I'm close to having it ready for a pull request. I have the interfaces needed to handle the various things boiled down to 4 different types which vary based on the required return values. If you are curious, you can see the current state in the request branch of my fork (https://github.com/eikenb/sftp/tree/request). It is a decent amount of code (~1000 lines including tests) but doesn't require changing any of the existing code. The files for it are all prepended with 'request' (eg. request-server.go).

It needs a few more tests and some basic documentation before it will be ready for the pull request.

eikenb commented 8 years ago

I just posted Pull Request #127. It contains the work I've been doing on the inotify and net/http inspired SFTP backend. I mentioned this issue in my pull-request comment. It only introduces new files, not touching any existing code. Please take a look at your earliest convenience. Thanks.

eikenb commented 8 years ago

Redid Pull request, now is #129. Same as before, just with fewer bugs.

eikenb commented 8 years ago

Found the data race bug and fixed with locks but the locks really shouldn't be necessary, so I am reviewing request data structure to better deal with the access patterns.

eikenb commented 8 years ago

Redid Pull request; #130

Used every code checker I could think of.. 3rd times a charm?

eikenb commented 7 years ago

With my request based server merged, I'm saying this is done.