pkg / sftp

SFTP support for the go.crypto/ssh package
BSD 2-Clause "Simplified" License
1.5k stars 379 forks source link

request server: handle relative symlinks #526

Closed georgmu closed 1 year ago

georgmu commented 1 year ago

The request server implementation does not handle symlinks correctly. If the link target is relative it is rewritten to an absolute path. The first commit fixes that.

The second commit is just adding a testcase for linking to non-existent files.

drakkan commented 1 year ago

I don't have time to test this change now (I'll do it later today or this weekend), I think this change will allow symlinks outside the base directory. Applications enforcing a chroot such as SFTPGo should check the symlink target and normalize it themselves after this change. This is acceptable but maybe should be documented as a backward incompatible change.

georgmu commented 1 year ago

I just did a basic test with SFTPGo: Escaping the chroot (using cd or get) didn't seem to work - fine so far.

SFTPGo is restricting the link target of the symlink command pretty strong - but since symlinks could exist from other sources (directly in filesystem using ln)

But yes, SFTPGo would need to change its implementation to handle relative symlinks correctly. Maybe I should file a sepearate bug in SFTPGo, since symlink handling of relative paths is pretty easy to trigger using the command-line sftp command:

$ echo bar > bar
$ sftp -P 2022 user@localhost
Connected to localhost.
sftp> mkdir foo
sftp> cd foo
sftp> put bar
Uploading bar to /foo/bar
bar
sftp> rename bar renamed-bar # rename here if local pwd is same as home directory of user
sftp> symlink renamed-bar linked-bar
sftp> ls -l
lrwxrwxrwx    1 1000     1000           23 Sep 29 15:34 linked-bar
-rw-r--r--    1 1000     1000            4 Sep 29 15:33 renamed-bar
sftp> get linked-bar
Fetching /foo/linked-bar to linked-bar
Couldn't stat remote file: No such file or directory
sftp> ^D
$ ls -l foo/
insgesamt 8
lrwxrwxrwx. 1 user user 23 29. Sep 15:34 linked-bar -> /home/user/renamed-bar
-rw-r--r--. 1 user user  4 29. Sep 15:33 renamed-bar

Symlink linked-bar should either be relative (best option) or be /home/user/foo/renamed-bar (but this has the drawback of destroying the link by renaming directory foo to foo2)

georgmu commented 1 year ago

Just as a note: The last comment and tests were about the current SFTPGo implementation without the relative symlink patch applied to sftp.

drakkan commented 1 year ago

Just as a note: The last comment and tests were about the current SFTPGo implementation without the relative symlink patch applied to sftp.

Yes, I noticed this morning, after your initial comment that relative symlinks are not handled correctly in SFTPGo. I'll give a look this weekend, thanks

drakkan commented 1 year ago

I think this was broken by me here. @georgmu can you please revert that commit (or revert to the previous behavior) and confirm that your use case is working?

Ignore this comment. Even before my change pkg.getPath get cleaned in the NewRequest method.

drakkan commented 1 year ago

I've done some testing and this patch looks good to me. I think we just need to document somewhere that Filepath can be relative for symbolic links now. @puellanivis do you have any suggestions other than the above for test cases?

drakkan commented 1 year ago

@georgmu please add some test cases for relative links from a dir different from /, something like this:

sftp> pwd
Remote working directory: /dir
sftp> symlink link1.txt sub/file.txt
sftp> symlink sub/link2.txt file.txt

Thanks

puellanivis commented 1 year ago

I've done some testing and this patch looks good to me. I think we just need to document somewhere that Filepath can be relative for symbolic links now. @puellanivis do you have any suggestions other than the above for test cases?

Nope, not really. Tried my hardest, and only the test code was something I could see anything at all in.

georgmu commented 1 year ago

I will add some more testcases, also the one mentioned in my comment about moving the parent directory. I will also try the testcases with 98b35dcf reverted (just to check that it worked before).

georgmu commented 1 year ago
sftp> pwd
Remote working directory: /dir
sftp> symlink link1.txt sub/file.txt
sftp> symlink sub/link2.txt file.txt

just to make sure what these commands do as checked with sftp CLI client and standard sftp server:

I would reverse the names so that the symlinks are named link*.txt and the files they are pointing to named file*.txt

drakkan commented 1 year ago
sftp> pwd
Remote working directory: /dir
sftp> symlink link1.txt sub/file.txt
sftp> symlink sub/link2.txt file.txt

just to make sure what these commands do as checked with sftp CLI client and standard sftp server:

  • symlink link1.txt sub/file.txt will create symlink /dir/sub/file.txt

    • /dir/sub has to exist
    • link target: link1.txt (which would effectively be file /dir/sub/link1.txt)
  • symlink sub/link2.txt file.txt will create symlink /dir/file.txt

    • /dir/sub does not have to exist
    • link target: sub/link2.txt (which would effectively be file /dir/sub/link2.txt)

I would reverse the names so that the symlinks are named link*.txt and the files they are pointing to named file*.txt

Yes, reverse the names, thanks

georgmu commented 1 year ago

@georgmu please add some test cases for relative links from a dir different from /, something like this:


sftp> pwd
Remote working directory: /dir

Thanks

As I understand sftp, pwd is solely a client feature keeping track of the current directory. So the unit tests can only use /dir/sub/... etc. ...

drakkan commented 1 year ago

@georgmu please add some test cases for relative links from a dir different from /, something like this:

sftp> pwd
Remote working directory: /dir

Thanks

As I understand sftp, pwd is solely a client feature keeping track of the current directory. So the unit tests can only use /dir/sub/... etc. ...

Yes, you are right, you already included a test like this:

err = p.cli.Symlink("f1", "/subdir/linked_f1")
require.NoError(t, err)

I added such tests to SFTPGo because I have to handle other things there

georgmu commented 1 year ago

One more thing here: I was also testing Readlink() results and they are manipulated as well for absolute links.

From my understanding, a symlink is just a textfile whose content gets interpreted by the virtual file system. So if I write /foo/bar into a textfile I expect the content to be /foo/bar whenever I read it.

chroot'ing does not alter the content ("target") of the link, only the interpretation of the content.

This is how symlinks work and I think the library should do the same: not interpreting the link target.

What a chroot-in-software (what SFTPGo does) should do sanity checks whenever reading a link (like interpreting absolute links as links from the chroot and check that relative links do not escape the chroot), but the library should return them as is.

drakkan commented 1 year ago

One more thing here: I was also testing Readlink() results and they are manipulated as well for absolute links.

From my understanding, a symlink is just a textfile whose content gets interpreted by the virtual file system. So if I write /foo/bar into a textfile I expect the content to be /foo/bar whenever I read it.

chroot'ing does not alter the content ("target") of the link, only the interpretation of the content.

  • lets assume a symlink /subdir/link.txt with link target /foo/bar
  • when reading /subdir/link.txt the content of /foo/bar is returned
  • when in a chroot /subdir, reading /link.txt returns the content of /foo/bar in the chroot, which is effectively /subdir/foo/bar from outside of the chroot

This is how symlinks work and I think the library should do the same: not interpreting the link target.

The library should pass the cleaned path for readlink requests, the request server implementations (SFTPGo or others) can handle the request according to their own logic, which could be the one you describe or a different one

What a chroot-in-software (what SFTPGo does) should do sanity checks whenever reading a link (like interpreting absolute links as links from the chroot and check that relative links do not escape the chroot), but the library should return them as is.

SFTPGo resolves the link and if it is outside the chroot directory it will return an error like this:

"level":"error","time":"2022-09-30T15:56:24.580","sender":"osfs","connection_id":"43bea5101a92edede8edbeebdec1ded65406186c4b875013eed73d1c332de9c5","message":"Invalid path resolution, path \"/tmp/test_home/foo/bar\" original path \"/link.txt\" resolved \"/tmp/test_home/subdir/link.txt\" err: Path resolution error: path \"/tmp/test_home/foo/bar\" is not inside \"/tmp/test_home/subdir\""}
georgmu commented 1 year ago

The library should pass the cleaned path for readlink requests

Currently, it returns a relative path, even if an absolute path is generated.

p.cli.Symlink("/foo", "/bar")
...
p.cli.Readlink("/bar") // <- "foo" instead of "/foo" 

I just checked the code: openssh sftp-server also just returns the content of readlink(2) and ln does not clean the symlink when generating one (ln -s /a/b/../c foo will have exactly /a/b/../c as target instead of cleaned /a/c.

georgmu commented 1 year ago

The problem with ReadLink is here: https://github.com/pkg/sftp/blob/3aa4378d78d53fd5b7fad0b4ad0f0cd61ef2fee3/request.go#L584

file.Name() only returns the base name of the file (as specified in docs for os.FileInfo), but this is wrong here.

Solution would be to depend on something other than os.FileInfo here or return a custom Sys() result.

drakkan commented 1 year ago

ah ok, now I better understand what you mean. In SFTPGo I use this workaround

https://github.com/drakkan/sftpgo/blob/main/internal/sftpd/handler.go#L263

I have custom fileinfo and for the Readlink case I set fullName to true and so Name() returns the full path

https://github.com/drakkan/sftpgo/blob/1e21aa945338dae031fcf49495f2f802fcc0ffd6/internal/vfs/fileinfo.go#L37

I agree we should fix this issue in a better way (I use the same hack also for FTP).

drakkan commented 1 year ago

The library should pass the cleaned path for readlink requests

Currently, it returns a relative path, even if an absolute path is generated.

p.cli.Symlink("/foo", "/bar")
...
p.cli.Readlink("/bar") // <- "foo" instead of "/foo" 

I just checked the code: openssh sftp-server also just returns the content of readlink(2) and ln does not clean the symlink when generating one (ln -s /a/b/../c foo will have exactly /a/b/../c as target instead of cleaned /a/c.

As for the server side path normalization, the library always cleaned paths (I'm not the original author), the only expection is the request.Filepath for symlinks introduced in this PR. I suggest to discuss this in a separate issue/PR, thanks

georgmu commented 1 year ago

I just uploaded a solution for both symlink and readlink stuff with extended test cases.

The readlink fix is a bit hacky and requires some documentation and/or another type.

drakkan commented 1 year ago

I'm not sure about the readlink fix. You have to use a custom fileinfo implementing the new LinkReader interface.

Readlink is currently broken, I think we can break backward compatibility for something that does not work or that works with hacks like the one I used in SFTPGo. @puellanivis what do you think about? (I already know you disagree :smile: )

drakkan commented 1 year ago

@georgmu I'm now using your initial patch in SFTPGo

https://github.com/drakkan/sftpgo/commit/0e8c41bbd1e861d3261852536d408ac88c7eba47#diff-d48bed73cb243b5300108a318eea6494652d40093322e498e5fd5d098725ef33 https://github.com/drakkan/sftp/commit/e8c89afc13a78badebfd48f90b3ccefb5223017f

thanks!

drakkan commented 1 year ago

We could do something like this:

diff --git a/request-interfaces.go b/request-interfaces.go
index 2e5ee6b..ab5f5a1 100644
--- a/request-interfaces.go
+++ b/request-interfaces.go
@@ -94,7 +94,7 @@ type LstatFileLister interface {
 //
 // Up to v1.13.5 the signature for the RealPath method was:
 //
-// RealPath(string) string
+// # RealPath(string) string
 //
 // we have added a legacyRealPathFileLister that implements the old method
 // to ensure that your code does not break.
@@ -104,6 +104,13 @@ type RealPathFileLister interface {
        RealPath(string) (string, error)
 }

+// ReadlinkFileLister is a FileLister that implements the Readlink method.
+// TODO: complete docs
+type ReadlinkFileLister interface {
+       FileLister
+       Readlink(string) (string, error)
+}
+
 // This interface is here for backward compatibility only
 type legacyRealPathFileLister interface {
        FileLister
diff --git a/request.go b/request.go
index 6a7e6cf..4db3f54 100644
--- a/request.go
+++ b/request.go
@@ -295,7 +295,25 @@ func (r *Request) call(handlers Handlers, pkt requestPacket, alloc *allocator, o
                return filecmd(handlers.FileCmd, r, pkt)
        case "List":
                return filelist(handlers.FileList, r, pkt)
-       case "Stat", "Lstat", "Readlink":
+       case "Stat", "Lstat":
+               return filestat(handlers.FileList, r, pkt)
+       case "Readlink":
+               if readlinkFileLister, ok := handlers.FileList.(ReadlinkFileLister); ok {
+                       resolved, err := readlinkFileLister.Readlink(r.Filepath)
+                       if err != nil {
+                               return statusFromError(pkt.id(), err)
+                       }
+                       return &sshFxpNamePacket{
+                               ID: pkt.id(),
+                               NameAttrs: []*sshFxpNameAttr{
+                                       {
+                                               Name:     resolved,
+                                               LongName: resolved,
+                                               Attrs:    emptyFileStat,
+                                       },
+                               },
+                       }
+               }
                return filestat(handlers.FileList, r, pkt)
        default:
                return statusFromError(pkt.id(), fmt.Errorf("unexpected method: %s", r.Method))

@puellanivis, @georgmu do you have other/better suggestions? thanks

drakkan commented 1 year ago

I try to summarize:

type ReadlinkFileLister interface {
    FileLister
    Readlink(string) (string, error)
}

we'll break backward compatibility in v2.

If we all agree with this approach we can add the ReadlinkFileLister interface.

drakkan commented 1 year ago

@georgmu are you interested to add the ReadlinkFileLister interface as discussed? I'm busy with many other things and I don't care about the authorship of the code pasted above. Thanks

georgmu commented 1 year ago

@georgmu are you interested to add the ReadlinkFileLister interface as discussed? I'm busy with many other things and I don't care about the authorship of the code pasted above. Thanks

Sure. I will prepare the commits.

georgmu commented 1 year ago

I prepared a commit for the ReadlinkFileLister. I am not the best when it comes to documentation. If I should extend the notes, please just give me a hint on what to improve.

drakkan commented 1 year ago

I prepared a commit for the ReadlinkFileLister. I am not the best when it comes to documentation. If I should extend the notes, please just give me a hint on what to improve.

Thanks! You are much better than me at writing docs :smile: The PR LGTM. Maybe you can also test returning an error from a readlink request. There is no coverage:

Schermata del 2022-10-12 12-28-28