Open awaken opened 3 years ago
Hm… yeah, that does seem like a confusing error to be returning. Looking into the Go source code for os.Remove
https://cs.opensource.google/go/go/+/refs/tags/go1.17:src/os/file_unix.go;l=314-322
🤔 Hm… curious. Not sure if your proposed change would actually fix things for all cases, or just for this one case. I’ll be on a plane flying back to Germany in 24-hours, so I’ll ponder on things and try and get something going when I get back.
Alternatively, I’m supposed to be working on reimplementing https://github.com/pkg/sftp/pull/432 (there were so many merge conflicts rebasing the PR, I’ve moved on to reimplementing) and this seems like something that would be good to address in that larger refactor, rather than giving too much priority into a confusing error message. 🤔
P.S.: Do you happen to know which server implementation is on the other end when you’re getting this error message? Ideally, we could address this a little better, if we knew what was generating the error.
Hi, that's a server from one of our clients. I can only provide the motd: SSH-2.0-mod_sftp/0.9.9
I forgot to mention that the "Not a directory" error messages (that I experienced) is always of SSH_FX_FAILURE kind.
Hi, that's a server from one of our clients. I can only provide the motd: SSH-2.0-mod_sftp/0.9.9
This is ProFTPD
Hm… yeah, that does seem like a confusing error to be returning. Looking into the Go source code for
os.Remove
https://cs.opensource.google/go/go/+/refs/tags/go1.17:src/os/file_unix.go;l=314-322🤔 Hm… curious. Not sure if your proposed change would actually fix things for all cases, or just for this one case. I’ll be on a plane flying back to Germany in 24-hours, so I’ll ponder on things and try and get something going when I get back.
Let's say some SFTP user can properly read a file, but he doesn't have any writing permission.
In this case, any Client.Remove
attempt will produce a "Not a directory" error.
In my opinion, it is more than just a wrong message, because the error is claiming an incorrect reason about the removal failure. Right now, my application can notify wrong failure reasons.
It seems to me that my proposed fix acts just like os.Remove
, about the logic for returning the error.
If the file removal attempt fails because the file is a directory, then return the error obtained from the directory removal attempt, otherwise just return the error from the file removal attempt.
That's the idea, right?
The problem here is that the os.Remove
only returns the syscall.Unlink
error if the syscall.Rmdir
is ENOTDIR
. For instance, if the filesystem object exists, is a directory, and is not empty, os.Remove
will return “directory not empty” not “is a directory”.
The change you have proposed would return “is a directory” for a non-empty directory, with an error message that is now just as confusing as returning not a directory
for a file in a directory without write permission. (Note: a user without write permission to a file can unlink()
that file as long as they have write permission to the directory containing it. This is a known security loophole, where someone can copy a file, and recreate it with write permissions (thus escalating permission access to that file) as long as they have write permission to the directory itself.)
So, I’m starting to think that matching the implementation used by os.Remove
is the cause of this issue. Thinking about why os.Remove
exists as both file and directory deletion, I figured it might be because Plan 9 only offers a single syscall that defines this behavior, and looking up the source code. Yes, this is absolutely the case. Getting Plan 9 to implement a POSIX unlink
and rmdir
would have taken a lot more effort than simply combining both for the POSIX implementations.
The windows implementation does a Stat
on the file if both failed, which it then uses to figure out if it is a directory, so return RemoveDirectory
error; or a file, so return the DeleteFile
error.
We could potentially do something similar, but this would be another roundtrip request. Although, this would save us from having to parse out the error string to figure out which ones we shouldn’t return. (i.e. something like returning the SSH_FXP_REMOVE
error unless the SSH_FXP_RMDIR
error is not the string “not a directory”, which is obviously fragile, since it might be something like ”ist kein Verzeichnis” or other such localized string.)
We should be able to test for the possibility of a SSH_FX_NOT_A_DIRECTORY
and if the error is not that, then we Stat
the file to determine if directory or file, (and as a fail-safe, return the SSH_FXP_REMOVE
response if that Stat
fails).
I’ll work this into my client updates, when I get into it again.
does a
Stat
on the file if both failed, which it then uses to figure out if it is a directory, so returnRemoveDirectory
error; or a file, so return theDeleteFile
error.
This seems a good and easy solution!
Hi, Great job! I really appreciate your effort for maintaining this library! I mainly use your library inside a proprietary Big Data project, that is running in production since 2 years ago, using Linux VMs in the cloud, and moves many TBs of data per day.
func (c *Client) Remove(path string) error
can return an error for many reasons, of course. In most cases, the library just works fine.If
Remove
is called for removing just a file (and not a directory), in some cases I experienced thatRemove
can return an error containing this message: "Not a directory". In my opinion, it is a bug.Looking at the
Remove
implementation, I see that, in caseerr.Code
is equal tosshFxFailure
or if there is a permission error, the returned error is the one got fromRemoveDirectory
. I think ifRemove
is called for removing a file, it should never return an error with a message like "Not a directory".I would suggest this fixed implementation:
Thanks for your attention.