pkg / sftp

SFTP support for the go.crypto/ssh package
BSD 2-Clause "Simplified" License
1.5k stars 380 forks source link

Is file modification time preservation working properly? #481

Closed vansante closed 2 years ago

vansante commented 2 years ago

As a user of this very nice library I am getting some bug reports from some of my users that file modification timestamps are not getting preserved when copying/uploading a file over SFTP.

The modification time is supposed to be set with a setstat call.

I did some digging around and came to the following conclusions:

I did some deeper digging with logging and noticed that when I upload the file with Filezilla the setstat call comes in after the file has been opened, data was written and the file handle was closed again. However, for the non functioning CLI utils, the order is:

Which of course has the effect of the modification timestamps in the setstat call being immediately overwritten.

At this point I tested the posix behaviour of this with the following simple test:

func TestModificationTime(*testing.T) {
    f, _ := os.Create("testfile.txt")
    os.Chtimes(f.Name(), time.Now(), time.Unix(824522342, 0))
    f.WriteAt([]byte("sefesfse"), 10)
    f.Close()
}

Which resulted in the modification time being time.Now() instead of time.Unix(824522342, 0) as I suspected it would.

This led me to believe this is probably not an issue in my own code as I'm just executing in the order of the commands I am getting. To verify this I started looking at the behaviour of drakkan/sftpgo by firing up the latest docker container: docker run --name some-sftpgo -p 8080:8080 -p 2022:2022 -d "drakkan/sftpgo:latest"

I created a basic test user on local storage and uploaded a file (note the -p flags which are supposed to preserve timestamps):

pvansanten@host  sftp -p -P 2022 test@127.0.0.1:
test@127.0.0.1's password: 
Connected to 127.0.0.1.
Changing to: /.
sftp> PUT -p /home/pvansanten/Pictures/dockergopher.png
Uploading /home/pvansanten/Pictures/dockergopher.png to /dockergopher.png
/home/pvansanten/Pictures/dockergopher.png                                                                                                               100%  396KB  41.2MB/s   00:00    
sftp> ls -al
-rw-r--r--    1 1000     1000       405692 Nov 24 09:41 dockergopher.png

and again, the timestamps are not being preserved, which led me to here, making this issue.

This might very well just be due to the the sftp and sshfs utils not sending the setstat commands in the proper order, but then the preservation of modification timestamps would be broken much wider, which I doubt a bit. So possibly there is some nuance broken in this library. Hope you can help!

puellanivis commented 2 years ago

Unfortunately, I fear that if the test code you presented (reproduced below for reference) does the wrong thing, I’m not sure how we would be able to fix any of the issues you’re seeing. This would seem to be an issue with Go, rather than specifically this package.

func TestModificationTime(*testing.T) {
    f, _ := os.Create("testfile.txt")
    os.Chtimes(f.Name(), time.Now(), time.Unix(824522342, 0))
    f.WriteAt([]byte("sefesfse"), 10)
    f.Close()
}

I’ll try and take a look when I can. This sounds weird, and as noted, I think it’s something we cannot address, so we should figure that out sooner rather than later, so it can get passed on to the relevant people quickly.

drakkan commented 2 years ago

Hi,

regarding SFTPGo, this code is wrong. If I remove the returns it works as expected since the sftp CLI sends all the attributes in a single request. I'll fix it later today. Thank your for noticing

drakkan commented 2 years ago

If you try:

docker run --name some-sftpgo -p 8080:8080 -p 2022:2022 -d "drakkan/sftpgo:edge"

or any other edge variant, it should now work as expected

vansante commented 2 years ago

@drakkan I see how this would fix the early return issue where the modification time is not even being set, but I fail to see how that addresses the other issue mentioned, which is that the setstat comes in before the file is actually being written to, and the write thus resets the mod time again.

drakkan commented 2 years ago

@vansante I just fixed an SFTPGo bug. It seems to work for me now:

sftp> PUT -p ros.txt 
Uploading ros.txt to /ros.txt
ros.txt                                                    100% 4034     1.0MB/s   00:00    
sftp> ls -la
drwxr-xr-x    1 0        0               0 Nov 24 13:14 .
-rw-r--r--    1 1000     1000         4034 Jul 18 10:56 ros.txt

I didn't analyzed yet the other issues you reported

vansante commented 2 years ago

I guess the behaviour is dependent on the storage backend used in this case? Because then your results for the mentioned func TestModificationTime(*testing.T) code would differ as well right? Unless I'm missing something completely here :grin:

drakkan commented 2 years ago

This is strange:

nicola@p1 /tmp $ go run main.go 
nicola@p1 /tmp $ ls -la testfile.txt 
-rw-r--r-- 1 nicola nicola 18 24 nov 14.36 testfile.txt

but via SFTP it works, I have no time to investigate further now, sorry

drakkan commented 2 years ago

Hi,

I tryed to analyze this. In SFTPGo it now works with the default configuration but I honestly don't understand why. I haven't done a thorough analysis as there are edge cases where it doesn't work, see below.

In SFTPGo it doesn't work in the following cases:

To support this feature with all configurations and for all the vfs where Chtimes is supported if a Chtimes succeeded for an open handle I'll save the requested values as transfer properties and I reset them when the upload finish. This is not ideal but it should generally work. You could do the same in your SFTP server.

Out of curiosity what features do you need that aren't supported in SFTPGo?

vansante commented 2 years ago

To support this feature with all configurations and for all the vfs where Chtimes is supported if a Chtimes succeeded for an open handle I'll save the requested values as transfer properties and I reset them when the upload finish.

Yes, I figured that would be the inevitable fix, but I'm still wondering how it works for SFTPGo without thjis workaround in most cases. I am also wondering how this would work for the default linux SSH implementation of SFTP as that probably works just fine. This workaround is a bit annoying to implement, as I do not have a central list of open file handlers currently.

Out of curiosity what features do you need that aren't supported in SFTPGo?

I am working on a large filesharing platform, of which one of the supported protocols is SFTP, so most of the SFTPGo features would be mostly in the way as we have our own user management and everything that comes with that. We really need a barebones implementation and pkg/sftp is just that.

drakkan commented 2 years ago

Also this example works as expected for me

vansante commented 2 years ago

The example you are linking to does not use the request-server though, does it? There might be a considerable difference in the codepaths followed.

I did some digging in the source code, and I might be very wrong, but in the request-server.go in the packetWorker() function, I see the sshFxpFsetstatPacket being handled, but no sign of the sshFxpSetstatPacket being handled there.

The difference being between the 2 packets, according to the RFC:

File attributes may be modified using the SSH_FXP_SETSTAT and SSH_FXP_FSETSTAT requests. These requests are used for operations such as changing the ownership, permissions or access times, as well as for truncating a file.

The SSH_FXP_FSETSTAT request modifies the attributes of a file which is already open.

This seems to be somewhat what I'm seeing in behaviour. Is this packet being handled elsewhere and am I missing it or could this be the reason?

drakkan commented 2 years ago

sshFxpSetstatPacket should be handled here because it implements the hasPath interface

vansante commented 2 years ago

You are right, my bad!

drakkan commented 2 years ago

The example you are linking to does not use the request-server though, does it? There might be a considerable difference in the codepaths followed.

yes, the example use the server implementation and works as expected. The request-server based examples don't use the filesystem as storage. SFTPGo, that uses the request-server, also worked even before I added specific code to handle all the possibile use cases (it can also preserve timestamps for cloud based backends via a plugin now).

Can you provide a minimal request-server based implementation that use the local filesystem? We could try to understand what happen comparing that one with the server implementation. I wanted to do this minimal implementation myself but I have little time and so many things to do and this seems to work as expected in some way

vansante commented 2 years ago

Good suggestion! I made a very small example implementation here, I am going to do some testing with that to figure out if the modification time works with that for all clients.

vansante commented 2 years ago

Allright, I think I traced my issue back to the fact that I alter the modification time each time a write for a file comes in. The setstat calls seem to be coming in while the file is still being written to. If I change my timestamping to only change the modified time when a file is opened, it seems to work.

Closing this issue now, thanks for all the help! :smile:

puellanivis commented 2 years ago

Glad to hear you were able to figure out what was wrong!