Open Robert-Pitt opened 1 year ago
I'm not sure whether anyone so far has used that ssh combination ^^ . To get a debug log, you can use the release binaries. Just set the environment variable DEBUG_LOG=debug.log.
The OpenFile call issued by restic doesn't use any exotic parameters. I'd expect them to also work on Windows.
Could you try whether creating a restic repository locally and then copying it to the SFTP server allows other commands to work? Which files and folders did restic init create before the sftp error?
I tried creating a repo and uploading it and it failed somewhere else, lock files if I recall correctly. restic init made quite a lot of folders. I no longer need this fixed personally as I'm using rclone as a REST server now and it works fine. Would you like me to make a list of the files init makes and attach them? I can do that if you want.
I tried creating a repo and uploading it and it failed somewhere else, lock files if I recall correctly. restic init made quite a lot of folders
That means that the backend is able to create directories, but fails to write any file (the key is the first file that's written during repository initialization).
I wonder whether the following patch might solve the problem. According to https://stackoverflow.com/questions/39395340/error-bad-message-when-accessing-a-file-on-windows-openssh-sftp-server opening a file in write-only mode (the current behavior of the sftp backend) is not supported by the Windows OpenSSH server.
@Robert-Pitt Could you test the following patch? Feel free to ask if you need help with building restic.
diff --git a/internal/backend/sftp/sftp.go b/internal/backend/sftp/sftp.go
index 12c355003..b83df0c9c 100644
--- a/internal/backend/sftp/sftp.go
+++ b/internal/backend/sftp/sftp.go
@@ -306,7 +306,7 @@ func (r *SFTP) Save(_ context.Context, h restic.Handle, rd restic.RewindReader)
 	dirname := r.Dirname(h)
 	// create new file
-	f, err := r.c.OpenFile(tmpFilename, os.O_CREATE|os.O_EXCL|os.O_WRONLY)
+	f, err := r.c.OpenFile(tmpFilename, os.O_CREATE|os.O_EXCL|os.O_RDWR)
 	if r.IsNotExist(err) {
 		// error is caused by a missing directory, try to create it
@@ -315,7 +315,7 @@ func (r *SFTP) Save(_ context.Context, h restic.Handle, rd restic.RewindReader)
 			debug.Log("error creating dir %v: %v", r.Dirname(h), mkdirErr)
 		} else {
 			// try again
-			f, err = r.c.OpenFile(tmpFilename, os.O_CREATE|os.O_EXCL|os.O_WRONLY)
+			f, err = r.c.OpenFile(tmpFilename, os.O_CREATE|os.O_EXCL|os.O_RDWR)
 		}
 	}
Same issue here. @MichaelEischer I tried your suggested patch to no avail.
I found that it's not the OpenFile() call that yields the SSH_FX_BAD_MESSAGE, but instead the subsequent Chmod(). (Error handling is a bit unfortunate here.)
So the following patch 'fixes' the issue:
diff --git a/internal/backend/sftp/sftp.go b/internal/backend/sftp/sftp.go
index 0a94e4aa3..96283671c 100644
--- a/internal/backend/sftp/sftp.go
+++ b/internal/backend/sftp/sftp.go
@@ -335,12 +335,6 @@ func (r *SFTP) Save(_ context.Context, h backend.Handle, rd backend.RewindReader
 			f, err = r.c.OpenFile(tmpFilename, os.O_CREATE|os.O_EXCL|os.O_WRONLY)
 		}
 	}
-
-	// pkg/sftp doesn't allow creating with a mode.
-	// Chmod while the file is still empty.
-	if err == nil {
-		err = f.Chmod(r.Modes.File)
-	}
 	if err != nil {
 		return errors.Wrap(err, "OpenFile")
 	}
I have not been able to reproduce the issue when manually issuing SFTP commands against the Windows OpenSSH server. I.e. a chmod 0600 myfile succeeds w/o error (but does not appear to actually change the mode). So the root cause remains unknown.
Anyway, IMO changing the mode on Windows is rather pointless as files by default inherit their folder's ACLs.
The pragmatic way to solve this might be adding a (per-backend?) parameter that disables setting the mode. (Or allowing the default mode to be overridden, and when set to 0, the Chmods would be skipped.)
@MichaelEischer you seemed to be interested in this issue. Maybe you just overlooked my previous comment including some findings and suggestions.
Thanks for investigating the issue. I'd really prefer if we can find a way that does not introduce a special case here. But I'm afraid I cannot help much with debugging the issue here.
I have not been able to reproduce the issue when manually issuing SFTP commands against the Windows OpenSSH server. I.e. a chmod 0600 myfile succeeds w/o error (but does not appear to actually change the mode). So the root cause remains unknown.
That is a pretty strong indication that the issue can be solved without introducing some random option that users have to set manually.
I had the same error message in my environment (Windows Server) with the sftp settings shown below on the Windows host. The culprit appears to be ChrootDirectory: whenever it is set, restic always reports this error (sftp: "Bad message" (SSH_FX_BAD_MESSAGE)). Once I commented out ChrootDirectory, restic no longer showed the error. As I understand it, ChrootDirectory restricts the user's root folder (/) to D:\restic\ so that the user cannot browse other folders. I also tried enabling ChrootDirectory and creating a nas\ folder inside it, but restic still showed the error, even though the user had rwx permission on the nas\ folder.
Match User WindowssFtpUser
AllowTcpForwarding no
X11Forwarding no
# ChrootDirectory D:\restic\
ForceCommand internal-sftp -d nas\
# it is my docker setting for sftp
- RESTIC_REPOSITORY=sftp:image@127.0.0.1:/D:/restic/nas/ # Backup folder in Windows backup server
@ppkliu You're saying that there are no sftp errors without ChrootDirectory? And that the errors show up when using ChrootDirectory? Can you post the full error message you get?
@MichaelEischer I have seen the same as @ppkliu stated above. I am using a Windows 11 host with its OpenSSH server as an SFTP endpoint for restic. With ChrootDirectory set I kept getting an error when trying to create the repo.
Fatal: create key in repository at sftp:user@server.local:/test-repo failed: OpenFile: sftp: "Bad message" (SSH_FX_BAD_MESSAGE)
I removed the ChrootDirectory option and was able to successfully create the repo in the same location I was trying to create it at before (just using a full path now).
I did some digging:
The error looks very much like a bug in the homegrown chroot implementation used in the Windows openssh sftp port. In https://github.com/PowerShell/openssh-portable/blob/661803c9ec4d7dee6574eb6ff0c85b2b7006edb1/contrib/win32/win32compat/w32fd.c#L1013 it first retrieves the filepath for the handle (the real path on the windows filesystem) and passes it to w32_chmod which applies the chroot a second time!
That ultimately results in a call to _wchmod with a broken file path. This triggers an EINVAL error that gets translated to the "Bad message" error.
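To make the failure mode concrete, here is a simplified sketch of how applying a chroot prefix twice produces a broken path. The function applyChroot and the example file path are hypothetical and only illustrate the idea; the real logic is in the C code linked above and differs in detail.

```go
package main

import (
	"fmt"
	"strings"
)

// applyChroot is a hypothetical stand-in for the path translation: it maps a
// path as seen by the sftp client to a path on the real filesystem by
// prepending the chroot directory.
func applyChroot(chroot, clientPath string) string {
	rel := strings.TrimLeft(strings.ReplaceAll(clientPath, "/", `\`), `\`)
	return strings.TrimRight(chroot, `\`) + `\` + rel
}

func main() {
	chroot := `D:\restic`

	// Intended behavior: translate the client-visible path exactly once.
	realPath := applyChroot(chroot, "/nas/keys/abc")
	fmt.Println(realPath) // D:\restic\nas\keys\abc

	// Suspected bug: the already-translated real path is translated again,
	// which puts a second drive prefix in the middle of the path.
	broken := applyChroot(chroot, realPath)
	fmt.Println(broken) // D:\restic\D:\restic\nas\keys\abc
}
```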
I have not been able to reproduce the issue when manually issuing SFTP commands against the Windows OpenSSH server. I.e. a chmod 0600 myfile succeeds w/o error (but does not appear to actually change the mode). So the root cause remains unknown.
That uses chmod as opposed to fchmod used by restic. Using the sftp client, it should be possible to trigger the same bug using put -p filename.
Good catch @MichaelEischer. Thank you for taking the time to track this down. I took the liberty of reporting the issue at Win32-OpenSSH and quoting you there (I hope that's ok for you): https://github.com/PowerShell/Win32-OpenSSH/issues/2263
Output of restic version
restic 0.15.2 compiled with go1.20.3 on linux/amd64
How did you run restic exactly?
What backend/server/service did you use to store the repository?
OpenSSH_for_Windows_9.2, LibreSSL 3.7.2
Expected behavior
Successfully create a new repository in the (chroot) / directory of the SSH host defined by backup-to-laptop-direct. I've also tried using a sub-directory and the same behaviour happens.
Actual behavior
Many filesystem objects are created but not a complete repository.
Steps to reproduce the behavior
I am using a Windows 10 host with the latest OpenSSH sshd installed via winget. restic is running on the latest Ubuntu LTS release and has been updated with the self-update command (in an attempt to fix this issue).
Do you have any idea what may have caused this?
My best guess would be the filename it wants to make is too long for the NTFS partition but this is a wild guess and you probably thought of that already when designing it.
Do you have an idea how to solve the issue?
Not a clue. I will try compiling restic from source and enabling debug logs if you don't know what's wrong.
Did restic help you today? Did it make you happy in any way?
Yes it backs up my Windows system lovely 👍