rclone / rclone

"rsync for cloud storage" - Google Drive, S3, Dropbox, Backblaze B2, One Drive, Swift, Hubic, Wasabi, Google Cloud Storage, Azure Blob, Azure Files, Yandex Files
https://rclone.org
MIT License

rclone serve restic incorrectly handles paths for repository directory named "data" #6086

Open ttyusupov opened 2 years ago

ttyusupov commented 2 years ago

What is the problem you are having with rclone?

I am using rclone with Google Drive as a backend for restic (https://github.com/restic/restic). When I init a repository and then check it, I get "Fatal: wrong password or no key found".

Using verbose rclone logging, I was able to identify that the issue is on the rclone side.

What is your rclone version (output from rclone version)

rclone v1.58.0
- os/version: centos 7.8.2003 (64 bit)
- os/kernel: 3.10.0-1127.el7.x86_64 (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.17.8
- go/linking: static
- go/tags: none

Which OS you are using and how many bits (e.g. Windows 7, 64 bit)

CentOS Linux release 7.8.2003 (Core)

Which cloud storage system are you using? (e.g. Google Drive)

Google Drive

The command you were trying to run (e.g. rclone copy /tmp remote:tmp)

rclone serve restic -vvv gdrive:tmp --addr localhost:8882
restic --repo rest:http://localhost:8882/data --password-file ../restic.txt init
restic --repo rest:http://localhost:8882/data --password-file ../restic.txt check

A log from the command with the -vv flag (e.g. output from rclone -vv copy /tmp remote:tmp)

restic:

using temporary cache in /tmp/restic-check-cache-3698028368
Fatal: wrong password or no key found

rclone:

2022/04/05 20:35:42 DEBUG : rclone: Version "v1.58.0" starting with parameters ["rclone" "serve" "restic" "-vvv" "gdrive:tmp" "--addr" "localhost:8882"]
2022/04/05 20:35:42 DEBUG : Creating backend with remote "gdrive:tmp"
2022/04/05 20:35:42 DEBUG : Using config file from "/home/timur/.config/rclone/rclone.conf"
2022/04/05 20:35:42 DEBUG : Google drive root 'tmp': 'root_folder_id = 0AFT6_RbsgWCtUk9PVA' - save this in the config to speed up startup
2022/04/05 20:35:43 NOTICE: Google drive root 'tmp': Serving restic REST API on http://localhost:8882/
2022/04/05 20:35:43 DEBUG : Google drive root 'tmp': HEAD /data/config
2022/04/05 20:35:44 DEBUG : Google drive root 'tmp': GET /data/keys/
2022/04/05 20:35:44 DEBUG : data/ke/keys: list request

As you can see, rclone transforms the "GET /data/keys/" HTTP request into a listObject call with remote = "data/ke/keys", while it should be remote = "data/keys".


ttyusupov commented 2 years ago

I think the root cause is inside the makeRemote function: https://github.com/rclone/rclone/blob/c968c3e41cf71896f50f6b4a5c3cc5ffd5e7fa35/cmd/serve/restic/restic.go#L206

It treats the data substring of the path as the restic data directory name, but here it is actually the restic repository directory, which happens to have the same name, data.

ncw commented 2 years ago

I think this is because rclone will work both with and without a repository name, and unfortunately /data is one of the paths used in the API.

Does it work ok if you use a different name, eg Data?

ttyusupov commented 2 years ago

Yes, it works with other names (tried Data, dat, data2).

ncw commented 2 years ago

I can't think of an easy way to solve this.

ttyusupov commented 2 years ago

Looks like the only purpose of the makeRemote function is to map data/2159dd48 to data/21/2159dd48, and it uses this regexp:

var matchData = regexp.MustCompile("(?:^|/)data/([^/]{2,})$")

We need to handle the following cases:

1) No repository name. We need to map /data/2159dd48 to data/21/2159dd48.
2) Repository name is not data. We need to map <repo-name>/data/2159dd48 to <repo-name>/data/21/2159dd48.
3) Repository name is data. We need to map data/data/2159dd48 to data/data/21/2159dd48, but don't map data/*, for example /data/keys should be kept as is.
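To make the three cases concrete, here is a minimal Go sketch of the mapping. The regexp is the one quoted above; the body of makeRemote is reconstructed from the described behaviour for illustration and is an assumption, not a copy of the rclone source.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// matchData is the regexp quoted above.
var matchData = regexp.MustCompile("(?:^|/)data/([^/]{2,})$")

// makeRemote is an illustrative reconstruction (assumption, not rclone code):
// it inserts a two-character directory after a trailing "data/<name>".
func makeRemote(path string) string {
	path = strings.Trim(path, "/")
	parts := matchData.FindStringSubmatch(path)
	if parts == nil {
		return path // no trailing data/<name>: keep the path as is
	}
	fileName := parts[1]
	prefix := path[:len(path)-len(fileName)]
	return prefix + fileName[:2] + "/" + fileName
}

func main() {
	for _, p := range []string{
		"/data/2159dd48",      // case 1: no repository name
		"/repo/data/2159dd48", // case 2: repository name is not data
		"/data/data/2159dd48", // case 3: repository name is data
		"/data/keys/",         // case 3: should be kept, but is rewritten
	} {
		fmt.Printf("%s -> %s\n", p, makeRemote(p))
	}
}
```

The first three paths are mapped as wanted, but /data/keys/ also matches the regexp and becomes data/ke/keys, which is the remote seen in the rclone log.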

Not sure how we can distinguish /data/<dirname> (where data is repository name) from /data/abcdabcd (where data is data directory name and we have no repository name) without additional info.

Looks like the options are:

a) Add a flag like you suggested.

b) Update the docs to mention that the repository name data is not supported by rclone serve restic (index, keys, etc. should be OK). Also return a clear error on attempts to create or access a repository named data (this can be detected by requests to /data/config).

c) Use additional info: on a request to /data/<dirname>, check whether the /data/config file is present (presence could be cached). If it is, /data/ is a repository root rather than a data directory. But it will be tricky to invalidate the cache properly (the repository could be removed on the remote side by other accessors, and then the endpoint could be used without a repository name).
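A rough Go sketch of the cached check from option c). Everything here is hypothetical: repoProbe, dataIsRepo, invalidate and the exists callback are names made up for illustration, and exists stands in for whatever backend lookup rclone would really use.

```go
package main

import (
	"fmt"
	"sync"
)

// repoProbe caches whether "data/config" exists on the remote.
// exists is a hypothetical callback standing in for a backend lookup.
type repoProbe struct {
	mu     sync.Mutex
	known  bool
	isRepo bool
	exists func(remote string) bool
}

// dataIsRepo reports whether the top-level "data" directory is a
// repository root, asking the backend only on the first call.
func (p *repoProbe) dataIsRepo() bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	if !p.known {
		p.isRepo = p.exists("data/config")
		p.known = true
	}
	return p.isRepo
}

// invalidate drops the cached answer. Deciding when to call it is the
// hard part: another client may remove the repository at any time.
func (p *repoProbe) invalidate() {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.known = false
}

func main() {
	lookups := 0
	p := &repoProbe{exists: func(remote string) bool {
		lookups++
		return remote == "data/config"
	}}
	first := p.dataIsRepo()
	second := p.dataIsRepo() // answered from the cache
	fmt.Println(first, second, lookups)
}
```

The sketch only shows the caching; it does not solve the invalidation problem described in c).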

Seems a) and b) are the easiest options.
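For a), a minimal Go sketch of how an explicit setting could remove the ambiguity. The hasRepoName parameter is a hypothetical stand-in for a new flag (no such rclone flag is implied), and the sketch assumes the repository name is a single path segment, which is stricter than the current regexp.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	// Paths without a repository name: "data/<file>".
	matchNoRepo = regexp.MustCompile(`^data/([^/]{2,})$`)
	// Paths with a one-segment repository name: "<repo>/data/<file>".
	matchWithRepo = regexp.MustCompile(`^[^/]+/data/([^/]{2,})$`)
)

// makeRemoteWithFlag is hypothetical; hasRepoName stands in for a flag
// telling the server whether request paths start with a repository name.
func makeRemoteWithFlag(path string, hasRepoName bool) string {
	path = strings.Trim(path, "/")
	re := matchNoRepo
	if hasRepoName {
		re = matchWithRepo
	}
	parts := re.FindStringSubmatch(path)
	if parts == nil {
		return path // not a data file: keep the path as is
	}
	fileName := parts[1]
	return path[:len(path)-len(fileName)] + fileName[:2] + "/" + fileName
}

func main() {
	fmt.Println(makeRemoteWithFlag("/data/keys/", true))         // repository named data
	fmt.Println(makeRemoteWithFlag("/data/data/2159dd48", true)) // data file inside it
	fmt.Println(makeRemoteWithFlag("/data/2159dd48", false))     // no repository name
}
```

With the flag set, /data/keys/ is left alone because data is known to be the repository name; without it, the old behaviour for paths with no repository name is kept.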