iterative / dvc-webdav

Webdav plugin for dvc
Apache License 2.0
1 stars 2 forks source link

push: cannot upload to webdav server #13

Open robince opened 2 years ago

robince commented 2 years ago

Bug Report

push: cannot upload to webdav server

Description

Cannot push to a webdavs remote: "failed to transfer"

Reproduce

Example:

  1. git init
  2. dvc init
  3. echo abcdefg > data
  4. dvc add data
  5. dvc remote add -d webdav webdavs://url/data
  6. dvc remote modify webdav user dvc
  7. dvc remote modify webdav password XXX
  8. dvc push -v

Expected

I expected the file to be uploaded to the remote folder. What happened was the top level folder "02" is created on the server (so no issue with disk write permissions on the server). But no data is uploaded.

Webdav server is working fine with other clients.

Environment information

Output of dvc doctor:

$ dvc doctor
DVC version: 2.11.0 (brew)
---------------------------------
Platform: Python 3.9.13 on macOS-12.4-arm64-arm-64bit
Supports:
    azure (adlfs = 2022.4.0, knack = 0.9.0, azure-identity = 1.10.0),
    gdrive (pydrive2 = 1.10.1),
    gs (gcsfs = 2022.5.0),
    webhdfs (fsspec = 2022.5.0),
    http (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
    https (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
    s3 (s3fs = 2022.5.0, boto3 = 1.21.21),
    ssh (sshfs = 2022.6.0),
    oss (ossfs = 2021.8.0),
    webdav (webdav4 = 0.9.7),
    webdavs (webdav4 = 0.9.7)
Cache types: reflink, hardlink, symlink
Cache directory: apfs on /dev/disk3s1s1
Caches: local
Remotes: webdavs
Workspace directory: apfs on /dev/disk3s1s1
Repo: dvc, git

Server: Centos7, Apache/2.4.6 (CentOS)

Additional Information (if any):

Verbose output on first run (destination directory empty):

2022-06-12 07:40:25,328 DEBUG: Preparing to transfer data from '/Users/robince/tmp/testdvc/.dvc/cache' to ''
2022-06-12 07:40:25,328 DEBUG: Preparing to collect status from ''
2022-06-12 07:40:25,328 DEBUG: Collecting status from ''
2022-06-12 07:40:25,328 DEBUG: Querying 1 hashes via object_exists
2022-06-12 07:40:25,598 DEBUG: Preparing to collect status from '/Users/robince/tmp/testdvc/.dvc/cache'
2022-06-12 07:40:25,598 DEBUG: Collecting status from '/Users/robince/tmp/testdvc/.dvc/cache'
2022-06-12 07:40:25,800 DEBUG: failed to transfer 'md5: 020861c8c3fe177da19a7e9539a5dbac'
2022-06-12 07:40:25,802 ERROR: failed to push data to the cloud - 1 files failed to upload
------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/homebrew/Cellar/dvc/2.11.0/libexec/lib/python3.9/site-packages/dvc/data_cloud.py", line 88, in transfer
    return transfer(src_odb, dest_odb, objs, **kwargs)
  File "/opt/homebrew/Cellar/dvc/2.11.0/libexec/lib/python3.9/site-packages/dvc_data/transfer.py", line 185, in transfer
    _do_transfer(
  File "/opt/homebrew/Cellar/dvc/2.11.0/libexec/lib/python3.9/site-packages/dvc_data/transfer.py", line 120, in _do_transfer
    raise TransferError(total_fails)
dvc_data.transfer.TransferError: 1 transfer failed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/homebrew/Cellar/dvc/2.11.0/libexec/lib/python3.9/site-packages/dvc/repo/push.py", line 68, in push
    pushed += self.cloud.push(
  File "/opt/homebrew/Cellar/dvc/2.11.0/libexec/lib/python3.9/site-packages/dvc/data_cloud.py", line 109, in push
    return self.transfer(
  File "/opt/homebrew/Cellar/dvc/2.11.0/libexec/lib/python3.9/site-packages/dvc/data_cloud.py", line 90, in transfer
    raise FileTransferError(exc.fails) from exc
dvc.exceptions.FileTransferError: 1 files failed to transfer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/homebrew/Cellar/dvc/2.11.0/libexec/lib/python3.9/site-packages/dvc/commands/data_sync.py", line 58, in run
    processed_files_count = self.repo.push(
  File "/opt/homebrew/Cellar/dvc/2.11.0/libexec/lib/python3.9/site-packages/dvc/repo/__init__.py", line 49, in wrapper
    return f(repo, *args, **kwargs)
  File "/opt/homebrew/Cellar/dvc/2.11.0/libexec/lib/python3.9/site-packages/dvc/repo/push.py", line 72, in push
    raise UploadError(exc.amount)
dvc.exceptions.UploadError: 1 files failed to upload
------------------------------------------------------------
2022-06-12 07:40:25,805 DEBUG: Analytics is enabled.
2022-06-12 07:40:25,840 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/var/folders/2p/03gtrn8s52v1ggbdgxdl5t2w0000gs/T/tmp7iso6jbf']'
2022-06-12 07:40:25,842 DEBUG: Spawned '['daemon', '-q', 'analytics', '/var/folders/2p/03gtrn8s52v1ggbdgxdl5t2w0000gs/T/tmp7iso6jbf']'

Server access log corresponding to the above. There are no server errors.

109.249.184.189 - dvc [12/Jun/2022:07:40:25 +0100] "PROPFIND /dav/dvc/02/0861c8c3fe177da19a7e9539a5dbac HTTP/1.1" 404 239 "-" "python-httpx/0.23.0"
109.249.184.189 - dvc [12/Jun/2022:07:40:25 +0100] "PROPFIND /dav/dvc/02/0861c8c3fe177da19a7e9539a5dbac HTTP/1.1" 404 239
109.249.184.189 - dvc [12/Jun/2022:07:40:25 +0100] "PROPFIND /dav/dvc/02/0861c8c3fe177da19a7e9539a5dbac HTTP/1.1" 404 239 "-" "python-httpx/0.23.0"
109.249.184.189 - dvc [12/Jun/2022:07:40:25 +0100] "PROPFIND /dav/dvc/02/0861c8c3fe177da19a7e9539a5dbac HTTP/1.1" 404 239
109.249.184.189 - dvc [12/Jun/2022:07:40:25 +0100] "MKCOL /dav/dvc/02/ HTTP/1.1" 201 186 "-" "python-httpx/0.23.0"
109.249.184.189 - dvc [12/Jun/2022:07:40:25 +0100] "MKCOL /dav/dvc/02/ HTTP/1.1" 201 186
109.249.184.189 - dvc [12/Jun/2022:07:40:25 +0100] "MKCOL /dav/dvc/02/ HTTP/1.1" 405 234 "-" "python-httpx/0.23.0"
109.249.184.189 - dvc [12/Jun/2022:07:40:25 +0100] "MKCOL /dav/dvc/02/ HTTP/1.1" 405 234
109.249.184.189 - dvc [12/Jun/2022:07:40:25 +0100] "PROPFIND /dav/dvc/02 HTTP/1.1" 301 243 "-" "python-httpx/0.23.0"
109.249.184.189 - dvc [12/Jun/2022:07:40:25 +0100] "PROPFIND /dav/dvc/02 HTTP/1.1" 301 243

Folder 02 is created after runing the first push. Then on a second push:

2022-06-12 07:44:37,552 DEBUG: Preparing to transfer data from '/Users/robince/tmp/testdvc/.dvc/cache' to ''
2022-06-12 07:44:37,553 DEBUG: Preparing to collect status from ''
2022-06-12 07:44:37,553 DEBUG: Collecting status from ''
2022-06-12 07:44:37,553 DEBUG: Querying 1 hashes via object_exists
2022-06-12 07:44:37,800 DEBUG: Preparing to collect status from '/Users/robince/tmp/testdvc/.dvc/cache'
2022-06-12 07:44:37,800 DEBUG: Collecting status from '/Users/robince/tmp/testdvc/.dvc/cache'
2022-06-12 07:44:37,957 DEBUG: failed to transfer 'md5: 020861c8c3fe177da19a7e9539a5dbac'
2022-06-12 07:44:37,958 ERROR: failed to push data to the cloud - 1 files failed to upload
------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/homebrew/Cellar/dvc/2.11.0/libexec/lib/python3.9/site-packages/dvc/data_cloud.py", line 88, in transfer
    return transfer(src_odb, dest_odb, objs, **kwargs)
  File "/opt/homebrew/Cellar/dvc/2.11.0/libexec/lib/python3.9/site-packages/dvc_data/transfer.py", line 185, in transfer
    _do_transfer(
  File "/opt/homebrew/Cellar/dvc/2.11.0/libexec/lib/python3.9/site-packages/dvc_data/transfer.py", line 120, in _do_transfer
    raise TransferError(total_fails)
dvc_data.transfer.TransferError: 1 transfer failed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/homebrew/Cellar/dvc/2.11.0/libexec/lib/python3.9/site-packages/dvc/repo/push.py", line 68, in push
    pushed += self.cloud.push(
  File "/opt/homebrew/Cellar/dvc/2.11.0/libexec/lib/python3.9/site-packages/dvc/data_cloud.py", line 109, in push
    return self.transfer(
  File "/opt/homebrew/Cellar/dvc/2.11.0/libexec/lib/python3.9/site-packages/dvc/data_cloud.py", line 90, in transfer
    raise FileTransferError(exc.fails) from exc
dvc.exceptions.FileTransferError: 1 files failed to transfer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/homebrew/Cellar/dvc/2.11.0/libexec/lib/python3.9/site-packages/dvc/commands/data_sync.py", line 58, in run
    processed_files_count = self.repo.push(
  File "/opt/homebrew/Cellar/dvc/2.11.0/libexec/lib/python3.9/site-packages/dvc/repo/__init__.py", line 49, in wrapper
    return f(repo, *args, **kwargs)
  File "/opt/homebrew/Cellar/dvc/2.11.0/libexec/lib/python3.9/site-packages/dvc/repo/push.py", line 72, in push
    raise UploadError(exc.amount)
dvc.exceptions.UploadError: 1 files failed to upload
------------------------------------------------------------
2022-06-12 07:44:37,961 DEBUG: Analytics is enabled.
2022-06-12 07:44:37,994 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/var/folders/2p/03gtrn8s52v1ggbdgxdl5t2w0000gs/T/tmp815zv0z8']'
2022-06-12 07:44:37,996 DEBUG: Spawned '['daemon', '-q', 'analytics', '/var/folders/2p/03gtrn8s52v1ggbdgxdl5t2w0000gs/T/tmp815zv0z8']'
109.249.184.189 - dvc [12/Jun/2022:07:44:37 +0100] "PROPFIND /dav/dvc/02/0861c8c3fe177da19a7e9539a5dbac HTTP/1.1" 404 239 "-" "python-httpx/0.23.0"
109.249.184.189 - dvc [12/Jun/2022:07:44:37 +0100] "PROPFIND /dav/dvc/02/0861c8c3fe177da19a7e9539a5dbac HTTP/1.1" 404 239
109.249.184.189 - dvc [12/Jun/2022:07:44:37 +0100] "PROPFIND /dav/dvc/02/0861c8c3fe177da19a7e9539a5dbac HTTP/1.1" 404 239 "-" "python-httpx/0.23.0"
109.249.184.189 - dvc [12/Jun/2022:07:44:37 +0100] "PROPFIND /dav/dvc/02/0861c8c3fe177da19a7e9539a5dbac HTTP/1.1" 404 239
109.249.184.189 - dvc [12/Jun/2022:07:44:38 +0100] "MKCOL /dav/dvc/02/ HTTP/1.1" 405 234 "-" "python-httpx/0.23.0"
109.249.184.189 - dvc [12/Jun/2022:07:44:38 +0100] "MKCOL /dav/dvc/02/ HTTP/1.1" 405 234
109.249.184.189 - dvc [12/Jun/2022:07:44:38 +0100] "PROPFIND /dav/dvc/02 HTTP/1.1" 301 243 "-" "python-httpx/0.23.0"
109.249.184.189 - dvc [12/Jun/2022:07:44:38 +0100] "PROPFIND /dav/dvc/02 HTTP/1.1" 301 243

If I push to the remote folder over ssh then I can pull succesfully over webdav. It is just uploading files to webdav that fails.

I am not sure why MKCOL seems to be happening before PROPFIND for a folder that exists on the server, and I'm also not sure why PROPFIND is returning a 301. From this perhaps the url should have a trailing slash? (ie PROPFIND /dav/dvc/02/ - in that post this changed the 301 which I am seeing to a 405. I have tried adding DavDepthInfinity on but doesn't make any difference to the above (still get 301 on PROPFIND). According to the specs collections should have a trailing slash so I think this could be the reason for the 301 (Moved Permanently).

robince commented 2 years ago

From here I discovered the Apache server DirectorySlash off directive, which fixes the issue for me (now uploads as expected).

  <Directory "/data/dav/dvc">
    DAV On
    AuthType basic
    AuthName "dvc"
    AuthUserFile /data/dav/dvc.passwd
    Require valid-user
    DirectorySlash off
  </Directory>

So can close this issue, but perhaps something could be added to the documentation, or if the trailing slash could still be added to the PROPFIND to ensure standards compliant requests this would avoid the need for this directive.

dtrifiro commented 2 years ago

Thanks for reporting this!

cc @skshetry