iterative / dvc

🦉 ML Experiments and Data Management with Git
https://dvc.org
Apache License 2.0

webdav: SSL verification error only available with -v #10076

Open mdekstrand opened 8 months ago

mdekstrand commented 8 months ago

Bug Report

Description

When using the 'webdavs' remote, if the connection fails due to an SSL verification error, the error cause is hidden and DVC fails with an internal error message.

Reproduce

  1. Add a webdavs:// remote that points to a server with a self-signed certificate
  2. Attempt to pull from the remote

Expected

DVC to fail with an error message stating that the SSL verification failed.

This information is currently only available when running with -v.
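
As a rough illustration of the expected behavior, the sketch below walks an exception's `__cause__`/`__context__` chain looking for `ssl.SSLCertVerificationError` and, if one is found, returns a short message instead of a generic "unexpected error". It uses only the standard library. `find_ssl_verification_error` and `friendly_message` are hypothetical helper names, not part of DVC, and the `ConnectionError` merely stands in for whatever wrapper exception an HTTP client might raise.

```python
import ssl
from typing import Optional


def find_ssl_verification_error(
    exc: BaseException,
) -> Optional[ssl.SSLCertVerificationError]:
    """Return the first SSL verification error in the exception chain, if any."""
    seen = set()  # guards against accidental cycles in the chain
    current: Optional[BaseException] = exc
    while current is not None and id(current) not in seen:
        seen.add(id(current))
        if isinstance(current, ssl.SSLCertVerificationError):
            return current
        # Prefer the explicit cause ("raise ... from ..."), then the implicit context.
        if current.__cause__ is not None:
            current = current.__cause__
        else:
            current = current.__context__
    return None


def friendly_message(exc: BaseException) -> str:
    """Hypothetical formatter: surface SSL verification failures without -v."""
    ssl_error = find_ssl_verification_error(exc)
    if ssl_error is not None:
        return f"SSL certificate verification failed: {ssl_error}"
    return f"unexpected error - {exc}"


# Simulate a verification failure wrapped by a higher-level connection error.
try:
    try:
        raise ssl.SSLCertVerificationError(
            1, "certificate verify failed: self-signed certificate"
        )
    except ssl.SSLCertVerificationError as inner:
        raise ConnectionError("connect failed") from inner
except ConnectionError as outer:
    message = friendly_message(outer)
    print(message)
```

Checking the chain rather than only the outermost exception matters here because the SSL error is wrapped by the HTTP client before it reaches DVC.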

Environment information

DVC version: 3.28.0 (conda)
---------------------------
Platform: Python 3.10.13 on macOS-13.6.1-arm64-arm-64bit
Subprojects:
    dvc_data = 2.20.0
    dvc_objects = 1.1.0
    dvc_render = 0.6.0
    dvc_task = 0.3.0
    scmrepo = 1.4.1
Supports:
    http (aiohttp = 3.8.6, aiohttp-retry = 2.8.3),
    https (aiohttp = 3.8.6, aiohttp-retry = 2.8.3),
    ssh (sshfs = 2023.10.0),
    webdav (webdav4 = 0.9.8),
    webdavs (webdav4 = 0.9.8)
Config:
    Global: /Users/mde48/Library/Application Support/dvc
    System: /Library/Application Support/dvc
Cache types: <https://error.dvc.org/no-dvc-cache>
Caches: local
Remotes: webdavs
Workspace directory: apfs on /dev/disk3s1s1
Repo: dvc, git
Repo.site_cache_dir: /Library/Caches/dvc/repo/7db9a4553acc84ea09f78eaae3e0781d

efiop commented 8 months ago

@mdekstrand Could you post an example, please? Unfortunately, I doubt anyone is going to actively go and try reproducing it any time soon.

If you see a nice way to improve that, contributions are welcome.

hqdncw commented 8 months ago

Hi @mdekstrand,

Thanks for sharing the steps to reproduce the issue. However, I couldn't replicate the problem using those steps because it seems that something needs to be pushed to the remote storage first. If nothing has been pushed yet, then there's no error. But if we try to push changes to the remote storage, the issue can be reproduced as well.

Here are the detailed steps I took:

  1. Run the following command to generate a certificate and private key for use with SFTPGo.

    openssl req -x509 -out secrets/localhost.crt -keyout secrets/localhost.key \
    -newkey rsa:4096 -nodes -sha256 \
    -subj '/CN=localhost' -extensions EXT -config <( \
    printf "[dn]\nCN=localhost\n[req]\ndistinguished_name = dn\n[EXT]\nsubjectAltName=DNS:localhost\nkeyUsage=digitalSignature\nextendedKeyUsage=serverAuth")
  2. Set up an SFTPGo server that supports WebDAV. This server will listen on a specific port number (10443) on your local machine. Configure the server to use the SSL/TLS certificate and private key created in step 1 to enable secure connections over HTTPS.

    
    #!/usr/bin/env bash

    CONTAINER_NAME="webdav"
    SECRETS_DIR="/usr/local/src/secrets"

    graceful_shutdown() {
        docker stop $CONTAINER_NAME && docker rm $CONTAINER_NAME
        exit
    }
    trap graceful_shutdown INT TERM

    docker create --name $CONTAINER_NAME \
        -a STDOUT \
        -p 8080:8080 \
        -p 2022:2022 \
        -p 10443:10443 \
        -e SFTPGO_WEBDAVD__BINDINGS__0__PORT=10443 \
        -e SFTPGO_WEBDAVD__BINDINGS__0__ENABLE_HTTPS=true \
        -e SFTPGO_WEBDAVD__BINDINGS__0__CERTIFICATE_FILE=$SECRETS_DIR/localhost.crt \
        -e SFTPGO_WEBDAVD__BINDINGS__0__CERTIFICATE_KEY_FILE=$SECRETS_DIR/localhost.key \
        -t "drakkan/sftpgo:v2.5.5" || exit

    docker cp -q ./secrets/ $CONTAINER_NAME:$SECRETS_DIR || exit
    docker start --attach $CONTAINER_NAME || exit
  3. Make sure the server is using the certificate you provided.

    openssl s_client -showcerts -connect localhost:10443 </dev/null
  4. Initialize DVC.

    dvc init --no-scm
  5. Track some changes.

    touch example.xml && dvc add example.xml
  6. Add the remote.

    dvc remote add test webdavs://localhost:10443
  7. Try to push changes to the remote.

    dvc push --remote test

Actual result

```bash
$ dvc push --verbose --remote test
2023-11-18 18:11:26,480 DEBUG: v3.28.0, CPython 3.11.2 on
2023-11-18 18:11:26,480 DEBUG: command: push --verbose
2023-11-18 18:11:26,701 DEBUG: Preparing to transfer data from '/home/sid/workspace/test2/.dvc/cache/files/md5' to 'https://localhost:10443/files/md5'
2023-11-18 18:11:26,702 DEBUG: Preparing to collect status from 'files/md5'
2023-11-18 18:11:26,702 DEBUG: Collecting status from 'files/md5'
2023-11-18 18:11:26,703 DEBUG: Querying 1 oids via object_exists
2023-11-18 18:11:26,820 DEBUG: Preparing to collect status from '/home/sid/workspace/test2/.dvc/cache/files/md5'
2023-11-18 18:11:26,820 DEBUG: Collecting status from '/home/sid/workspace/test2/.dvc/cache/files/md5'
2023-11-18 18:11:26,869 ERROR: unexpected error - [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate (_ssl.c:992): [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate (_ssl.c:992): [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate (_ssl.c:992)
Traceback (most recent call last):
  File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/httpcore/_exceptions.py", line 10, in map_exceptions
    yield
  File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/httpcore/_backends/sync.py", line 168, in start_tls
    raise exc
  File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/httpcore/_backends/sync.py", line 163, in start_tls
    sock = ssl_context.wrap_socket(
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/ssl.py", line 517, in wrap_socket
    return self.sslsocket_class._create(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/ssl.py", line 1075, in _create
    self.do_handshake()
  File "/usr/lib/python3.11/ssl.py", line 1346, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate (_ssl.c:992)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/httpx/_transports/default.py", line 66, in map_httpcore_exceptions
    yield
  File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/httpx/_transports/default.py", line 228, in handle_request
    resp = self._pool.handle_request(req)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/httpcore/_sync/connection_pool.py", line 268, in handle_request
    raise exc
  File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/httpcore/_sync/connection_pool.py", line 251, in handle_request
    response = connection.handle_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/httpcore/_sync/connection.py", line 99, in handle_request
    raise exc
  File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/httpcore/_sync/connection.py", line 76, in handle_request
    stream = self._connect(request)
             ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/httpcore/_sync/connection.py", line 156, in _connect
    stream = stream.start_tls(**kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/httpcore/_backends/sync.py", line 152, in start_tls
    with map_exceptions(exc_map):
  File "/usr/lib/python3.11/contextlib.py", line 155, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc) from exc
httpcore.ConnectError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate (_ssl.c:992)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/sid/workspace/dvc/dvc/cli/__init__.py", line 209, in main
    ret = cmd.do_run()
          ^^^^^^^^^^^^
  File "/home/sid/workspace/dvc/dvc/cli/command.py", line 26, in do_run
    return self.run()
           ^^^^^^^^^^
  File "/home/sid/workspace/dvc/dvc/commands/data_sync.py", line 64, in run
    processed_files_count = self.repo.push(
                            ^^^^^^^^^^^^^^^
  File "/home/sid/workspace/dvc/dvc/repo/__init__.py", line 61, in wrapper
    return f(repo, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sid/workspace/dvc/dvc/repo/push.py", line 115, in push
    push_transferred, push_failed = ipush(
                                    ^^^^^^
  File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/dvc_data/index/push.py", line 42, in push
    result = transfer(
             ^^^^^^^^^
  File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/dvc_data/hashfile/transfer.py", line 225, in transfer
    failed = _do_transfer(
             ^^^^^^^^^^^^^
  File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/dvc_data/hashfile/transfer.py", line 123, in _do_transfer
    failed_ids.update(_add(src, dest, all_file_ids, **kwargs))
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/dvc_data/hashfile/transfer.py", line 166, in _add
    dest.add(
  File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/dvc_data/hashfile/db/__init__.py", line 110, in add
    transferred = super().add(
                  ^^^^^^^^^^^^
  File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/dvc_objects/db.py", line 162, in add
    self._init(parts)
  File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/dvc_objects/db.py", line 67, in _init
    for path in self.fs.ls(self.path, detail=False)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/dvc_objects/fs/base.py", line 425, in ls
    return self.fs.ls(path, detail=detail, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/contextlib.py", line 81, in inner
    return func(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/webdav4/fsspec.py", line 117, in ls
    data = self.client.ls(
           ^^^^^^^^^^^^^^^
  File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/webdav4/client.py", line 510, in ls
    result = self.propfind(
             ^^^^^^^^^^^^^^
  File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/webdav4/client.py", line 318, in propfind
    http_resp = self.with_retry(call)
                ^^^^^^^^^^^^^^^^^^^^^
  File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/webdav4/func_utils.py", line 44, in wrapped_function
    return func()
           ^^^^^^
  File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/webdav4/func_utils.py", line 68, in wrapped
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/webdav4/client.py", line 362, in _request
    http_resp = self.http.request(method, url, **kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/httpx/_client.py", line 814, in request
    return self.send(request, auth=auth, follow_redirects=follow_redirects)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/httpx/_client.py", line 901, in send
    response = self._send_handling_auth(
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/httpx/_client.py", line 929, in _send_handling_auth
    response = self._send_handling_redirects(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/httpx/_client.py", line 966, in _send_handling_redirects
    response = self._send_single_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/httpx/_client.py", line 1002, in _send_single_request
    response = transport.handle_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/httpx/_transports/default.py", line 227, in handle_request
    with map_httpcore_exceptions():
  File "/usr/lib/python3.11/contextlib.py", line 155, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/httpx/_transports/default.py", line 83, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.ConnectError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate (_ssl.c:992)

2023-11-18 18:11:26,916 DEBUG: link type reflink is not available ([Errno 95] no more link types left to try out)
2023-11-18 18:11:26,916 DEBUG: Removing '/home/sid/workspace/.HMJK2Xt7dTDwuiabtdYWRh.tmp'
2023-11-18 18:11:26,916 DEBUG: Removing '/home/sid/workspace/.HMJK2Xt7dTDwuiabtdYWRh.tmp'
2023-11-18 18:11:26,916 DEBUG: Removing '/home/sid/workspace/.HMJK2Xt7dTDwuiabtdYWRh.tmp'
2023-11-18 18:11:26,916 DEBUG: Removing '/home/sid/workspace/test2/.dvc/cache/files/md5/.NfyncAmxhrw2j8BWgewtgV.tmp'
2023-11-18 18:11:26,918 DEBUG: Version info for developers:
DVC version: 3.28.0
-------------------
Platform: Python 3.11.2 on
Subprojects:
    dvc_data = 2.20.0
    dvc_objects = 1.1.0
    dvc_render = 0.6.0
    dvc_task = 0.3.0
    scmrepo = 1.4.1
Supports:
    azure (adlfs = 2023.10.0, knack = 0.11.0, azure-identity = 1.15.0),
    gdrive (pydrive2 = 1.17.0),
    gs (gcsfs = 2023.9.2),
    hdfs (fsspec = 2023.9.2, pyarrow = 14.0.1),
    http (aiohttp = 3.8.6, aiohttp-retry = 2.8.3),
    https (aiohttp = 3.8.6, aiohttp-retry = 2.8.3),
    oss (ossfs = 2021.8.0),
    s3 (s3fs = 2023.9.2, boto3 = 1.28.17),
    ssh (sshfs = 2023.10.0),
    webdav (webdav4 = 0.9.8),
    webdavs (webdav4 = 0.9.8),
    webhdfs (fsspec = 2023.9.2)
Config:
    Global: /home/sid/.config/dvc
    System: /etc/xdg/dvc
Cache types: hardlink, symlink
Cache directory: ext4 on /dev/sda9
Caches: local
Remotes: webdavs
Workspace directory: ext4 on /dev/sda9
Repo: dvc (no_scm)
Repo.site_cache_dir: /var/tmp/dvc/repo/565945b4dca6a1d7be16ba663c6af60c

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
2023-11-18 18:11:26,919 DEBUG: Analytics is disabled.
```

mdekstrand commented 8 months ago

Thanks @hqdncw! That's exactly the error situation I was seeing.
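
The ERROR line in the verbose output above repeats the same SSL message three times, apparently once for each exception in the chain (`ssl.SSLCertVerificationError`, `httpcore.ConnectError`, `httpx.ConnectError`). A minimal sketch of collapsing identical messages while walking `__cause__` follows. It assumes the messages are joined with `": "` as the log suggests; `chained_message` and `wrap` are hypothetical helpers for illustration, not DVC's logging code.

```python
from typing import List, Optional


def chained_message(exc: BaseException, sep: str = ": ") -> str:
    """Join the messages of an exception and its causes, skipping repeats."""
    parts: List[str] = []
    seen = set()  # guards against accidental cycles in the chain
    current: Optional[BaseException] = exc
    while current is not None and id(current) not in seen:
        seen.add(id(current))
        text = str(current)
        if text and text not in parts:
            parts.append(text)
        current = current.__cause__
    return sep.join(parts)


def wrap(message: str, cause: Optional[BaseException] = None) -> Exception:
    """Build an exception with an explicit cause, as `raise ... from ...` would."""
    error = Exception(message)
    error.__cause__ = cause
    return error


ssl_text = (
    "[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: "
    "self-signed certificate (_ssl.c:992)"
)
# Three chained exceptions carrying the same text, mirroring the traceback.
chain = wrap(ssl_text, wrap(ssl_text, wrap(ssl_text)))
print(chained_message(chain))
```

With the three-level chain, the message is printed once; distinct messages in a chain are still joined in order.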