iterative / dvc

🦉 ML Experiments and Data Management with Git
https://dvc.org
Apache License 2.0

push: failed to push data to the cloud (GCS) #5657

Closed amritpurshotam closed 3 years ago

amritpurshotam commented 3 years ago

## Bug Report

### Description

### Reproduce

  1. Copy data to /data directory
  2. Run the below commands
    
$ dvc init
Initialized DVC repository.

You can now commit the changes to git.

DVC has enabled anonymous aggregate usage analytics.
Read the analytics documentation (and how to opt-out) here:
https://dvc.org/doc/user-guide/analytics

What's next?

$ dvc add data/data.csv
100% Add|█████████████|1/1 [00:00, 2.33file/s]

To track the changes with git, run:

    git add 'data\.gitignore' 'data\data.csv.dvc'

$ dvc remote add -d storage gs://mlops-example-training-data/dvcstore
Setting 'storage' as a default remote.

$ dvc push -v
2021-03-19 16:46:35,963 DEBUG: Check for update is enabled.
2021-03-19 16:46:36,038 DEBUG: Trying to spawn '['daemon', '-q', 'updater']'
2021-03-19 16:46:36,131 DEBUG: Spawned '['daemon', '-q', 'updater']'
2021-03-19 16:46:36,559 DEBUG: Preparing to upload data to 'gs://mlops-example-training-data/dvcstore'
2021-03-19 16:46:36,561 DEBUG: Preparing to collect status from gs://mlops-example-training-data/dvcstore
2021-03-19 16:46:36,562 DEBUG: Collecting information from local cache...
2021-03-19 16:46:36,564 DEBUG: Collecting information from remote cache...
2021-03-19 16:46:36,565 DEBUG: Matched '0' indexed hashes
2021-03-19 16:46:36,566 DEBUG: Querying 1 hashes via object_exists
2021-03-19 16:46:40,557 DEBUG: Uploading '.dvc\cache\4e\1504bf50ddc9dd955a8f67aae2f9ec' to 'gs://mlops-example-training-data/dvcstore/4e/1504bf50ddc9dd955a8f67aae2f9ec'
2021-03-19 16:46:40,625 ERROR: failed to upload '.dvc\cache\4e\1504bf50ddc9dd955a8f67aae2f9ec' to 'gs://mlops-example-training-data/dvcstore/4e/1504bf50ddc9dd955a8f67aae2f9ec' - 'NoneType' object has no attribute 'startswith'

Traceback (most recent call last):
  File "c:\development\mlops-example\env\lib\site-packages\dvc\fs\fsspec_wrapper.py", line 111, in _upload_fobj
    shutil.copyfileobj(fobj, fdest, length=fdest.blocksize)
  File "C:\Users\user-pc\AppData\Local\Programs\Python\Python38\lib\shutil.py", line 205, in copyfileobj
    fdst_write(buf)
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\spec.py", line 1360, in write
    self.flush()
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\spec.py", line 1395, in flush
    self._initiate_upload()
  File "c:\development\mlops-example\env\lib\site-packages\gcsfs\core.py", line 1479, in _initiate_upload
    self.location = sync(
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\asyn.py", line 71, in sync
    raise exc.with_traceback(tb)
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\asyn.py", line 55, in f
    result[0] = await future
  File "c:\development\mlops-example\env\lib\site-packages\gcsfs\core.py", line 1565, in initiate_upload
    headers, _ = await fs._call(
  File "c:\development\mlops-example\env\lib\site-packages\gcsfs\core.py", line 531, in _call
    self.validate_response(status, contents, json, path, headers)
  File "c:\development\mlops-example\env\lib\site-packages\gcsfs\core.py", line 1313, in validate_response
    raise FileNotFoundError
FileNotFoundError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\development\mlops-example\env\lib\site-packages\dvc\remote\base.py", line 35, in wrapper
    func(from_info, to_info, *args, **kwargs)
  File "c:\development\mlops-example\env\lib\site-packages\dvc\fs\base.py", line 238, in upload
    self._upload(  # noqa, pylint: disable=no-member
  File "c:\development\mlops-example\env\lib\site-packages\dvc\fs\fsspec_wrapper.py", line 118, in _upload
    self.upload_fobj(
  File "c:\development\mlops-example\env\lib\site-packages\dvc\fs\base.py", line 252, in upload_fobj
    self._upload_fobj(wrapped, to_info)  # pylint: disable=no-member
  File "c:\development\mlops-example\env\lib\site-packages\dvc\fs\fsspec_wrapper.py", line 111, in _upload_fobj
    shutil.copyfileobj(fobj, fdest, length=fdest.blocksize)
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\spec.py", line 1557, in __exit__
    self.close()
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\spec.py", line 1524, in close
    self.flush(force=True)
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\spec.py", line 1397, in flush
    if self._upload_chunk(final=force) is not False:
  File "c:\development\mlops-example\env\lib\site-packages\gcsfs\core.py", line 1448, in _upload_chunk
    headers, contents = self.gcsfs.call(
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\asyn.py", line 121, in wrapper
    return maybe_sync(func, self, *args, **kwargs)
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\asyn.py", line 100, in maybe_sync
    return sync(loop, func, *args, **kwargs)
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\asyn.py", line 71, in sync
    raise exc.with_traceback(tb)
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\asyn.py", line 55, in f
    result[0] = await future
  File "c:\development\mlops-example\env\lib\site-packages\gcsfs\core.py", line 508, in _call
    path, jsonin, datain, headers, kwargs = self._get_args(
  File "c:\development\mlops-example\env\lib\site-packages\gcsfs\core.py", line 483, in _get_args
    if not path.startswith("http"):
AttributeError: 'NoneType' object has no attribute 'startswith'

2021-03-19 16:46:40,637 ERROR: failed to push data to the cloud - 1 files failed to upload

Traceback (most recent call last):
  File "c:\development\mlops-example\env\lib\site-packages\dvc\command\data_sync.py", line 56, in run
    processed_files_count = self.repo.push(
  File "c:\development\mlops-example\env\lib\site-packages\dvc\repo\__init__.py", line 49, in wrapper
    return f(repo, *args, **kwargs)
  File "c:\development\mlops-example\env\lib\site-packages\dvc\repo\push.py", line 41, in push
    return len(used_run_cache) + self.cloud.push(used, jobs, remote=remote)
  File "c:\development\mlops-example\env\lib\site-packages\dvc\data_cloud.py", line 66, in push
    return remote.push(
  File "c:\development\mlops-example\env\lib\site-packages\dvc\remote\base.py", line 56, in wrapper
    return f(obj, *args, **kwargs)
  File "c:\development\mlops-example\env\lib\site-packages\dvc\remote\base.py", line 466, in push
    ret = self._process(
  File "c:\development\mlops-example\env\lib\site-packages\dvc\remote\base.py", line 342, in _process
    self._process_plans(
  File "c:\development\mlops-example\env\lib\site-packages\dvc\remote\base.py", line 370, in _process_plans
    self._upload_plans(
  File "c:\development\mlops-example\env\lib\site-packages\dvc\remote\base.py", line 453, in _upload_plans
    raise UploadError(total_fails)
dvc.exceptions.UploadError: 1 files failed to upload

2021-03-19 16:46:40,645 DEBUG: Analytics is enabled.
2021-03-19 16:46:40,649 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', 'C:\\Users\\user-pc\\AppData\\Local\\Temp\\tmpbbvblr3k']'
2021-03-19 16:46:40,788 DEBUG: Spawned '['daemon', '-q', 'analytics', 'C:\\Users\\user-pc\\AppData\\Local\\Temp\\tmpbbvblr3k']'


### Expected


My data to sync with my Google Cloud Storage bucket.

### Environment information


**Output of `dvc doctor`:**

```console
DVC version: 2.0.6 (pip)
---------------------------------
Platform: Python 3.8.5 on Windows-10-10.0.19041-SP0
Supports: gs, http, https
Cache types: hardlink
Cache directory: NTFS on C:\
Caches: local
Remotes: gs
Workspace directory: NTFS on C:\
Repo: dvc, git
```

**Additional Information (if any):** I've tested copying files to my bucket with `gsutil cp` and those work fine, so I know it's not a connection or authentication issue.

I've continued debugging. Looking at the response from the POST API call to https://www.googleapis.com/upload/storage, this is the response I get:

```json
{
    "error": {
        "code": 404,
        "message": "The requested project was not found.",
        "errors": [{
                "message": "The requested project was not found.",
                "domain": "global",
                "reason": "notFound"
            }
        ]
    }
}
```

The bucket parameter being sent to the API also appears to have been corrupted somewhere: at the time the call is made it is lops-example-training-datacsv. Below is the relevant code from the gcsfs library (in core.py) where the API call is made. I'm still trying to figure out why the bucket name is different; as far as I can tell when I trace back up the call stack the name is correct, but I'm still double-checking.

```python
async def initiate_upload(
    fs, bucket, key, content_type="application/octet-stream", metadata=None
):
    j = {"name": key}
    if metadata:
        j["metadata"] = metadata
    headers, _ = await fs._call(
        method="POST",
        path="https://www.googleapis.com/upload/storage"
        "/v1/b/%s/o" % quote_plus(bucket),
        uploadType="resumable",
        json=j,
        headers={"X-Upload-Content-Type": content_type},
    )
    loc = headers["Location"]
    out = loc[0] if isinstance(loc, list) else loc  # <- for CVR responses
    if len(str(loc)) < 20:
        logger.error("Location failed: %s" % headers)
    return out
```
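Two quick sanity checks on the code above (a minimal sketch, not gcsfs itself — `build_upload_path` is a hypothetical helper that just mirrors the string formatting in `initiate_upload`): `quote_plus` leaves an ordinary bucket name untouched, so the corruption must originate further up the call stack; and a `None` upload location fed into the later `path.startswith("http")` check reproduces exactly the AttributeError from the traceback.

```python
from urllib.parse import quote_plus

def build_upload_path(bucket):
    # Hypothetical helper mirroring the path construction in
    # initiate_upload above.
    return "https://www.googleapis.com/upload/storage/v1/b/%s/o" % quote_plus(bucket)

# A plain bucket name passes through quote_plus unchanged, so the
# mangled name is not produced at this point:
print(build_upload_path("mlops-example-training-data"))

# If the POST 404s and no upload location is obtained, a later
# startswith check on the None path raises the error seen in the log:
path = None
try:
    path.startswith("http")
except AttributeError as exc:
    print(exc)  # 'NoneType' object has no attribute 'startswith'
```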
isidentical commented 3 years ago

@amritpurshotam can you try to explicitly set the GCP project name via `dvc remote modify <remote name> projectname <name of the project>` and try again?

amritpurshotam commented 3 years ago

@isidentical I've updated it now but I'm still getting the same error:

2021-03-19 19:17:35,418 DEBUG: Check for update is enabled.
2021-03-19 19:17:35,496 DEBUG: Trying to spawn '['daemon', '-q', 'updater']'
2021-03-19 19:17:35,601 DEBUG: Spawned '['daemon', '-q', 'updater']'  
2021-03-19 19:17:36,033 DEBUG: Preparing to upload data to 'gs://mlops-example-training-data/dvcstore/'
2021-03-19 19:17:36,034 DEBUG: Preparing to collect status from gs://mlops-example-training-data/dvcstore/
2021-03-19 19:17:36,035 DEBUG: Collecting information from local cache...
2021-03-19 19:17:36,036 DEBUG: Collecting information from remote cache...
2021-03-19 19:17:36,037 DEBUG: Matched '0' indexed hashes
2021-03-19 19:17:36,038 DEBUG: Querying 1 hashes via object_exists
2021-03-19 19:17:38,912 DEBUG: Uploading '.dvc\cache\4e\1504bf50ddc9dd955a8f67aae2f9ec' to 'gs://mlops-example-training-data/dvcstore/4e/1504bf50ddc9dd955a8f67aae2f9ec'
2021-03-19 19:17:38,998 ERROR: failed to upload '.dvc\cache\4e\1504bf50ddc9dd955a8f67aae2f9ec' to 'gs://mlops-example-training-data/dvcstore/4e/1504bf50ddc9dd955a8f67aae2f9ec' - 'NoneType' object has no attribute 'startswith'
------------------------------------------------------------
Traceback (most recent call last):
  File "c:\development\mlops-example\env\lib\site-packages\dvc\fs\fsspec_wrapper.py", line 111, in _upload_fobj
    shutil.copyfileobj(fobj, fdest, length=fdest.blocksize)
  File "C:\Users\user-pc\AppData\Local\Programs\Python\Python38\lib\shutil.py", line 205, in copyfileobj
    fdst_write(buf)
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\spec.py", line 1360, in write
    self.flush()
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\spec.py", line 1395, in flush
    self._initiate_upload()
  File "c:\development\mlops-example\env\lib\site-packages\gcsfs\core.py", line 1479, in _initiate_upload
    self.location = sync(
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\asyn.py", line 71, in sync
    raise exc.with_traceback(tb)
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\asyn.py", line 55, in f
    result[0] = await future
  File "c:\development\mlops-example\env\lib\site-packages\gcsfs\core.py", line 1565, in initiate_upload
    headers, _ = await fs._call(
  File "c:\development\mlops-example\env\lib\site-packages\gcsfs\core.py", line 531, in _call
    self.validate_response(status, contents, json, path, headers)
  File "c:\development\mlops-example\env\lib\site-packages\gcsfs\core.py", line 1313, in validate_response
    raise FileNotFoundError
FileNotFoundError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\development\mlops-example\env\lib\site-packages\dvc\remote\base.py", line 35, in wrapper
    func(from_info, to_info, *args, **kwargs)
  File "c:\development\mlops-example\env\lib\site-packages\dvc\fs\base.py", line 238, in upload
    self._upload(  # noqa, pylint: disable=no-member
  File "c:\development\mlops-example\env\lib\site-packages\dvc\fs\fsspec_wrapper.py", line 118, in _upload
    self.upload_fobj(
  File "c:\development\mlops-example\env\lib\site-packages\dvc\fs\base.py", line 252, in upload_fobj
    self._upload_fobj(wrapped, to_info)  # pylint: disable=no-member
  File "c:\development\mlops-example\env\lib\site-packages\dvc\fs\fsspec_wrapper.py", line 111, in _upload_fobj
    shutil.copyfileobj(fobj, fdest, length=fdest.blocksize)
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\spec.py", line 1557, in __exit__
    self.close()
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\spec.py", line 1524, in close
    self.flush(force=True)
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\spec.py", line 1397, in flush
    if self._upload_chunk(final=force) is not False:
  File "c:\development\mlops-example\env\lib\site-packages\gcsfs\core.py", line 1448, in _upload_chunk
    headers, contents = self.gcsfs.call(
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\asyn.py", line 121, in wrapper
    return maybe_sync(func, self, *args, **kwargs)
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\asyn.py", line 100, in maybe_sync
    return sync(loop, func, *args, **kwargs)
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\asyn.py", line 71, in sync
    raise exc.with_traceback(tb)
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\asyn.py", line 55, in f
    result[0] = await future
  File "c:\development\mlops-example\env\lib\site-packages\gcsfs\core.py", line 508, in _call
    path, jsonin, datain, headers, kwargs = self._get_args(
  File "c:\development\mlops-example\env\lib\site-packages\gcsfs\core.py", line 483, in _get_args
    if not path.startswith("http"):
AttributeError: 'NoneType' object has no attribute 'startswith'
------------------------------------------------------------
2021-03-19 19:17:39,007 ERROR: failed to push data to the cloud - 1 files failed to upload
------------------------------------------------------------
Traceback (most recent call last):
  File "c:\development\mlops-example\env\lib\site-packages\dvc\command\data_sync.py", line 56, in run
    processed_files_count = self.repo.push(
  File "c:\development\mlops-example\env\lib\site-packages\dvc\repo\__init__.py", line 49, in wrapper
    return f(repo, *args, **kwargs)
  File "c:\development\mlops-example\env\lib\site-packages\dvc\repo\push.py", line 41, in push
    return len(used_run_cache) + self.cloud.push(used, jobs, remote=remote)
  File "c:\development\mlops-example\env\lib\site-packages\dvc\data_cloud.py", line 66, in push
    return remote.push(
  File "c:\development\mlops-example\env\lib\site-packages\dvc\remote\base.py", line 56, in wrapper
    return f(obj, *args, **kwargs)
  File "c:\development\mlops-example\env\lib\site-packages\dvc\remote\base.py", line 466, in push
    ret = self._process(
  File "c:\development\mlops-example\env\lib\site-packages\dvc\remote\base.py", line 342, in _process
    self._process_plans(
  File "c:\development\mlops-example\env\lib\site-packages\dvc\remote\base.py", line 370, in _process_plans
    self._upload_plans(
  File "c:\development\mlops-example\env\lib\site-packages\dvc\remote\base.py", line 453, in _upload_plans
    raise UploadError(total_fails)
dvc.exceptions.UploadError: 1 files failed to upload
------------------------------------------------------------
2021-03-19 19:17:39,013 DEBUG: Analytics is enabled.
2021-03-19 19:17:39,016 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', 'C:\\Users\\user-pc\\AppData\\Local\\Temp\\tmpxh9s3tvj']'
2021-03-19 19:17:39,115 DEBUG: Spawned '['daemon', '-q', 'analytics', 'C:\\Users\\user-pc\\AppData\\Local\\Temp\\tmpxh9s3tvj']'

I've double-checked the .dvc/config file as well. Here it is; the project name is set correctly:

```ini
[core]
    remote = storage
['remote "storage"']
    url = gs://mlops-example-training-data/dvcstore/
    projectname = mlops-training-example
```

amritpurshotam commented 3 years ago

@isidentical I've inspected the request headers now and can see that it's using the wrong Google Cloud project:

```json
{
    "User-Agent": "python-gcsfs/0.7.2",
    "authorization": "Bearer <redacted>",
    "x-goog-user-project": "tower-defense-219104"
}
```

amritpurshotam commented 3 years ago

@isidentical I've fixed the problem now. It looks like I already had Application Default Credentials (ADC) at `C:\Users\<username>\AppData\Roaming\gcloud\application_default_credentials.json` with a `quota_project_id` set to an old project, which the `google.auth` library was then picking up. To reset the ADC, I ran `gcloud auth application-default login` and re-authorised in the browser.
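For anyone hitting the same thing, here is a quick way to check which quota project an ADC file pins before re-authorising (a minimal sketch; `adc_quota_project` is a hypothetical helper, and the path shown is the default ADC location on Windows mentioned above):

```python
import json
import os

def adc_quota_project(adc_path):
    """Return the quota_project_id recorded in an ADC JSON file, or None."""
    with open(adc_path, encoding="utf-8") as f:
        return json.load(f).get("quota_project_id")

# Default ADC location on Windows (%APPDATA% resolves to
# C:\Users\<username>\AppData\Roaming):
adc_path = os.path.join(
    os.environ.get("APPDATA", ""), "gcloud", "application_default_credentials.json"
)
if os.path.exists(adc_path):
    print(adc_quota_project(adc_path))
```

If this prints a stale project, regenerating the file with `gcloud auth application-default login` (as above) fixes the push.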