Closed: @amritpurshotam closed this issue 3 years ago.
@amritpurshotam can you try to explicitly set the GCP project name via `dvc remote modify <remote name> projectname <name of the project>` and try again?
@isidentical I have updated it now, but I'm still getting the same error:
```
2021-03-19 19:17:35,418 DEBUG: Check for update is enabled.
2021-03-19 19:17:35,496 DEBUG: Trying to spawn '['daemon', '-q', 'updater']'
2021-03-19 19:17:35,601 DEBUG: Spawned '['daemon', '-q', 'updater']'
2021-03-19 19:17:36,033 DEBUG: Preparing to upload data to 'gs://mlops-example-training-data/dvcstore/'
2021-03-19 19:17:36,034 DEBUG: Preparing to collect status from gs://mlops-example-training-data/dvcstore/
2021-03-19 19:17:36,035 DEBUG: Collecting information from local cache...
2021-03-19 19:17:36,036 DEBUG: Collecting information from remote cache...
2021-03-19 19:17:36,037 DEBUG: Matched '0' indexed hashes
2021-03-19 19:17:36,038 DEBUG: Querying 1 hashes via object_exists
2021-03-19 19:17:38,912 DEBUG: Uploading '.dvc\cache\4e\1504bf50ddc9dd955a8f67aae2f9ec' to 'gs://mlops-example-training-data/dvcstore/4e/1504bf50ddc9dd955a8f67aae2f9ec'
2021-03-19 19:17:38,998 ERROR: failed to upload '.dvc\cache\4e\1504bf50ddc9dd955a8f67aae2f9ec' to 'gs://mlops-example-training-data/dvcstore/4e/1504bf50ddc9dd955a8f67aae2f9ec' - 'NoneType' object has no attribute 'startswith'
------------------------------------------------------------
Traceback (most recent call last):
  File "c:\development\mlops-example\env\lib\site-packages\dvc\fs\fsspec_wrapper.py", line 111, in _upload_fobj
    shutil.copyfileobj(fobj, fdest, length=fdest.blocksize)
  File "C:\Users\user-pc\AppData\Local\Programs\Python\Python38\lib\shutil.py", line 205, in copyfileobj
    fdst_write(buf)
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\spec.py", line 1360, in write
    self.flush()
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\spec.py", line 1395, in flush
    self._initiate_upload()
  File "c:\development\mlops-example\env\lib\site-packages\gcsfs\core.py", line 1479, in _initiate_upload
    self.location = sync(
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\asyn.py", line 71, in sync
    raise exc.with_traceback(tb)
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\asyn.py", line 55, in f
    result[0] = await future
  File "c:\development\mlops-example\env\lib\site-packages\gcsfs\core.py", line 1565, in initiate_upload
    headers, _ = await fs._call(
  File "c:\development\mlops-example\env\lib\site-packages\gcsfs\core.py", line 531, in _call
    self.validate_response(status, contents, json, path, headers)
  File "c:\development\mlops-example\env\lib\site-packages\gcsfs\core.py", line 1313, in validate_response
    raise FileNotFoundError
FileNotFoundError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\development\mlops-example\env\lib\site-packages\dvc\remote\base.py", line 35, in wrapper
    func(from_info, to_info, *args, **kwargs)
  File "c:\development\mlops-example\env\lib\site-packages\dvc\fs\base.py", line 238, in upload
    self._upload(  # noqa, pylint: disable=no-member
  File "c:\development\mlops-example\env\lib\site-packages\dvc\fs\fsspec_wrapper.py", line 118, in _upload
    self.upload_fobj(
  File "c:\development\mlops-example\env\lib\site-packages\dvc\fs\base.py", line 252, in upload_fobj
    self._upload_fobj(wrapped, to_info)  # pylint: disable=no-member
  File "c:\development\mlops-example\env\lib\site-packages\dvc\fs\fsspec_wrapper.py", line 111, in _upload_fobj
    shutil.copyfileobj(fobj, fdest, length=fdest.blocksize)
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\spec.py", line 1557, in __exit__
    self.close()
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\spec.py", line 1524, in close
    self.flush(force=True)
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\spec.py", line 1397, in flush
    if self._upload_chunk(final=force) is not False:
  File "c:\development\mlops-example\env\lib\site-packages\gcsfs\core.py", line 1448, in _upload_chunk
    headers, contents = self.gcsfs.call(
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\asyn.py", line 121, in wrapper
    return maybe_sync(func, self, *args, **kwargs)
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\asyn.py", line 100, in maybe_sync
    return sync(loop, func, *args, **kwargs)
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\asyn.py", line 71, in sync
    raise exc.with_traceback(tb)
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\asyn.py", line 55, in f
    result[0] = await future
  File "c:\development\mlops-example\env\lib\site-packages\gcsfs\core.py", line 508, in _call
    path, jsonin, datain, headers, kwargs = self._get_args(
  File "c:\development\mlops-example\env\lib\site-packages\gcsfs\core.py", line 483, in _get_args
    if not path.startswith("http"):
AttributeError: 'NoneType' object has no attribute 'startswith'
------------------------------------------------------------
2021-03-19 19:17:39,007 ERROR: failed to push data to the cloud - 1 files failed to upload
------------------------------------------------------------
Traceback (most recent call last):
  File "c:\development\mlops-example\env\lib\site-packages\dvc\command\data_sync.py", line 56, in run
    processed_files_count = self.repo.push(
  File "c:\development\mlops-example\env\lib\site-packages\dvc\repo\__init__.py", line 49, in wrapper
    return f(repo, *args, **kwargs)
  File "c:\development\mlops-example\env\lib\site-packages\dvc\repo\push.py", line 41, in push
    return len(used_run_cache) + self.cloud.push(used, jobs, remote=remote)
  File "c:\development\mlops-example\env\lib\site-packages\dvc\data_cloud.py", line 66, in push
    return remote.push(
  File "c:\development\mlops-example\env\lib\site-packages\dvc\remote\base.py", line 56, in wrapper
    return f(obj, *args, **kwargs)
  File "c:\development\mlops-example\env\lib\site-packages\dvc\remote\base.py", line 466, in push
    ret = self._process(
  File "c:\development\mlops-example\env\lib\site-packages\dvc\remote\base.py", line 342, in _process
    self._process_plans(
  File "c:\development\mlops-example\env\lib\site-packages\dvc\remote\base.py", line 370, in _process_plans
    self._upload_plans(
  File "c:\development\mlops-example\env\lib\site-packages\dvc\remote\base.py", line 453, in _upload_plans
    raise UploadError(total_fails)
dvc.exceptions.UploadError: 1 files failed to upload
------------------------------------------------------------
2021-03-19 19:17:39,013 DEBUG: Analytics is enabled.
2021-03-19 19:17:39,016 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', 'C:\\Users\\user-pc\\AppData\\Local\\Temp\\tmpxh9s3tvj']'
2021-03-19 19:17:39,115 DEBUG: Spawned '['daemon', '-q', 'analytics', 'C:\\Users\\user-pc\\AppData\\Local\\Temp\\tmpxh9s3tvj']'
```
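The chained tracebacks suggest a specific sequence: the resumable-upload initiation raised `FileNotFoundError`, so the session URL (`self.location`) was never set, and the file object's cleanup path then re-entered the upload with that still-`None` location as the request path. A minimal sketch of that failure mode, using hypothetical stand-in methods rather than the real gcsfs internals:

```python
# Hypothetical stand-in for the gcsfs buffered-upload object; this models
# the sequence visible in the traceback, not gcsfs' actual implementation.
class UploadBuffer:
    def __init__(self):
        self.location = None  # resumable-session URL, set on successful initiation

    def initiate_upload(self):
        # Models the project/bucket lookup failing (the FileNotFoundError above).
        raise FileNotFoundError

    def upload_chunk(self):
        # The session URL is passed on as the request path; if initiation
        # failed, it is still None and .startswith() raises AttributeError.
        if not self.location.startswith("http"):
            pass

buf = UploadBuffer()
try:
    buf.initiate_upload()
except FileNotFoundError:
    try:
        buf.upload_chunk()  # cleanup path re-enters the upload
    except AttributeError as exc:
        print(exc)  # -> 'NoneType' object has no attribute 'startswith'
```

This is why the surface error is the unhelpful `AttributeError` rather than the underlying not-found response from the storage API.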
I've double-checked the `.dvc/config` file as well; here it is, and the project name is set correctly:
```ini
[core]
    remote = storage
['remote "storage"']
    url = gs://mlops-example-training-data/dvcstore/
    projectname = mlops-training-example
```
@isidentical I've inspected the request headers now and can see that it's using the wrong Google project:
```json
{
  "User-Agent": "python-gcsfs/0.7.2",
  "authorization": "Bearer <redacted>",
  "x-goog-user-project": "tower-defense-219104"
}
```
@isidentical I've fixed the problem now. It looks like I already had an Application Default Credential (ADC) set at `C:\Users\<username>\AppData\Roaming\gcloud\application_default_credentials.json` which had a `quota_project_id` set to an old project, and the `google.auth` library was then picking that up. To reset the ADC, I ran `gcloud auth application-default login` and re-authorised in the browser.
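If you want to confirm a stale `quota_project_id` before re-authenticating, the ADC file can be read directly, since it is plain JSON. A minimal sketch, assuming the default Windows ADC path mentioned above; the helper name and the inline sample document are illustrative, not part of any official API:

```python
import json
import os

# Default ADC location on Windows (as in the report above); other platforms
# use ~/.config/gcloud/application_default_credentials.json.
ADC_PATH = os.path.expandvars(
    r"%APPDATA%\gcloud\application_default_credentials.json"
)

def quota_project(adc_json: str):
    """Return the quota_project_id recorded in an ADC JSON document, if any."""
    return json.loads(adc_json).get("quota_project_id")

# Inline sample shaped like an ADC file; a real check would read ADC_PATH.
sample = '{"type": "authorized_user", "quota_project_id": "tower-defense-219104"}'
print(quota_project(sample))  # -> tower-defense-219104
```

If the value printed does not match the remote's `projectname`, refreshing the ADC with `gcloud auth application-default login` (as above) regenerates the file.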
## Bug Report

### Description

### Reproduce
Added the dataset to the `/data` directory, then ran:

```console
$ dvc add data/data.csv
100% Add|█████████████|1/1 [00:00, 2.33file/s]

To track the changes with git, run:

You can now commit the changes to git.

$ dvc remote add -d storage gs://mlops-example-training-data/dvcstore
Setting 'storage' as a default remote.
```
```console
$ dvc push -v
2021-03-19 16:46:35,963 DEBUG: Check for update is enabled.
2021-03-19 16:46:36,038 DEBUG: Trying to spawn '['daemon', '-q', 'updater']'
2021-03-19 16:46:36,131 DEBUG: Spawned '['daemon', '-q', 'updater']'
2021-03-19 16:46:36,559 DEBUG: Preparing to upload data to 'gs://mlops-example-training-data/dvcstore'
2021-03-19 16:46:36,561 DEBUG: Preparing to collect status from gs://mlops-example-training-data/dvcstore
2021-03-19 16:46:36,562 DEBUG: Collecting information from local cache...
2021-03-19 16:46:36,564 DEBUG: Collecting information from remote cache...
2021-03-19 16:46:36,565 DEBUG: Matched '0' indexed hashes
2021-03-19 16:46:36,566 DEBUG: Querying 1 hashes via object_exists
2021-03-19 16:46:40,557 DEBUG: Uploading '.dvc\cache\4e\1504bf50ddc9dd955a8f67aae2f9ec' to 'gs://mlops-example-training-data/dvcstore/4e/1504bf50ddc9dd955a8f67aae2f9ec'
2021-03-19 16:46:40,625 ERROR: failed to upload '.dvc\cache\4e\1504bf50ddc9dd955a8f67aae2f9ec' to 'gs://mlops-example-training-data/dvcstore/4e/1504bf50ddc9dd955a8f67aae2f9ec' - 'NoneType' object has no attribute 'startswith'
Traceback (most recent call last):
  File "c:\development\mlops-example\env\lib\site-packages\dvc\fs\fsspec_wrapper.py", line 111, in _upload_fobj
    shutil.copyfileobj(fobj, fdest, length=fdest.blocksize)
  File "C:\Users\user-pc\AppData\Local\Programs\Python\Python38\lib\shutil.py", line 205, in copyfileobj
    fdst_write(buf)
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\spec.py", line 1360, in write
    self.flush()
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\spec.py", line 1395, in flush
    self._initiate_upload()
  File "c:\development\mlops-example\env\lib\site-packages\gcsfs\core.py", line 1479, in _initiate_upload
    self.location = sync(
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\asyn.py", line 71, in sync
    raise exc.with_traceback(tb)
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\asyn.py", line 55, in f
    result[0] = await future
  File "c:\development\mlops-example\env\lib\site-packages\gcsfs\core.py", line 1565, in initiate_upload
    headers, _ = await fs._call(
  File "c:\development\mlops-example\env\lib\site-packages\gcsfs\core.py", line 531, in _call
    self.validate_response(status, contents, json, path, headers)
  File "c:\development\mlops-example\env\lib\site-packages\gcsfs\core.py", line 1313, in validate_response
    raise FileNotFoundError
FileNotFoundError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\development\mlops-example\env\lib\site-packages\dvc\remote\base.py", line 35, in wrapper
    func(from_info, to_info, *args, **kwargs)
  File "c:\development\mlops-example\env\lib\site-packages\dvc\fs\base.py", line 238, in upload
    self._upload(  # noqa, pylint: disable=no-member
  File "c:\development\mlops-example\env\lib\site-packages\dvc\fs\fsspec_wrapper.py", line 118, in _upload
    self.upload_fobj(
  File "c:\development\mlops-example\env\lib\site-packages\dvc\fs\base.py", line 252, in upload_fobj
    self._upload_fobj(wrapped, to_info)  # pylint: disable=no-member
  File "c:\development\mlops-example\env\lib\site-packages\dvc\fs\fsspec_wrapper.py", line 111, in _upload_fobj
    shutil.copyfileobj(fobj, fdest, length=fdest.blocksize)
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\spec.py", line 1557, in __exit__
    self.close()
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\spec.py", line 1524, in close
    self.flush(force=True)
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\spec.py", line 1397, in flush
    if self._upload_chunk(final=force) is not False:
  File "c:\development\mlops-example\env\lib\site-packages\gcsfs\core.py", line 1448, in _upload_chunk
    headers, contents = self.gcsfs.call(
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\asyn.py", line 121, in wrapper
    return maybe_sync(func, self, *args, **kwargs)
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\asyn.py", line 100, in maybe_sync
    return sync(loop, func, *args, **kwargs)
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\asyn.py", line 71, in sync
    raise exc.with_traceback(tb)
  File "c:\development\mlops-example\env\lib\site-packages\fsspec\asyn.py", line 55, in f
    result[0] = await future
  File "c:\development\mlops-example\env\lib\site-packages\gcsfs\core.py", line 508, in _call
    path, jsonin, datain, headers, kwargs = self._get_args(
  File "c:\development\mlops-example\env\lib\site-packages\gcsfs\core.py", line 483, in _get_args
    if not path.startswith("http"):
AttributeError: 'NoneType' object has no attribute 'startswith'

2021-03-19 16:46:40,637 ERROR: failed to push data to the cloud - 1 files failed to upload
Traceback (most recent call last):
  File "c:\development\mlops-example\env\lib\site-packages\dvc\command\data_sync.py", line 56, in run
    processed_files_count = self.repo.push(
  File "c:\development\mlops-example\env\lib\site-packages\dvc\repo\__init__.py", line 49, in wrapper
    return f(repo, *args, **kwargs)
  File "c:\development\mlops-example\env\lib\site-packages\dvc\repo\push.py", line 41, in push
    return len(used_run_cache) + self.cloud.push(used, jobs, remote=remote)
  File "c:\development\mlops-example\env\lib\site-packages\dvc\data_cloud.py", line 66, in push
    return remote.push(
  File "c:\development\mlops-example\env\lib\site-packages\dvc\remote\base.py", line 56, in wrapper
    return f(obj, *args, **kwargs)
  File "c:\development\mlops-example\env\lib\site-packages\dvc\remote\base.py", line 466, in push
    ret = self._process(
  File "c:\development\mlops-example\env\lib\site-packages\dvc\remote\base.py", line 342, in _process
    self._process_plans(
  File "c:\development\mlops-example\env\lib\site-packages\dvc\remote\base.py", line 370, in _process_plans
    self._upload_plans(
  File "c:\development\mlops-example\env\lib\site-packages\dvc\remote\base.py", line 453, in _upload_plans
    raise UploadError(total_fails)
dvc.exceptions.UploadError: 1 files failed to upload
2021-03-19 16:46:40,645 DEBUG: Analytics is enabled.
2021-03-19 16:46:40,649 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', 'C:\\Users\\user-pc\\AppData\\Local\\Temp\\tmpbbvblr3k']'
2021-03-19 16:46:40,788 DEBUG: Spawned '['daemon', '-q', 'analytics', 'C:\\Users\\user-pc\\AppData\\Local\\Temp\\tmpbbvblr3k']'
```
Additional Information (if any): I've tested copying files to my bucket with `gsutil cp` and those work fine, so I know it's not a connection or authentication issue. I've continued debugging by looking at the response from the POST API call to `https://www.googleapis.com/upload/storage`, and the `bucket` parameter it's sending to the API also appears to have been corrupted somewhere: at the time it makes the call, it's `lops-example-training-datacsv`. The relevant code is in the `gcsfs` library, in the `core.py` file where the API call is made. Still trying to figure out why the bucket name is different, though; as far as I can tell when I trace back up the call stack the name seems to be correct, but I'm still double-checking.
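One quick way to sanity-check the bucket/key split before it reaches gcsfs is to parse the remote URL yourself. A minimal sketch; the helper below is illustrative and not how gcsfs actually parses paths:

```python
from urllib.parse import urlsplit

def split_gs_url(url: str):
    """Split a gs:// URL into (bucket, key) for a quick sanity check."""
    parts = urlsplit(url)
    if parts.scheme != "gs":
        raise ValueError(f"expected a gs:// URL, got scheme {parts.scheme!r}")
    # netloc is the bucket; the leading slash is stripped from the key.
    return parts.netloc, parts.path.lstrip("/")

print(split_gs_url("gs://mlops-example-training-data/dvcstore/"))
# -> ('mlops-example-training-data', 'dvcstore/')
```

If this prints the expected bucket while the outgoing request carries a mangled name, the corruption is happening somewhere below the URL-parsing layer.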