googleapis / google-cloud-python

Google Cloud Client Library for Python
https://googleapis.github.io/google-cloud-python/
Apache License 2.0

blob.download_to_filename fails with google-cloud-storage==1.3.0 #3736

Closed sonlac closed 7 years ago

sonlac commented 7 years ago

Due to notable implementation changes in google-cloud-storage 1.3.0, blob.download_to_filename fails with every google-cloud release from 0.24.0 to 0.27.0 (latest).

Our prod server, which uses google-cloud==0.24.0, has been broken since last Saturday. I analyzed the error a bit. The problem comes from the upgraded package google-cloud-storage==1.3.0. We were using google-cloud-storage==1.2.0.

I ran several tests, which revealed the following problems:

Problem 1: blob.download_to_filename fails when using google-cloud==0.27.0 (which includes google-cloud-storage==1.3.0 by default, as defined in setup.py). Here is part of the stack trace:

blob.download_to_filename(blob_local_path)
  File "/root/.local/lib/python2.7/site-packages/google/cloud/storage/blob.py", line 482, in download_to_filename
    self.download_to_file(file_obj, client=client)
  File "/root/.local/lib/python2.7/site-packages/google/cloud/storage/blob.py", line 464, in download_to_file
    self._do_download(transport, file_obj, download_url, headers)
  File "/root/.local/lib/python2.7/site-packages/google/cloud/storage/blob.py", line 418, in _do_download
    download.consume(transport)
  File "/root/.local/lib/python2.7/site-packages/google/resumable_media/requests/download.py", line 101, in consume
    self._write_to_stream(result)
  File "/root/.local/lib/python2.7/site-packages/google/resumable_media/requests/download.py", line 62, in _write_to_stream
    with response:
AttributeError: __exit__

Problem 2: blob.download_to_filename fails when using google-cloud==0.24.0 (which includes google-cloud-storage==1.3.0 by default, as defined in setup.py). Here is part of the stack trace:

File "dev_env/python_venv/local/lib/python2.7/site-packages/google/cloud/storage/blob.py", line 482, in download_to_filename
    self.download_to_file(file_obj, client=client)
  File "dev_env/python_venv/local/lib/python2.7/site-packages/google/cloud/storage/blob.py", line 464, in download_to_file
    self._do_download(transport, file_obj, download_url, headers)
  File "dev_env/python_venv/local/lib/python2.7/site-packages/google/cloud/storage/blob.py", line 418, in _do_download
    download.consume(transport)
  File "dev_env/python_venv/local/lib/python2.7/site-packages/google/resumable_media/requests/download.py", line 96, in consume
    transport, method, url, **request_kwargs)
  File "dev_env/python_venv/local/lib/python2.7/site-packages/google/resumable_media/requests/_helpers.py", line 101, in http_request
    func, RequestsMixin._get_status_code, retry_strategy)
  File "dev_env/python_venv/local/lib/python2.7/site-packages/google/resumable_media/_helpers.py", line 146, in wait_and_retry
    response = func()
  File "dev_env/python_venv/local/lib/python2.7/site-packages/google_auth_httplib2.py", line 198, in request
    uri, method, body=body, headers=request_headers, **kwargs)
TypeError: request() got an unexpected keyword argument 'data'

So why did it happen? I found the cause in the REQUIREMENTS of the google-cloud package, which are defined in setup.py. For example, for google-cloud==0.27.0: https://github.com/GoogleCloudPlatform/google-cloud-python/blob/master/setup.py#L67

We can see that if we only pin google-cloud to a version from 0.24.0 to 0.27.0, "pip install" always tries to install google-cloud-storage 1.3.0, which requires the dependency google-cloud-core ~= 0.26. That's why the bug happens.

Solution:

Suggestion: better manage the deps for google-cloud in setup.py.

Thanks.

dhermes commented 7 years ago

@sonlac It looks like you're using a custom transport. When constructing Client, are you passing a custom _http argument? (Note that the leading underscore in _http is a "user-beware" signal.)

For reference, see the release notes:

sonlac commented 7 years ago

@dhermes Thank you for your quick reply. No, I didn't pass any custom argument. I am testing with the latest code, and all arguments are the defaults. It seems that this change https://github.com/GoogleCloudPlatform/google-cloud-python/pull/3705/files broke google-cloud-storage 1.3.0? For my part, I am trying to get more detailed logs...

dhermes commented 7 years ago

Before making a request, can you print client._http?

dhermes commented 7 years ago

RE: #3705, notice the code in _make_transport and compare it to the _http property. You'll see they are the same. Also, using the default _http, our system tests are passing.

lukesneeringer commented 7 years ago

@sonlac Can you give us the output of pip freeze? In particular, I would like to see what version of requests you have installed. I am thinking our lower bound might be too low.

If it is a semi-old one, can you also try updating requests to the newest version and tell us if that fixes your issue?

sonlac commented 7 years ago

Thank you @dhermes @lukesneeringer for the comments. So problem 1 is about "user-beware"; it was not a bug. As for problem 2, I could not reproduce it in an "out-of-box" context. I've just tested simple code using download_to_filename, and it worked with the latest versions: google-cloud 0.27.0 and google-cloud-storage 1.3.0. In fact, I ran the code inside ML Engine, so it will take time to get more logs... I'll get back to you with more information.

pip freeze

cachetools==2.0.0
certifi==2017.7.27.1
chardet==3.0.4
dill==0.2.7.1
enum34==1.1.6
future==0.16.0
futures==3.1.1
gapic-google-cloud-datastore-v1==0.15.3
gapic-google-cloud-error-reporting-v1beta1==0.15.3
gapic-google-cloud-logging-v2==0.91.3
gapic-google-cloud-pubsub-v1==0.15.4
gapic-google-cloud-spanner-admin-database-v1==0.15.3
gapic-google-cloud-spanner-admin-instance-v1==0.15.3
gapic-google-cloud-spanner-v1==0.15.3
google-auth==1.0.2
google-cloud==0.27.0
google-cloud-bigquery==0.26.0
google-cloud-bigtable==0.26.0
google-cloud-core==0.26.0
google-cloud-datastore==1.2.0
google-cloud-dns==0.26.0
google-cloud-error-reporting==0.26.0
google-cloud-language==0.27.0
google-cloud-logging==1.2.0
google-cloud-monitoring==0.26.0
google-cloud-pubsub==0.27.0
google-cloud-resource-manager==0.26.0
google-cloud-runtimeconfig==0.26.0
google-cloud-spanner==0.26.0
google-cloud-speech==0.28.0
google-cloud-storage==1.3.0
google-cloud-translate==1.1.0
google-cloud-videointelligence==0.25.0
google-cloud-vision==0.26.0
google-gax==0.15.13
google-resumable-media==0.2.2
googleapis-common-protos==1.5.2
grpc-google-iam-v1==0.11.1
grpcio==1.4.0
httplib2==0.10.3
idna==2.5
monotonic==1.3
oauth2client==3.0.0
ply==3.8
proto-google-cloud-datastore-v1==0.90.4
proto-google-cloud-error-reporting-v1beta1==0.15.3
proto-google-cloud-logging-v2==0.91.3
proto-google-cloud-pubsub-v1==0.15.4
proto-google-cloud-spanner-admin-database-v1==0.15.3
proto-google-cloud-spanner-admin-instance-v1==0.15.3
proto-google-cloud-spanner-v1==0.15.3
protobuf==3.3.0
pyasn1==0.3.2
pyasn1-modules==0.0.11
requests==2.18.3
rsa==3.4.2
six==1.10.0
tenacity==4.4.0
urllib3==1.22

dhermes commented 7 years ago

@sonlac I am going to preemptively close this based on

I could not reproduce it in a context "out-of-box"

We're happy to keep discussing and re-open if there does turn out to be some issue.

guy-shahine commented 7 years ago

Hey guys, I faced the same issue here. The code ran fine on my local Mac and on an Ubuntu VM in Compute Engine, but failed on a Debian VM in Compute Engine that was deployed through Salt.

pip freeze showed requests==2.7.0 on the Debian VM, whereas it is requests==2.18.3 on my machine. So I ran pip install --upgrade requests, and the AttributeError: __exit__ exception that broke the flow is gone.
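The `AttributeError: __exit__` in the tracebacks above is what Python 2.7 and early Python 3 releases raise when a `with` statement receives an object that does not implement the context manager protocol. A minimal sketch that reproduces the failure without any network access follows. The classes `OldStyleResponse` and `NewStyleResponse` are hypothetical stand-ins, not the real `requests.Response`, and newer interpreters report the same problem as `AttributeError: __enter__` or as a `TypeError`, so the sketch catches both.

```python
class OldStyleResponse(object):
    """Hypothetical stand-in for a response object without `with` support."""

    def close(self):
        pass


class NewStyleResponse(OldStyleResponse):
    """Hypothetical stand-in that adds the context manager protocol."""

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.close()
        return False  # do not swallow exceptions raised inside the block


def usable_in_with(obj):
    """Return True if `with obj:` runs, False if the protocol is missing."""
    try:
        with obj:
            pass
    except (AttributeError, TypeError):
        # AttributeError on older interpreters, TypeError on Python 3.11+.
        return False
    return True


print(usable_in_with(OldStyleResponse()))  # False
print(usable_in_with(NewStyleResponse()))  # True
```

This matches the symptom in the thread: the download code uses `with response:`, so a response class from a release that predates context manager support fails at exactly that line.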

dhermes commented 7 years ago

@lukesneeringer @jonparrott We should probably do a release of storage that enforces that lower bound (code is already in master)?

theacodes commented 7 years ago

Seems reasonable

ashwathnrajan commented 7 years ago

Hello, I'm having the same problem here. It's really easy for me to reproduce, and I'm using a Google Compute Engine VM running Ubuntu 16.04. I don't think it's just a cloud-storage/requests versioning issue, as I have the latest versions of both installed, as well as the latest google-cloud.

I can reproduce the bug really simply.

from google.cloud import storage

# Replace the placeholder strings with real values before running.
client = storage.Client()
bucket = client.get_bucket("<STORAGE_BUCKET>")
blob = bucket.get_blob("<PATH_TO_VIDEO>")
blob.download_to_filename("<LOCAL_PATH_TO_VIDEO>")

will raise

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.5/dist-packages/google/cloud/storage/blob.py", line 482, in download_to_filename
    self.download_to_file(file_obj, client=client)
  File "/usr/local/lib/python3.5/dist-packages/google/cloud/storage/blob.py", line 464, in download_to_file
    self._do_download(transport, file_obj, download_url, headers)
  File "/usr/local/lib/python3.5/dist-packages/google/cloud/storage/blob.py", line 418, in _do_download
    download.consume(transport)
  File "/usr/local/lib/python3.5/dist-packages/google/resumable_media/requests/download.py", line 101, in consume
    self._write_to_stream(result)
  File "/usr/local/lib/python3.5/dist-packages/google/resumable_media/requests/download.py", line 62, in _write_to_stream
    with response:
AttributeError: __exit__

Finally, here is my pip freeze (pip3 freeze):

bleach==1.5.0
blinker==1.3
boto==2.38.0
cachetools==2.0.1
certifi==2017.7.27.1
chardet==2.3.0
cloud-init==0.7.9
command-not-found==0.3
configobj==5.0.6
cryptography==1.2.3
dill==0.2.7.1
future==0.16.0
gapic-google-cloud-datastore-v1==0.15.3
gapic-google-cloud-error-reporting-v1beta1==0.15.3
gapic-google-cloud-logging-v2==0.91.3
gapic-google-cloud-pubsub-v1==0.15.4
gapic-google-cloud-spanner-admin-database-v1==0.15.3
gapic-google-cloud-spanner-admin-instance-v1==0.15.3
gapic-google-cloud-spanner-v1==0.15.3
google-auth==1.0.2
google-cloud==0.27.0
google-cloud-bigquery==0.26.0
google-cloud-bigtable==0.26.0
google-cloud-core==0.26.0
google-cloud-datastore==1.2.0
google-cloud-dns==0.26.0
google-cloud-error-reporting==0.26.0
google-cloud-language==0.27.0
google-cloud-logging==1.2.0
google-cloud-monitoring==0.26.0
google-cloud-pubsub==0.27.0
google-cloud-resource-manager==0.26.0
google-cloud-runtimeconfig==0.26.0
google-cloud-spanner==0.26.0
google-cloud-speech==0.28.0
google-cloud-storage==1.3.2
google-cloud-translate==1.1.0
google-cloud-videointelligence==0.25.0
google-cloud-vision==0.26.0
google-compute-engine==2.4.1
google-gax==0.15.14
google-resumable-media==0.2.3
googleapis-common-protos==1.5.2
grpc-google-iam-v1==0.11.1
grpcio==1.4.0
html5lib==0.9999999
httplib2==0.10.3
idna==2.0
imutils==0.4.3
Jinja2==2.8
jsonpatch==1.10
jsonpointer==1.9
language-selector==0.1
Markdown==2.6.9
MarkupSafe==0.23
monotonic==1.3
numpy==1.13.1
oauth2client==3.0.0
oauthlib==1.0.3
ply==3.8
prettytable==0.7.2
proto-google-cloud-datastore-v1==0.90.4
proto-google-cloud-error-reporting-v1beta1==0.15.3
proto-google-cloud-logging-v2==0.91.3
proto-google-cloud-pubsub-v1==0.15.4
proto-google-cloud-spanner-admin-database-v1==0.15.3
proto-google-cloud-spanner-admin-instance-v1==0.15.3
proto-google-cloud-spanner-v1==0.15.3
protobuf==3.4.0
pyasn1==0.3.2
pyasn1-modules==0.0.11
pycurl==7.43.0
pygobject==3.20.0
PyJWT==1.3.0
pyserial==3.0.1
python-apt==1.1.0b1
python-debian==0.1.27
python-systemd==231
PyYAML==3.11
requests==2.9.1
rsa==3.4.2
six==1.10.0
ssh-import-id==5.5
tenacity==4.4.0
tensorflow==1.3.0
tensorflow-gpu==1.3.0
tensorflow-tensorboard==0.1.4
ufw==0.35
unattended-upgrades==0.1
urllib3==1.13.1
Werkzeug==0.12.2

dhermes commented 7 years ago

@ashwathnrajan You have requests==2.9.1, we need >= 2.18.0.
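Reading `2.9.1` as newer than `2.18.0` is an easy mistake, because each dot-separated component is an integer rather than a decimal digit. A minimal sketch of the comparison, assuming plain numeric release versions with no pre-release suffixes such as `rc1`. The names `parse_release` and `meets_minimum` are illustrative, and the `2.18.0` bound is the one quoted in this thread.

```python
def parse_release(version):
    """Turn '2.9.1' into (2, 9, 1) so components compare as integers."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum`."""
    return parse_release(installed) >= parse_release(minimum)


for candidate in ("2.7.0", "2.9.1", "2.18.3"):
    print(candidate, meets_minimum(candidate, "2.18.0"))
```

Only `2.18.3` passes the check. For versions that carry suffixes, a dedicated parser such as `packaging.version.Version` is likely a safer choice than this tuple split.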

ashwathnrajan commented 7 years ago

meh. one day i'll learn to compare numbers. thanks, i'm sure this will fix it

naoko commented 7 years ago

I'm getting this error with the following versions:

requests==2.18.4
google-cloud-storage==1.3.2

this is my entire pip freeze output

+ pip freeze
alembic==0.9.5
arrow==0.8.0
beautifulsoup4==4.6.0
boto==2.48.0
bz2file==0.98
cachetools==2.0.1
certifi==2017.7.27.1
chardet==3.0.4
colander==1.3.3
configparser==3.5.0
cornice==2.4.0
cymem==1.31.2
cytoolz==0.8.2
datastuffpy==0.4.5
dill==0.2.7.1
docopt==0.6.2
elasticsearch==5.4.0
elasticsearch-dsl==5.3.0
email-reply-parser==0.5.9
fire==0.1.2
future==0.16.0
gapic-google-cloud-datastore-v1==0.15.3
gapic-google-cloud-error-reporting-v1beta1==0.15.3
gapic-google-cloud-logging-v2==0.91.3
gapic-google-cloud-pubsub-v1==0.15.4
gapic-google-cloud-spanner-admin-database-v1==0.15.3
gapic-google-cloud-spanner-admin-instance-v1==0.15.3
gapic-google-cloud-spanner-v1==0.15.3
gensim==3.0.0
google-api-python-client==1.6.4
google-auth==1.1.1
google-cloud==0.27.0
google-cloud-bigquery==0.26.0
google-cloud-bigtable==0.26.0
google-cloud-core==0.26.0
google-cloud-datastore==1.2.0
google-cloud-dns==0.26.0
google-cloud-error-reporting==0.26.0
google-cloud-language==0.27.0
google-cloud-logging==1.2.0
google-cloud-monitoring==0.26.0
google-cloud-pubsub==0.27.0
google-cloud-resource-manager==0.26.0
google-cloud-runtimeconfig==0.26.0
google-cloud-spanner==0.26.0
google-cloud-speech==0.28.0
google-cloud-storage==1.3.2
google-cloud-translate==1.1.0
google-cloud-videointelligence==0.25.0
google-cloud-vision==0.26.0
google-gax==0.15.15
google-resumable-media==0.2.3
googleapis-common-protos==1.5.3
grpc-google-iam-v1==0.11.4
grpcio==1.6.3
httplib2==0.10.3
hupper==1.0
idna==2.6
iso8601==0.1.11
lightgbm==2.0.7
Mako==1.0.7
MarkupSafe==1.0
marshmallow==2.13.6
monotonic==1.3
murmurhash==0.26.4
nextiva.nlp==0.2.91
nltk==3.2.5
numpy==1.13.3
oauth2client==3.0.0
pandas==0.20.3
PasteDeploy==1.5.2
pathlib==1.0.1
pkg-resources==0.0.0
plac==0.9.6
ply==3.8
preshed==1.0.0
proto-google-cloud-datastore-v1==0.90.4
proto-google-cloud-error-reporting-v1beta1==0.15.3
proto-google-cloud-logging-v2==0.91.3
proto-google-cloud-pubsub-v1==0.15.4
proto-google-cloud-spanner-admin-database-v1==0.15.3
proto-google-cloud-spanner-admin-instance-v1==0.15.3
proto-google-cloud-spanner-v1==0.15.3
protobuf==3.4.0
psycopg2==2.7.3.1
pyasn1==0.3.7
pyasn1-modules==0.1.4
pyramid==1.8.3
python-dateutil==2.6.1
python-editor==1.0.3
pytz==2017.2
raven==6.2.1
regex==2017.4.5
repoze.lru==0.7
requests==2.18.4
rsa==3.4.2
scikit-learn==0.18
scipy==0.19.1
simplejson==3.11.1
six==1.11.0
sklearn==0.0
smart-open==1.5.3
sox==1.3.0
spacy==1.7.5
SQLAlchemy==1.1.14
tenacity==4.4.0
termcolor==1.1.0
thinc==6.5.2
tinydb==3.6.0
toolz==0.8.2
tqdm==4.19.2
translationstring==1.3
ujson==1.35
uritemplate==3.0.0
urllib3==1.22
venusian==1.1.0
waitress==1.0.2
WebOb==1.7.3
wrapt==1.10.11
zope.deprecation==4.3.0
zope.interface==4.4.3

An almost identical virtualenv on my Mac doesn't raise this error.

The error I'm getting is on Debian:

.env/lib/python3.5/site-packages/datastuffpy/storages/google/bucket.py:88: in download
    blob.download_to_file(file_obj)
.env/lib/python3.5/site-packages/google/cloud/storage/blob.py:464: in download_to_file
    self._do_download(transport, file_obj, download_url, headers)
.env/lib/python3.5/site-packages/google/cloud/storage/blob.py:418: in _do_download
    download.consume(transport)
.env/lib/python3.5/site-packages/google/resumable_media/requests/download.py:101: in consume
    self._write_to_stream(result)
.env/lib/python3.5/site-packages/google/resumable_media/requests/download.py:62: in _write_to_stream
    with response:
E   AttributeError: __exit__

Any advice would be appreciated.

dhermes commented 7 years ago

@naoko Run pip show requests to make sure that is the version of requests you think it is.

naoko commented 7 years ago

@dhermes, thank you for your quick response. It seems that the version matches what pip freeze shows... :(

+ pip show requests
Name: requests
Version: 2.18.4
Summary: Python HTTP for Humans.
Home-page: http://python-requests.org
Author: Kenneth Reitz
Author-email: me@kennethreitz.org
License: Apache 2.0
Location: /opt/jenkins/workspace/ML-API-Test/.env/lib/python3.5/site-packages
Requires: urllib3, idna, chardet, certifi

dhermes commented 7 years ago

Is the Location (/opt/jenkins/workspace/ML-API-Test/.env/lib/python3.5/site-packages) the same as .env/lib/python3.5/site-packages in your stacktrace? (The one in the stacktrace was a relative path, that's why I ask.)

On the breaking machine, can you do the following:

$ /opt/jenkins/workspace/ML-API-Test/.env/bin/python3.5
>>> import requests
>>> requests.__file__
'???'
>>> requests.__version__
'???'

naoko commented 7 years ago

For some reason, running that in Jenkins was really hard (it looks like it strips single quotes...)

$ command=( python3.5 -c $'import requests\nprint(requests.__file__)\nprint(requests.__version__)' )
$ "${command[@]}"

But the goal is to show whether the version of requests is really >= 2.18.0 in the actual running environment. So right above where I run the tests, I ran pip install -U requests, which installs into .env/lib/python3.5/site-packages, and the exception is raised from .env/lib/python3.5/site-packages/google/resumable_media/requests/download.py, so I have to conclude that the latest version is there... But I understand that I can only replicate this in one environment, and I should not waste any more of your time. Thank you @dhermes for your time. I will think of another way to run the Jenkins job.

+ pip install -U requests
Requirement already up-to-date: requests in ./.env/lib/python3.5/site-packages
Requirement already up-to-date: idna<2.7,>=2.5 in ./.env/lib/python3.5/site-packages (from requests)
Requirement already up-to-date: certifi>=2017.4.17 in ./.env/lib/python3.5/site-packages (from requests)
Requirement already up-to-date: urllib3<1.23,>=1.21.1 in ./.env/lib/python3.5/site-packages (from requests)
Requirement already up-to-date: chardet<3.1.0,>=3.0.2 in ./.env/lib/python3.5/site-packages (from requests)
.env/lib/python3.5/site-packages/google/resumable_media/requests/download.py:62: in _write_to_stream
    with response:
E   AttributeError: __exit__

id0Sch commented 6 years ago

I don't know if it will help, but I was able to solve the second issue (TypeError: request() got an unexpected keyword argument 'data') by initializing the client with my project name, like this:

client = storage.Client(PROJECT_NAME)  # without PROJECT_NAME it breaks
bucket = client.get_bucket(bucket_name)
blob = bucket.get_blob(filename)
return blob.download_as_string()

anthony-chaudhary commented 6 years ago

Hey there, I'm experiencing this issue too.

google-cloud-storage == 1.6
requests == 2.18.4

Error:

google\resumable_media\requests\download.py", line 117, in _write_to_stream
    with response:
AttributeError: __exit__

Removing the with statement on line 117 appears to work around it.

theacodes commented 6 years ago

@swirlingsand are you sure you have requests 2.18.4? Can you verify with import requests; print(requests.__version__)?

anthony-chaudhary commented 6 years ago

My apologies, I think this was something with my conda setup. pip show requests yielded 2.18, but running that print statement showed 2.14. Thanks for the help @jonparrott.
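A mismatch like this arises when `pip` and the running interpreter look at different site-packages directories. As a rough sketch of the check suggested earlier in the thread, the snippet below asks the interpreter itself where a module was imported from. `describe_module` is a hypothetical helper name, and it reads the module's `__version__` attribute, which `requests` provides but not every package does.

```python
import importlib
import sys


def describe_module(name):
    """Report where `name` is imported from and which version it claims."""
    module = importlib.import_module(name)
    return {
        "interpreter": sys.executable,
        "file": getattr(module, "__file__", None),
        "version": getattr(module, "__version__", None),
    }


try:
    print(describe_module("requests"))
except ImportError:
    print("requests is not importable from", sys.executable)
```

Running this with the same interpreter that runs the failing job (for example the virtualenv's or conda environment's `python`) shows the version that is actually loaded, which may differ from what `pip show` reports.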

theacodes commented 6 years ago

No worries!

kparaju commented 6 years ago

I had to uninstall requests 2.18.4 (pip uninstall requests) and install 2.18.0 (pip install requests==2.18.0) to make it work.

mmas commented 6 years ago

I also downgraded google-cloud-storage from 1.8.0 to 1.6.0. So:

google-cloud-storage==1.6.0
requests==2.18.0

jocieA commented 6 years ago

Hi all, any idea how to make this work on Cloud Composer? I am migrating DAGs from our current environment to Cloud Composer; however, I get this error even when I install google-cloud-storage==1.6.0 and requests==2.18.0 as PyPI packages.

[2018-08-20 09:32:09,128] {base_task_runner.py:98} INFO - Subtask: [2018-08-20 09:32:09,127] {bash_operator.py:101} INFO - from google.cloud import bigquery
[2018-08-20 09:32:09,129] {base_task_runner.py:98} INFO - Subtask: [2018-08-20 09:32:09,129] {bash_operator.py:101} INFO - File "/usr/local/lib/python2.7/site-packages/google/cloud/bigquery/__init__.py", line 32, in <module>
[2018-08-20 09:32:09,130] {base_task_runner.py:98} INFO - Subtask: [2018-08-20 09:32:09,130] {bash_operator.py:101} INFO - from google.cloud.bigquery.client import Client
[2018-08-20 09:32:09,131] {base_task_runner.py:98} INFO - Subtask: [2018-08-20 09:32:09,131] {bash_operator.py:101} INFO - File "/usr/local/lib/python2.7/site-packages/google/cloud/bigquery/client.py", line 20, in <module>
[2018-08-20 09:32:09,132] {base_task_runner.py:98} INFO - Subtask: [2018-08-20 09:32:09,132] {bash_operator.py:101} INFO - from google.cloud.bigquery.dataset import Dataset
[2018-08-20 09:32:09,133] {base_task_runner.py:98} INFO - Subtask: [2018-08-20 09:32:09,133] {bash_operator.py:101} INFO - File "/usr/local/lib/python2.7/site-packages/google/cloud/bigquery/dataset.py", line 20, in <module>
[2018-08-20 09:32:09,134] {base_task_runner.py:98} INFO - Subtask: [2018-08-20 09:32:09,134] {bash_operator.py:101} INFO - from google.cloud.bigquery.table import Table
[2018-08-20 09:32:09,140] {base_task_runner.py:98} INFO - Subtask: [2018-08-20 09:32:09,137] {bash_operator.py:101} INFO - File "/usr/local/lib/python2.7/site-packages/google/cloud/bigquery/table.py", line 27, in <module>
[2018-08-20 09:32:09,141] {base_task_runner.py:98} INFO - Subtask: [2018-08-20 09:32:09,138] {bash_operator.py:101} INFO - from google.cloud.exceptions import make_exception
[2018-08-20 09:32:09,142] {base_task_runner.py:98} INFO - Subtask: [2018-08-20 09:32:09,138] {bash_operator.py:101} INFO - ImportError: cannot import name make_exception
[2018-08-20 09:32:09,733] {base_task_runner.py:98} INFO - Subtask: [2018-08-20 09:32:09,730] {bash_operator.py:105} INFO - Command exited with return code 1

sonlac commented 6 years ago

Hi @jocieA, judging from your logs, I assume your problem is not in google-cloud-storage==1.6.0 but rather in google-cloud-bigquery. You can run pip freeze to see which bigquery version is installed. I don't work with the Cloud Composer service, but I am very familiar with Airflow DAGs; your task (using the Airflow BashOperator) should work correctly "out-of-box". It should also work when run from the command line, outside the context of Airflow/Cloud Composer. You just need to debug your own Airflow task/program.

Concretely, your logs show that the error ImportError: cannot import name make_exception occurs on line 27 of google/cloud/bigquery/table.py. Could your installed bigquery module be incompatible?

Hope this helps. Thanks.

sjungwirth commented 6 years ago

For anyone else running into this issue with Google Cloud Composer, I ran into it after adding apache-beam[gcp]==2.6.0 to my Composer dependencies.

The issue here was that apache-beam installs google-cloud-bigquery==0.25.0, which causes a bunch of other packages to be downgraded.

The fix for me was to explicitly list google-cloud-core>=0.28.0 and google-cloud-bigquery>=1.5.0 AFTER apache-beam[gcp]==2.6.0 in the requirements.txt file used to specify Composer dependencies: https://cloud.google.com/sdk/gcloud/reference/composer/environments/update

gcloud composer environments update ENVIRONMENT --location=LOCATION \
    --update-pypi-packages-from-file requirements.txt

jocieA commented 6 years ago

Hi @sjungwirth and @sonlac,

Thanks for getting back to me on my post; your replies have been very useful. I recently tried running my pipeline on the latest version of Cloud Composer (1.1.0), and it worked without specifying versions of google-cloud-core, BigQuery, Cloud Storage, requests, etc. I think some improvements have been made to address this issue. I'm now looking forward to upgrading my existing environment to the new version.

harinuk224469 commented 5 years ago

Is this issue resolved? What is the solution?