googleapis / google-cloud-python

Google Cloud Client Library for Python
https://googleapis.github.io/google-cloud-python/
Apache License 2.0

Support Cloud Client Libraries on Google App Engine standard #1893

Closed: gpopovic closed this issue 6 years ago

gpopovic commented 8 years ago

I'm able to deploy it but I keep getting this error: "DistributionNotFound: The 'gcloud' distribution was not found and is required by the application"

enzogro commented 7 years ago

Hi, I am having a problem using the library. I need to use the development server (dev_appserver) on my local machine, but I have not found a way to import google-cloud and always get the same error: ImportError: No module named google.cloud.datastore. I've placed it in the "lib" folder like all the other third-party libraries. Is there any way to use it for development? Thanks. This is the stacktrace:

ERROR    2016-10-26 11:24:19,880 wsgi.py:263] 
Traceback (most recent call last):
  File "/home/enzo/TRABAJO/gcloud-sdk/platform/google_appengine/google/appengine/runtime/wsgi.py", line 240, in Handle
    handler = _config_handle.add_wsgi_middleware(self._LoadHandler())
  File "/home/enzo/TRABAJO/gcloud-sdk/platform/google_appengine/google/appengine/runtime/wsgi.py", line 299, in _LoadHandler
    handler, path, err = LoadObject(self._handler)
  File "/home/enzo/TRABAJO/gcloud-sdk/platform/google_appengine/google/appengine/runtime/wsgi.py", line 85, in LoadObject
    obj = __import__(path[0])
  File "/home/enzo/TRABAJO/booking/sources/service/main.py", line 9, in <module>
    from blueprint.booking import booking
  File "/home/enzo/TRABAJO/booking/sources/service/blueprint/booking.py", line 11, in <module>
    from utilities.db import getConnector
  File "/home/enzo/TRABAJO/booking/sources/service/blueprint/utilities/db.py", line 3, in <module>
    from google.cloud import datastore
  File "/home/enzo/TRABAJO/gcloud-sdk/platform/google_appengine/google/appengine/tools/devappserver2/python/sandbox.py", line 999, in load_module
    raise ImportError('No module named %s' % fullname)
ImportError: No module named google.cloud.datastore
dhermes commented 7 years ago

The google namespace is the issue. Did you set this up using the GAE vendor tool, or manually?

enzogro commented 7 years ago

I did it with the GAE vendor tool. This is my appengine_config.py:

from google.appengine.ext import vendor
import os
# insert `lib` as a site directory so our `main` module can load
# third-party libraries, and override built-ins with newer
# versions.
vendor.add('lib')

if os.environ.get('SERVER_SOFTWARE', '').startswith('Development'):
    import imp
    import os.path
    from google.appengine.tools.devappserver2.python import sandbox

    sandbox._WHITE_LIST_C_MODULES += ['_ssl', '_socket']
    # Use the system socket.
    psocket = os.path.join(os.path.dirname(os.__file__), 'socket.py')
    imp.load_source('socket', psocket)
dhermes commented 7 years ago

@jonparrott Does vendor also patch namespaces?

theacodes commented 7 years ago

@dhermes vendor uses addsitedir which processes .pth files.
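Concretely, site.addsitedir is what makes the vendored google namespace package importable: it appends the directory to sys.path and then executes any .pth files it finds there. A minimal sketch of the mechanism (an approximation for illustration, not the actual vendor source; the add_vendor_dir helper and the lib folder name are assumptions):

import os
import site

def add_vendor_dir(folder):
    # Resolve the folder that pip installed into (e.g. 'lib').
    path = os.path.abspath(folder)
    # addsitedir appends `path` to sys.path and processes any *.pth files
    # found in it, including any *-nspkg.pth files that packages use to
    # wire themselves into the `google` namespace package.
    site.addsitedir(path)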

dhermes commented 7 years ago

I don't see any usage of addsitedir in the appengine_config.py above

theacodes commented 7 years ago

@dhermes source for vendor

erlichmen commented 7 years ago

I can repro the pwd issue when trying to init a pubsub client:

Relevant stacktrace:

  File "/callback.py", line 15, in callback
    client = pubsub.Client()
  File "/venv/lib/python2.7/site-packages/google/cloud/pubsub/client.py", line 74, in __init__
    super(Client, self).__init__(project, credentials, http)
  File "/venv/lib/python2.7/site-packages/google/cloud/client.py", line 162, in __init__
    _ClientProjectMixin.__init__(self, project=project)
  File "/venv/lib/python2.7/site-packages/google/cloud/client.py", line 118, in __init__
    project = self._determine_default(project)
  File "/venv/lib/python2.7/site-packages/google/cloud/client.py", line 131, in _determine_default
    return _determine_default_project(project)
  File "/venv/lib/python2.7/site-packages/google/cloud/_helpers.py", line 180, in _determine_default_project
    _, project = google.auth.default()
  File "/venv/lib/python2.7/site-packages/google/auth/_default.py", line 277, in default
    credentials, project_id = checker()
  File "/venv/lib/python2.7/site-packages/google/auth/_default.py", line 111, in _get_gcloud_sdk_credentials
    _cloud_sdk.get_application_default_credentials_path())
  File "/venv/lib/python2.7/site-packages/google/auth/_cloud_sdk.py", line 79, in get_application_default_credentials_path
    config_path = get_config_path()
  File "/venv/lib/python2.7/site-packages/google/auth/_cloud_sdk.py", line 56, in get_config_path
    os.path.expanduser('~'), '.config', _CONFIG_DIRECTORY)
  File "/venv/lib/python2.7/posixpath.py", line 261, in expanduser
    import pwd
  File "/Users/erlichmen/google-cloud-sdk/platform/google_appengine/google/appengine/tools/devappserver2/python/sandbox.py", line 963, in load_module
    raise ImportError('No module named %s' % fullname)
ImportError: No module named pwd
dhermes commented 7 years ago

@jonparrott This is from google-auth. Did we not account for this?

theacodes commented 7 years ago

@dhermes I didn't think of it because the os.path.expanduser issues should now be fixed both in production and with the cloud sdk. @erlichmen can you tell me which version of the App Engine SDK you're using?

erlichmen commented 7 years ago

@jonparrott

gcloud version

Google Cloud SDK 137.0.1
alpha 2016.01.12
app-engine-java 1.9.46
app-engine-python 1.9.40
app-engine-python-extras 1.9.40
beta 2016.01.12
bq 2.0.24
bq-nix 2.0.24
cloud-datastore-emulator 1.2.1
core 2016.12.08
core-nix 2016.11.07
gcd-emulator v1beta3-1.0.0
gcloud
gsutil 4.22
gsutil-nix 4.18
kubectl
kubectl-darwin-x86_64 1.4.6
pubsub-emulator 2016.08.19

Also:

gcloud components update
All components are up to date.

theacodes commented 7 years ago

Apparently that fix did not yet make it into dev_appserver. I anticipate it'll be in the next release.

In the meantime @erlichmen, you can use a workaround in appengine_config.py:


import os

# Stub out expanduser: on the dev_appserver, posixpath.expanduser is what
# triggers the blocked `import pwd` shown in the traceback above.
os.path.expanduser = lambda path: path
erlichmen commented 7 years ago

@jonparrott I solved it by putting:

env_variables:
  CLOUDSDK_CONFIG: /

in the app.yaml
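That presumably works because google-auth's get_config_path checks the CLOUDSDK_CONFIG environment variable before falling back to os.path.expanduser('~'), so the blocked pwd import is never reached. Roughly (a sketch of the idea, not the actual google-auth source):

import os

_CONFIG_DIRECTORY = 'gcloud'

def get_config_path():
    # If CLOUDSDK_CONFIG is set (for example via env_variables in app.yaml),
    # use it directly and skip the home-directory lookup.
    env_path = os.environ.get('CLOUDSDK_CONFIG')
    if env_path:
        return env_path
    # The fallback is what ends up calling expanduser and importing pwd
    # inside the dev_appserver sandbox.
    return os.path.join(os.path.expanduser('~'), '.config', _CONFIG_DIRECTORY)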

theacodes commented 7 years ago

@erlichmen nice!

speedplane commented 7 years ago

I believe I just came across this bug and opened an SO question here. Is there any documentation on using the Google Cloud Library with the App Engine SDK? This may become a much more common use case as more APIs get transitioned to the REST interface.

daspecster commented 7 years ago

I'm not sure exactly how you're setting up your application, but this might be of some use. https://github.com/GoogleCloudPlatform/python-docs-samples/tree/master/appengine/flexible/datastore

speedplane commented 7 years ago

@daspecster Thanks, I've seen that documentation, but that isn't quite right. The documentation you sent is for setting up Google Cloud Library on Google App Engine Flexible without the compat runtimes. The link you sent does not use the App Engine SDKs at all, so there is no naming conflict issue.

Rather, I would like to set up the Google Cloud Library on Google App Engine Standard, or alternatively to use it on Google App Engine Flexible with the python-compat runtime. Both of these require using the Google App Engine SDK.

dhermes commented 7 years ago

@speedplane Have you seen https://cloud.google.com/appengine/docs/python/tools/using-libraries-python-27#installing_a_third-party_library?

dhermes commented 7 years ago

You may also need some wrangling of google.__path__, but I think vendor handles that. @jonparrott (the author of vendor, who also works on GAE) may contradict me.
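For reference, the manual wrangling usually looks something like this in appengine_config.py (a sketch of the common workaround; whether it is actually needed depends on the SDK version, and the lib folder name is an assumption):

import os

import google

# The App Engine SDK ships its own `google` package, which can shadow the
# vendored google.cloud.* packages; extending google.__path__ makes both
# importable.
google.__path__.append(os.path.join(os.path.dirname(__file__), 'lib', 'google'))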

speedplane commented 7 years ago

Yes, my SO question was answered, and it looks like vendor needs to be used. I have not yet tested it, but I will in the next few days. I wonder if these instructions can be added to the README.md?

dhermes commented 7 years ago

Good to hear

daspecster commented 7 years ago

It could be interesting to make a guide. But it might be better to put it in the examples that @jonparrott has?

chmoder commented 7 years ago

Hi, I am trying to use the google-cloud-python client to improve the performance of our BigQuery queries in our App Engine standard app.

When I follow the documentation at https://cloud.google.com/appengine/docs/python/tools/using-libraries-python-27#installing_a_third-party_library to install https://googlecloudplatform.github.io/google-cloud-python/, I get the error below when trying to import bigquery.

Using this instead of the Google APIs Python Client (https://developers.google.com/api-client-library/python/) could reduce our load times greatly, so any help would be appreciated.

working directory

$ ls -l -1d app.yaml appengine_config.py lib
-rw-r--r--   1 tcross tcross   805 Jan  5 12:55 appengine_config.py
-rw-r--r--   1 tcross tcross  5325 Jan  5 09:58 app.yaml
drwxr-xr-x 100 tcross tcross 12288 Jan  5 12:51 lib

install google cloud python

pip install --upgrade -t lib google-cloud

appengine_config.py

from google.appengine.ext import vendor

vendor.add('lib')

Import library

from google.cloud import bigquery

runtime error

ERROR    2017-01-05 18:54:11,797 wsgi.py:263] 
Traceback (most recent call last):
  File "/home/tcross/Downloads/google_appengine/google/appengine/runtime/wsgi.py", line 240, in Handle
    handler = _config_handle.add_wsgi_middleware(self._LoadHandler())
  File "/home/tcross/Downloads/google_appengine/google/appengine/runtime/wsgi.py", line 299, in _LoadHandler
    handler, path, err = LoadObject(self._handler)
  File "/home/tcross/Downloads/google_appengine/google/appengine/runtime/wsgi.py", line 85, in LoadObject
    obj = __import__(path[0])
  File "/home/tcross/development/csgapi/portals.py", line 6, in <module>
    from base import BaseHandler, EntityHandler, CollectionHandler
  File "/home/tcross/development/csgapi/base.py", line 12, in <module>
    from controllers.quote_history.quote_history_module import QuoteHistoryModule
  File "/home/tcross/development/csgapi/controllers/quote_history/quote_history_module.py", line 2, in <module>
    from google.cloud import bigquery
  File "/home/tcross/development/csgapi/lib/google/cloud/bigquery/__init__.py", line 26, in <module>
    from google.cloud.bigquery._helpers import ArrayQueryParameter
  File "/home/tcross/development/csgapi/lib/google/cloud/bigquery/_helpers.py", line 21, in <module>
    from google.cloud._helpers import _date_from_iso8601_date
  File "/home/tcross/Downloads/google_appengine/google/appengine/tools/devappserver2/python/sandbox.py", line 1001, in load_module
    raise ImportError('No module named %s' % fullname)
ImportError: No module named google.cloud._helpers
daspecster commented 7 years ago

@jonparrott @dhermes is the namespace getting overwritten somehow?

layoaster commented 7 years ago

@chmoder This works in production. On the App Engine dev server you can use a virtualenv to avoid PATH/library conflicts.

chmoder commented 7 years ago

Thanks for testing and the tip @layoaster

theacodes commented 7 years ago

Slight update: I'm working with @omaray to prepare some preliminary recommendations for using this library on App Engine standard. As several of you have discovered, there are many edge cases due to the way App Engine handles third-party libraries and the google namespace.

speedplane commented 7 years ago

@jonparrott One issue I ran into that you may want to consider: most of this library works through a REST interface (or at least an HTTP interface). App Engine Standard uses urlfetch for HTTP access which is not very performant. I got the datastore API working but it was incredibly slow. I believe this was because urlfetch is not as efficient as maintaining an open socket connection.

theacodes commented 7 years ago

@speedplane we're aware that urlfetch is not the ideal transport. We're looking into possibilities of alternative transports.

layoaster commented 7 years ago

@speedplane on App Engine Standard you can interact with the Datastore via the NDB library. Why do you need to use the (REST) Datastore API?

speedplane commented 7 years ago

@layoaster I'm aware of the other APIs for accessing the datastore. I have modules that operate on App engine Standard and App Engine Flexible, and they use different APIs, making code maintenance and testing more painful. Would be nice if Google could provide a single API that works everywhere for the datastore.

matheuspatury commented 7 years ago

@jonparrott any updates on this? This library has much better support for Storage and BigQuery than any other library available at the moment...

theacodes commented 7 years ago

Storage and BigQuery should work on app engine. Let me know if you have installation issues.


nilleb commented 7 years ago

Are Google Cloud Endpoints compatible with the Cloud SDK?

  File "/Users/nilleb/dev/project/app/server/libs/google_endpoints/endpoints/apiserving.py", line 74, in <module>
    from google.api.control import client as control_client
ImportError: No module named control

appengine_config.py:

    vendor.add('server/libs/google-cloud-sdk')
    vendor.add('server/libs/google_endpoints')
theacodes commented 7 years ago

@nilleb it's a different library, but it should work. Your call to vendor.add doesn't seem right: you should just point it at the top-level server/libs folder (or whichever folder you specified when running pip install -t {folder}).
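In other words, assuming everything was installed with pip install -t server/libs ..., the config would be expected to look more like:

from google.appengine.ext import vendor

# Point vendor at the single directory pip installed into, rather than at
# individual packages inside it.
vendor.add('server/libs')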

nilleb commented 7 years ago

@jonparrott the vendor.add() calls are right, even if they look strange.

In fact, I was trying to import google.cloud.spanner:

  File "/Users/nilleb/dev/project/app/server/libs/google_endpoints/google/cloud/spanner/__init__.py", line 18, in <module>
    from google.cloud.spanner.client import Client
  File "/Users/nilleb/dev/project/app/server/libs/google_endpoints/google/cloud/spanner/client.py", line 28, in <module>
    from google.gax import INITIAL_PAGE
  File "/Users/nilleb/dev/project/app/server/libs/google_endpoints/google/gax/__init__.py", line 35, in <module>
    import multiprocessing as mp
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/__init__.py", line 65, in <module>
    from multiprocessing.util import SUBDEBUG, SUBWARNING
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/util.py", line 41, in <module>
    from subprocess import _args_from_interpreter_flags
ImportError: cannot import name _args_from_interpreter_flags

This is a known bug (Reference: https://github.com/googleapis/gax-python/issues/149)

If you try to get past this error with this dirty hack:

import sys
import Queue as _queue  # Python 2 standard library queue module
from types import ModuleType

class DummyProcess(object):
    def __init__(self, target):
        self._target = target

    def start(self):
        # Run the target synchronously instead of forking a process.
        self._target()

class DummyProcessing(ModuleType):
    def __init__(self):
        pass

    @staticmethod
    def Process(target):
        return DummyProcess(target)

    @staticmethod
    def Queue():
        return _queue.Queue()

sys.modules['multiprocessing'] = DummyProcessing

You just fail with another error:

  File "/Users/nilleb/dev/project/app/server/libs/google_endpoints/dill/__init__.py", line 27, in <module>
    from .dill import dump, dumps, load, loads, dump_session, load_session, \
  File "/Users/nilleb/dev/project/app/server/libs/google_endpoints/dill/dill.py", line 68, in <module>
    import __main__ as _main_module
ImportError: Cannot re-init internal module __main__

You can work around this one with:

import pickle
import sys

sys.modules['dill'] = pickle

So, if you've installed your google-cloud-spanner module with a command line similar to:

pip install --upgrade -t . google-cloud google-cloud-core==0.23.0 google-cloud-spanner

You will get a working google-cloud-spanner in a GAE instance :-)

theacodes commented 7 years ago

https://github.com/googleapis/gax-python/issues/149

@nilleb are you saying that spanner works after this?

nilleb commented 7 years ago

@jonparrott using the hack above, on my local development server, I am able to create databases and insert data. Not tested in production (just a POC).

theacodes commented 7 years ago

@nilleb I see, it definitely won't work in production. I'm surprised that it works locally.

nilleb commented 7 years ago

@jonparrott do you have an idea of when it will be ready for production?


theacodes commented 7 years ago

I can't speak to timelines, but it's on our radar.

tbrent commented 7 years ago

Pretty excited to use Spanner in production, but the lack of client library support in the App Engine standard environment is blocking.

mentat commented 7 years ago

I'm having a similar problem using the cloud vision api in GAE standard:

Traceback (most recent call last):
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/runtime/wsgi.py", line 240, in Handle
    handler = _config_handle.add_wsgi_middleware(self._LoadHandler())
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/runtime/wsgi.py", line 299, in _LoadHandler
    handler, path, err = LoadObject(self._handler)
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/runtime/wsgi.py", line 96, in LoadObject
    __import__(cumulative_path)
  File "/base/data/home/apps/s~SOMETHING/20170405t102043.400345193010976560/core/main.py", line 9, in <module>
    from api import * # noqa
  File "/base/data/home/apps/s~SOMETHING/20170405t102043.400345193010976560/core/api.py", line 8, in <module>
    from google.cloud import vision
  File "/base/data/home/apps/s~SOMETHING/20170405t102043.400345193010976560/vendor/google/cloud/vision/__init__.py", line 21, in <module>
    from google.cloud.vision.client import Client
  File "/base/data/home/apps/s~SOMETHING/20170405t102043.400345193010976560/vendor/google/cloud/vision/client.py", line 22, in <module>
    from google.cloud.vision._gax import _GAPICVisionAPI
  File "/base/data/home/apps/s~SOMETHING/20170405t102043.400345193010976560/vendor/google/cloud/vision/_gax.py", line 17, in <module>
    from google.cloud.gapic.vision.v1 import image_annotator_client
  File "/base/data/home/apps/s~SOMETHING/20170405t102043.400345193010976560/vendor/google/cloud/gapic/vision/v1/image_annotator_client.py", line 31, in <module>
    from google.gax import api_callable
  File "/base/data/home/apps/s~SOMETHING/20170405t102043.400345193010976560/vendor/google/gax/__init__.py", line 39, in <module>
    from grpc import RpcError, StatusCode
  File "/base/data/home/apps/s~SOMETHING/20170405t102043.400345193010976560/vendor/grpc/__init__.py", line 37, in <module>
    from grpc._cython import cygrpc as _cygrpc
ImportError: dynamic module does not define init function (initcygrpc)
theacodes commented 7 years ago

@lukesneeringer this is still an open (and important) issue.

anarasimham commented 7 years ago

Thank you for this thread. I've been trying for a few days to execute a batch job on Google Dataflow using the Python API with Beam. It kept giving me an error similar to the ones above:

  (3e59e0af2f2af432): Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 705, in run
    self._load_main_session(self.local_staging_directory)
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 445, in _load_main_session
    pickler.load_session(session_file)
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/internal/pickler.py", line 247, in load_session
    return dill.load_session(file_path)
  File "/usr/local/lib/python2.7/dist-packages/dill/dill.py", line 363, in load_session
    module = unpickler.load()
  File "/usr/lib/python2.7/pickle.py", line 858, in load
    dispatch[key](self)
  File "/usr/lib/python2.7/pickle.py", line 1133, in load_reduce
    value = func(*args)
  File "/usr/local/lib/python2.7/dist-packages/dill/dill.py", line 767, in _import_module
    return getattr(__import__(module, None, None, [obj]), obj)
  File "/usr/local/lib/python2.7/dist-packages/google/cloud/storage/__init__.py", line 38, in <module>
    from google.cloud.storage.blob import Blob
  File "/usr/local/lib/python2.7/dist-packages/google/cloud/storage/blob.py", line 42, in <module>
    from google.cloud.iam import Policy
ImportError: No module named iam

Setting a hard version for google-cloud-core in my setup.py, as someone noted earlier in the thread, did the trick: ...,'google-cloud-core==0.24.1',...
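For concreteness, that pin sits in the pipeline's setup.py roughly like this (the package name and the other requirements here are placeholders, not from this thread):

import setuptools

setuptools.setup(
    name='my-dataflow-pipeline',  # placeholder
    version='0.1.0',
    packages=setuptools.find_packages(),
    install_requires=[
        'apache-beam[gcp]',
        # The pin that resolved the "No module named iam" error above.
        'google-cloud-core==0.24.1',
    ],
)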

xiuliren commented 7 years ago

I got a similar error (Python 2.7):

In [1]: from neuroglancer.pipeline.volumes import gcloudvolume                                                                                                 
---------------------------------------------------------------------------
DistributionNotFound                      Traceback (most recent call last)    
<ipython-input-1-2fe83029a6c1> in <module>()                                   
----> 1 from neuroglancer.pipeline.volumes import gcloudvolume

/usr/people/jingpeng/workspace/neuroglancer/python/neuroglancer/pipeline/__init__.py in <module>()
      1 from neuroglancer._mesher import Mesher                                                                                                                
----> 2 from storage import Storage                                            
      3 from precomputed import Precomputed, EmptyVolumeException              
      4 from task_queue import TaskQueue, RegisteredTask                       
      5 from tasks import *                                                    

/usr/people/jingpeng/workspace/neuroglancer/python/neuroglancer/pipeline/storage.py in <module>()
      8 from glob import glob                                                                                                                                  
      9 import google.cloud.exceptions                                                                                                                         
---> 10 from google.cloud.storage import Client                                
     11 import boto                                                                                                                                            
     12 from boto.s3.connection import S3Connection

/usr/people/jingpeng/workspace/neuroglancer/python/jingpengw/lib/python2.7/site-packages/google/cloud/storage/__init__.py in <module>()
     33                                                                                                                                                        
     34 from pkg_resources import get_distribution                                                                                                             
---> 35 __version__ = get_distribution('google-cloud-storage').version
     36                                                                        
     37 from google.cloud.storage.batch import Batch

/usr/people/jingpeng/lib/anaconda2/lib/python2.7/site-packages/setuptools-27.2.0-py2.7.egg/pkg_resources/__init__.py in get_distribution(dist)
    555         dist = Requirement.parse(dist)
    556     if isinstance(dist, Requirement):
--> 557         dist = get_provider(dist)                                      
    558     if not isinstance(dist, Distribution):                             
    559         raise TypeError("Expected string, Requirement, or Distribution", dist)

/usr/people/jingpeng/lib/anaconda2/lib/python2.7/site-packages/setuptools-27.2.0-py2.7.egg/pkg_resources/__init__.py in get_provider(moduleOrReq)
    429     """Return an IResourceProvider for the named module or requirement"""
    430     if isinstance(moduleOrReq, Requirement):                                                                                                           
--> 431         return working_set.find(moduleOrReq) or require(str(moduleOrReq))[0]
    432     try:
    433         module = sys.modules[moduleOrReq]

/usr/people/jingpeng/lib/anaconda2/lib/python2.7/site-packages/setuptools-27.2.0-py2.7.egg/pkg_resources/__init__.py in require(self, *requirements)
    966         included, even if they were already activated in this working set.
    967         """
--> 968         needed = self.resolve(parse_requirements(requirements))        
    969            
    970         for dist in needed:    

/usr/people/jingpeng/lib/anaconda2/lib/python2.7/site-packages/setuptools-27.2.0-py2.7.egg/pkg_resources/__init__.py in resolve(self, requirements, env, installer, replace_conflicting)
    852                     if dist is None:                                   
    853                         requirers = required_by.get(req, None)         
--> 854                         raise DistributionNotFound(req, requirers)     
    855                 to_activate.append(dist)                               
    856             if dist not in req:

DistributionNotFound: The 'google-cloud-storage' distribution was not found and is required by the application
dhermes commented 7 years ago

@jingpengw That typically means the package metadata is missing, which is a sign that not all files were copied.
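A quick way to check is to ask pkg_resources directly (a diagnostic sketch, not part of this library):

import pkg_resources

try:
    dist = pkg_resources.get_distribution('google-cloud-storage')
    print('%s at %s' % (dist.version, dist.location))
except pkg_resources.DistributionNotFound:
    # The modules may still be importable, but the .egg-info / .dist-info
    # metadata that pkg_resources reads was not copied along with them.
    print('metadata missing; reinstall google-cloud-storage')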

xiuliren commented 7 years ago

@dhermes do you mean the metadata of google-cloud-python or of my local neuroglancer?

dhermes commented 7 years ago

In this instance I mean the metadata of google-cloud-storage (which is the package referenced in your stacktrace).

xiuliren commented 7 years ago

@dhermes thanks. I downloaded the code and reinstalled it with python setup.py install. It works fine now.