googleapis / google-cloud-python

Google Cloud Client Library for Python
https://googleapis.github.io/google-cloud-python/
Apache License 2.0

Async discoveryengine_v1 client not working #13256

Open francescov1 opened 2 weeks ago

francescov1 commented 2 weeks ago


Summary of the issue

Context: I am trying to use the DocumentServiceAsyncClient, but I am getting event loop errors. I believe the library is creating separate event loops.

Expected Behavior: I can use the async client without it creating event loops.

Actual Behavior: When I call document_client.import_documents(), I get:

RuntimeError: Task <Task pending name='Task-3' coro=<_wrap_awaitable() running at /Users/francescovirga/.pyenv/versions/3.11.7/lib/python3.11/asyncio/tasks.py:694> cb=[_release_waiter(<Future pendi...ask_wakeup()]>)() at /Users/francescovirga/.pyenv/versions/3.11.7/lib/python3.11/asyncio/tasks.py:431]> got Future <Task pending name='Task-2' coro=<UnaryUnaryCall._invoke() running at /Users/francescovirga/.local/share/virtualenvs/lighthouse-ai-chat-JRzDFnJw/lib/python3.11/site-packages/grpc/aio/_call.py:577>> attached to a different loop
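
For reference, the same class of error can be reproduced with the standard library alone. This is a minimal sketch, not the library's code: it only assumes that an awaitable was created on one event loop and is then awaited from the loop that `asyncio.run()` starts. The function names are made up for illustration.

```python
import asyncio
from typing import Optional


async def await_foreign(fut: "asyncio.Future[None]") -> None:
    # Awaiting a future that belongs to another loop is rejected by the task.
    await fut


def reproduce() -> Optional[str]:
    """Return the error message raised when loops are mixed, or None."""
    # A loop created outside asyncio.run(), standing in for "some other loop".
    other_loop = asyncio.new_event_loop()
    fut = other_loop.create_future()
    try:
        # asyncio.run() starts its own, different event loop.
        asyncio.run(await_foreign(fut))
    except RuntimeError as exc:
        return str(exc)
    finally:
        other_loop.close()
    return None
```

Calling `reproduce()` returns a message ending in "attached to a different loop", which matches the tail of the error above. Whether the client hits this for the same reason is an assumption on my part.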

API client name and version

No response

Reproduction steps: code

file: main.py

from google.cloud.discoveryengine_v1 import (
    DocumentServiceAsyncClient,
    ImportDocumentsRequest,
    GcsSource,
)
from src.config import Config
from google.oauth2 import service_account

LOCATION = "global"
DATA_STORE_ID = "document-search-datastore"

credentials = service_account.Credentials.from_service_account_file(
    "vertex-ai-service-account-key.json"
)

document_client = DocumentServiceAsyncClient(credentials=credentials)
DOCUMENT_CLIENT_BRANCH_PATH = document_client.branch_path(
    project=Config.GCP_PROJECT_ID,
    location=LOCATION,
    data_store=DATA_STORE_ID,
    branch="default_branch",
)

async def import_documents(gcs_uri: str):
    """
    gcs_uri should be the file path to the jsonl metadata file in GCS
    """

    request = ImportDocumentsRequest(
        parent=DOCUMENT_CLIENT_BRANCH_PATH,
        gcs_source=GcsSource(
            input_uris=[gcs_uri],
            # Unstructured documents with custom JSONL metadata
            data_schema="custom",
        ),
        reconciliation_mode=ImportDocumentsRequest.ReconciliationMode.INCREMENTAL,
    )

    await document_client.import_documents(request=request)

if __name__ == "__main__":
    import asyncio

    async def main():
        await import_documents("gs://sec_documents/development/metadata.jsonl")

    asyncio.run(main())
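
One thing that may be worth trying, sketched below with a stand-in class: construct the client inside the coroutine that `asyncio.run()` executes, rather than at module import time. This assumes the client captures an event loop when it is constructed, which nothing in this report confirms. `FakeAsyncClient` and the `gs://example-bucket/...` URI are hypothetical and only illustrate the ordering; the real `DocumentServiceAsyncClient` is not used here.

```python
import asyncio


class FakeAsyncClient:
    """Hypothetical stand-in for a client that binds to a loop at construction."""

    def __init__(self, loop: asyncio.AbstractEventLoop) -> None:
        self._loop = loop

    async def import_documents(self, uri: str) -> str:
        # The future lives on the loop captured in __init__.
        fut = self._loop.create_future()
        self._loop.call_soon(fut.set_result, f"imported {uri}")
        return await fut


async def main() -> str:
    # Construct the client only after asyncio.run() has started the loop,
    # so the captured loop and the running loop are the same object.
    client = FakeAsyncClient(asyncio.get_running_loop())
    return await client.import_documents("gs://example-bucket/metadata.jsonl")
```

With this ordering, `asyncio.run(main())` completes normally in the sketch. If the real client behaves the same way, moving `DocumentServiceAsyncClient(credentials=credentials)` into `main()` might avoid the error, but that is untested.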

Reproduction steps: supporting files

No response

Reproduction steps: actual results

terminal output

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/francescovirga/Projects/lighthouse-ai-chat/src/clients/google_discovery_engine/client.py", line 58, in <module>
    asyncio.run(main())
  File "/Users/francescovirga/.pyenv/versions/3.11.7/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/Users/francescovirga/.pyenv/versions/3.11.7/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/francescovirga/.pyenv/versions/3.11.7/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/Users/francescovirga/Projects/lighthouse-ai-chat/src/clients/google_discovery_engine/client.py", line 56, in main
    await import_documents("gs://sec_documents/development/metadata.jsonl")
  File "/Users/francescovirga/Projects/lighthouse-ai-chat/src/clients/google_discovery_engine/client.py", line 41, in import_documents
    operation = await document_client.import_documents(request=request)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/francescovirga/.local/share/virtualenvs/lighthouse-ai-chat-JRzDFnJw/lib/python3.11/site-packages/google/cloud/discoveryengine_v1/services/document_service/async_client.py", line 1010, in import_documents
    response = await rpc(
               ^^^^^^^^^^
  File "/Users/francescovirga/.local/share/virtualenvs/lighthouse-ai-chat-JRzDFnJw/lib/python3.11/site-packages/google/api_core/retry_async.py", line 223, in retry_wrapped_func
    return await retry_target(
           ^^^^^^^^^^^^^^^^^^^
  File "/Users/francescovirga/.local/share/virtualenvs/lighthouse-ai-chat-JRzDFnJw/lib/python3.11/site-packages/google/api_core/retry_async.py", line 121, in retry_target
    return await asyncio.wait_for(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/francescovirga/.pyenv/versions/3.11.7/lib/python3.11/asyncio/tasks.py", line 489, in wait_for
    return fut.result()
           ^^^^^^^^^^^^
  File "/Users/francescovirga/.pyenv/versions/3.11.7/lib/python3.11/asyncio/tasks.py", line 694, in _wrap_awaitable
    return (yield from awaitable.__await__())
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/francescovirga/.local/share/virtualenvs/lighthouse-ai-chat-JRzDFnJw/lib/python3.11/site-packages/google/api_core/grpc_helpers_async.py", line 85, in __await__
    response = yield from self._call.__await__()
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/francescovirga/.local/share/virtualenvs/lighthouse-ai-chat-JRzDFnJw/lib/python3.11/site-packages/grpc/aio/_call.py", line 308, in __await__
    response = yield from self._call_response
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Task <Task pending name='Task-3' coro=<_wrap_awaitable() running at /Users/francescovirga/.pyenv/versions/3.11.7/lib/python3.11/asyncio/tasks.py:694> cb=[_release_waiter(<Future pendi...ask_wakeup()]>)() at /Users/francescovirga/.pyenv/versions/3.11.7/lib/python3.11/asyncio/tasks.py:431]> got Future <Task pending name='Task-2' coro=<UnaryUnaryCall._invoke() running at /Users/francescovirga/.local/share/virtualenvs/lighthouse-ai-chat-JRzDFnJw/lib/python3.11/site-packages/grpc/aio/_call.py:577>> attached to a different loop

Reproduction steps: expected results

No error

OS & version + platform

MacOS 14.4

Python environment

Python 3.11.7

Python dependencies

Package Version
absl-py 2.1.0
aiofiles 23.2.1
aiohappyeyeballs 2.4.3
aiohttp 3.10.10
aiomysql 0.2.0
aiosignal 1.3.1
alembic 1.14.0
altair 5.4.1
aniso8601 9.0.1
annotated-types 0.7.0
anyio 4.6.2.post1
appnope 0.1.4
asn1crypto 1.5.1
astroid 3.3.5
asttokens 2.4.1
async-timeout 4.0.3
asyncmy 0.2.9
asyncpg 0.30.0
attrs 24.2.0
azure-core 1.32.0
azure-storage-blob 12.23.1
backoff 2.2.1
beautifulsoup4 4.12.3
black 24.10.0
blinker 1.8.2
boto3 1.35.43
botocore 1.35.43
bs4 0.0.2
CacheControl 0.14.0
cachetools 5.5.0
certifi 2024.8.30
cffi 1.17.1
cfgv 3.4.0
chardet 5.2.0
charset-normalizer 3.4.0
chex 0.1.87
click 8.1.7
cloudpickle 2.2.1
cloudscraper 1.2.71
cohere 5.11.3
comm 0.2.2
contourpy 1.3.0
cryptography 43.0.3
cycler 0.12.1
databricks-sdk 0.35.0
datasets 3.1.0
debugpy 1.8.6
decorator 5.1.1
Deprecated 1.2.14
dill 0.3.8
distlib 0.3.8
distro 1.9.0
dnspython 2.7.0
docker 7.1.0
docstring_parser 0.16
etils 1.10.0
eval_type_backport 0.2.0
execnet 2.1.1
executing 2.1.0
fake-useragent 1.5.1
fastapi 0.115.4
fastavro 1.9.7
filelock 3.16.1
firebase-admin 6.5.0
Flask 3.0.3
flax 0.10.1
fonttools 4.54.1
frozenlist 1.5.0
fsspec 2024.9.0
gcloud-aio-auth 5.3.2
gcloud-aio-storage 9.3.0
gitdb 4.0.11
GitPython 3.1.43
google-api-core 2.15.0
google-api-python-client 2.114.0
google-auth 2.26.2
google-auth-httplib2 0.2.0
google-cloud-aiplatform 1.71.1
google-cloud-bigquery 3.18.0
google-cloud-core 2.4.1
google-cloud-discoveryengine 0.13.3
google-cloud-firestore 2.14.0
google-cloud-resource-manager 1.12.3
google-cloud-storage 2.14.0
google-crc32c 1.5.0
google-pasta 0.2.0
google-resumable-media 2.7.0
googleapis-common-protos 1.62.0
graphene 3.3
graphql-core 3.2.5
graphql-relay 3.2.0
greenlet 3.1.1
grpc-google-iam-v1 0.13.0
grpcio 1.67.1
grpcio-status 1.62.3
grpcio-tools 1.62.3
gunicorn 23.0.0
h11 0.14.0
h2 4.1.0
hpack 4.0.0
html2text 2024.2.26
httpcore 1.0.6
httplib2 0.22.0
httpx 0.27.2
httpx-sse 0.4.0
huggingface-hub 0.26.2
humanize 4.11.0
hyperframe 6.0.1
identify 2.6.1
idna 3.10
importlib-metadata 6.11.0
importlib_resources 6.4.5
inflection 0.5.1
iniconfig 2.0.0
ipykernel 6.29.5
ipython 8.28.0
isodate 0.7.2
isort 5.13.2
itsdangerous 2.2.0
jax 0.4.35
jaxlib 0.4.35
jedi 0.19.1
Jinja2 3.1.4
jiter 0.7.0
jmespath 1.0.1
joblib 1.4.2
jsonpatch 1.33
jsonpointer 3.0.0
jsonschema 4.23.0
jsonschema-specifications 2024.10.1
jupyter_client 8.6.3
jupyter_core 5.7.2
kiwisolver 1.4.7
langchain 0.3.7
langchain-core 0.3.15
langchain-text-splitters 0.3.2
langsmith 0.1.140
lxml 5.3.0
Mako 1.3.6
Markdown 3.7
markdown-it-py 3.0.0
MarkupSafe 3.0.2
matplotlib 3.9.2
matplotlib-inline 0.1.7
mccabe 0.7.0
mdurl 0.1.2
ml_dtypes 0.5.0
mlflow 2.17.0
mlflow-skinny 2.17.0
mock 4.0.3
more-itertools 10.5.0
msgpack 1.1.0
multidict 6.1.0
multiprocess 0.70.16
mypy 1.11.2
mypy-extensions 1.0.0
mysql-connector-python 9.1.0
narwhals 1.13.2
Nasdaq-Data-Link 1.0.4
nest-asyncio 1.6.0
nodeenv 1.9.1
numpy 1.26.4
openai 1.54.2
opentelemetry-api 1.27.0
opentelemetry-sdk 1.27.0
opentelemetry-semantic-conventions 0.48b0
opt_einsum 3.4.0
optax 0.2.3
orbax-checkpoint 0.8.0
orjson 3.10.11
outcome 1.3.0.post0
packaging 24.1
pandas 2.2.3
pandas_ta 0.3.14b0
parameterized 0.9.0
parso 0.8.4
pathos 0.3.2
pathspec 0.12.1
pendulum 3.0.0
pexpect 4.9.0
pg8000 1.31.2
pgvector 0.3.6
pillow 10.4.0
pip 24.2
platformdirs 4.3.6
playwright 1.48.0
pluggy 1.5.0
portalocker 2.10.1
pox 0.3.5
ppft 1.7.6.9
pre_commit 4.0.1
prometheus_client 0.21.0
prompt_toolkit 3.0.48
propcache 0.2.0
proto-plus 1.25.0
protobuf 4.25.5
psutil 6.0.0
ptyprocess 0.7.0
pure_eval 0.2.3
pyarrow 18.0.0
pyasn1 0.6.1
pyasn1_modules 0.4.0
pycparser 2.22
pydantic 2.9.2
pydantic_core 2.23.4
pydeck 0.9.1
pydeps 1.12.20
pyee 12.0.0
Pygments 2.18.0
PyJWT 2.9.0
pylint 3.3.1
pymongo 4.10.1
PyMySQL 1.1.1
pyparsing 3.2.0
PySocks 1.7.1
pytest 8.3.3
pytest-asyncio 0.24.0
pytest-rerunfailures 14.0
pytest-xdist 3.6.1
python-dateutil 2.9.0.post0
python-dotenv 1.0.1
python-slugify 8.0.4
pytz 2024.2
PyYAML 6.0.2
pyzmq 26.2.0
qdrant-client 1.12.1
RapidFuzz 3.10.1
redis 5.2.0
referencing 0.35.1
regex 2024.9.11
requests 2.32.3
requests-toolbelt 1.0.0
rich 13.9.4
rpds-py 0.21.0
rsa 4.9
s3transfer 0.10.3
safetensors 0.4.5
sagemaker 2.232.2
sagemaker-core 1.0.10
sagemaker-mlflow 0.1.0
schema 0.7.7
scikit-learn 1.5.2
scipy 1.14.1
scramp 1.4.5
selenium 4.26.1
sentry-sdk 2.18.0
setuptools 75.3.0
shapely 2.0.6
shellingham 1.5.4
six 1.16.0
smdebug-rulesconfig 1.0.1
smmap 5.0.1
sniffio 1.3.1
sortedcontainers 2.4.0
soupsieve 2.6
SQLAlchemy 2.0.36
sqlparse 0.5.1
stack-data 0.6.3
starlette 0.41.2
stdlib-list 0.10.0
streamlit 1.39.0
tabulate 0.9.0
talipp 2.4.0
tblib 3.0.0
tenacity 9.0.0
tensorstore 0.1.67
text-unidecode 1.3
threadpoolctl 3.5.0
tiktoken 0.8.0
time-machine 2.16.0
together 1.3.1
tokenizers 0.20.3
toml 0.10.2
tomlkit 0.13.2
toolz 1.0.0
tornado 6.4.1
tqdm 4.67.0
traitlets 5.14.3
transformers 4.46.2
trio 0.27.0
trio-websocket 0.11.1
typer 0.12.5
types-cffi 1.16.0.20240331
types-pyOpenSSL 24.1.0.20240722
types-python-slugify 8.0.2.20240310
types-pytz 2024.2.0.20241003
types-redis 4.6.0.20241004
types-requests 2.32.0.20241016
types-retry 0.9.9.4
types-setuptools 75.1.0.20240917
typing_extensions 4.12.2
tzdata 2024.2
uritemplate 4.1.1
urllib3 2.2.3
uuid 1.30
uvicorn 0.32.0
virtualenv 20.26.6
watchdog 6.0.0
wcwidth 0.2.13
webdriver-manager 4.0.2
websocket-client 1.8.0
Werkzeug 3.0.4
wheel 0.44.0
wrapt 1.16.0
wsproto 1.2.0
xxhash 3.5.0
yarl 1.17.1
zipp 3.20.2

Additional context

No response

francescov1 commented 2 days ago

Any news on this?