QuivrHQ / quivr

Open-source RAG Framework for building GenAI Second Brains 🧠 Build productivity assistant (RAG) ⚡️🤖 Chat with your docs (PDF, CSV, ...) & apps using Langchain, GPT 3.5 / 4 turbo, Private, Anthropic, VertexAI, Ollama, LLMs, Groq that you can share with users ! Efficient retrieval augmented generation framework
https://quivr.com

[Bug]: Connect Google Drive doesn't work (self-hosted Quivr) #3304

Open hex opened 1 week ago

hex commented 1 week ago

What happened?

Trying to connect Google Drive on a self-hosted Quivr instance returns an Internal Server Error (HTTP 500) on the OAuth2 callback.

Relevant log output

backend-api  | INFO:     172.19.0.1:51808 - "GET /healthz HTTP/1.1" 200 OK
backend-api  | INFO:     172.19.0.1:51818 - "OPTIONS /sync/google/authorize?name=2kgyd6ez9ykme68vej7m3 HTTP/1.1" 200 OK
backend-api  | [DEBUG] quivr_api.modules.sync.controller.google_sync_routes [google_sync_routes.py:82]: Authorizing Google Drive sync for user: 06bc76d6-e496-488d-83cf-490628cd02ab, name : 2kgyd6ez9ykme68vej7m3
backend-api  | [INFO] quivr_api.modules.sync.controller.google_sync_routes [google_sync_routes.py:98]: Generated authorization URL: https://accounts.google.com/o/oauth2/auth?response_type=code&client_id=622670270568-l8erggglra0ngg77mkflitchih1m0eo1.apps.googleusercontent.com&redirect_uri=https%3A%2F%2Fquivr-api.erepubliklabs.com%2Fsync%2Fgoogle%2Foauth2callback&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive.metadata.readonly+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive.readonly+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fuserinfo.email+openid&state=user_id%3D06bc76d6-e496-488d-83cf-490628cd02ab%2C+name%3D2kgyd6ez9ykme68vej7m3&access_type=offline&include_granted_scopes=true&prompt=consent for user: 06bc76d6-e496-488d-83cf-490628cd02ab
backend-api  | [INFO] quivr_api.modules.sync.repository.sync_user [sync_user.py:47]: Creating sync user with input: user_id='06bc76d6-e496-488d-83cf-490628cd02ab' name='2kgyd6ez9ykme68vej7m3' email=None provider='Google' credentials={} state={'state': 'user_id=06bc76d6-e496-488d-83cf-490628cd02ab, name=2kgyd6ez9ykme68vej7m3'} additional_data={} status='SYNCED'
backend-api  | [INFO] quivr_api.modules.sync.repository.sync_user [sync_user.py:55]: Sync user created successfully: {'id': 34, 'name': '2kgyd6ez9ykme68vej7m3', 'provider': 'Google', 'state': {'state': 'user_id=06bc76d6-e496-488d-83cf-490628cd02ab, name=2kgyd6ez9ykme68vej7m3'}, 'credentials': {}, 'user_id': '06bc76d6-e496-488d-83cf-490628cd02ab', 'email': None, 'additional_data': {}, 'status': 'SYNCED'}
backend-api  | INFO:     172.19.0.1:51818 - "POST /sync/google/authorize?name=2kgyd6ez9ykme68vej7m3 HTTP/1.1" 200 OK
backend-api  | [INFO] quivr_api.modules.sync.controller.google_sync_routes [google_sync_routes.py:127]: State: user_id=06bc76d6-e496-488d-83cf-490628cd02ab, name=2kgyd6ez9ykme68vej7m3
backend-api  | [DEBUG] quivr_api.modules.sync.controller.google_sync_routes [google_sync_routes.py:131]: Handling OAuth2 callback for user: 06bc76d6-e496-488d-83cf-490628cd02ab with state: user_id=06bc76d6-e496-488d-83cf-490628cd02ab, name=2kgyd6ez9ykme68vej7m3
backend-api  | [INFO] quivr_api.modules.sync.repository.sync_user [sync_user.py:124]: Getting sync user by state: {'state': 'user_id=06bc76d6-e496-488d-83cf-490628cd02ab, name=2kgyd6ez9ykme68vej7m3'}
backend-api  | [INFO] quivr_api.modules.sync.repository.sync_user [sync_user.py:131]: Sync user found by state: {'id': 34, 'name': '2kgyd6ez9ykme68vej7m3', 'provider': 'Google', 'state': {'state': 'user_id=06bc76d6-e496-488d-83cf-490628cd02ab, name=2kgyd6ez9ykme68vej7m3'}, 'credentials': {}, 'user_id': '06bc76d6-e496-488d-83cf-490628cd02ab', 'email': None, 'additional_data': {}, 'status': 'SYNCED'}
backend-api  | [INFO] quivr_api.modules.sync.controller.google_sync_routes [google_sync_routes.py:135]: Retrieved sync user state: id=34 user_id=UUID('06bc76d6-e496-488d-83cf-490628cd02ab') name='2kgyd6ez9ykme68vej7m3' email=None provider='Google' credentials={} state={'state': 'user_id=06bc76d6-e496-488d-83cf-490628cd02ab, name=2kgyd6ez9ykme68vej7m3'} additional_data={} status='SYNCED'
backend-api  | INFO:     172.19.0.1:51820 - "GET /sync/google/oauth2callback?state=user_id%3D06bc76d6-e496-488d-83cf-490628cd02ab,%20name%3D2kgyd6ez9ykme68vej7m3&code=4/0AVG7fiTUpXwXEQBFNXv1D_tP9Y8bO_w2Fta9CWk-kNhR2xgjC-iL8NSGBSqu75YOv9r41Q&scope=email%20https://www.googleapis.com/auth/drive.metadata.readonly%20https://www.googleapis.com/auth/drive.readonly%20https://www.googleapis.com/auth/userinfo.email%20openid%20https://www.googleapis.com/auth/drive%20https://www.googleapis.com/auth/drive.metadata%20https://www.googleapis.com/auth/drive.photos.readonly%20https://www.googleapis.com/auth/drive.apps.readonly%20https://www.googleapis.com/auth/drive.appdata%20https://www.googleapis.com/auth/drive.scripts%20https://www.googleapis.com/auth/drive.file&authuser=1&hd=erepubliklabs.com&prompt=consent HTTP/1.1" 500 Internal Server Error
backend-api  | ERROR:    Exception in ASGI application
backend-api  | Traceback (most recent call last):
backend-api  |   File "/usr/local/lib/python3.11/site-packages/uvicorn/protocols/http/h11_impl.py", line 406, in run_asgi
backend-api  |     result = await app(  # type: ignore[func-returns-value]
backend-api  |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
backend-api  |   File "/usr/local/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 70, in __call__
backend-api  |     return await self.app(scope, receive, send)
backend-api  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
backend-api  |   File "/usr/local/lib/python3.11/site-packages/fastapi/applications.py", line 1054, in __call__
backend-api  |     await super().__call__(scope, receive, send)
backend-api  |   File "/usr/local/lib/python3.11/site-packages/starlette/applications.py", line 123, in __call__
backend-api  |     await self.middleware_stack(scope, receive, send)
backend-api  |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 186, in __call__
backend-api  |     raise exc
backend-api  |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 164, in __call__
backend-api  |     await self.app(scope, receive, _send)
backend-api  |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/cors.py", line 85, in __call__
backend-api  |     await self.app(scope, receive, send)
backend-api  |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 65, in __call__
backend-api  |     await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
backend-api  |   File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
backend-api  |     raise exc
backend-api  |   File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
backend-api  |     await app(scope, receive, sender)
backend-api  |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 754, in __call__
backend-api  |     await self.middleware_stack(scope, receive, send)
backend-api  |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 774, in app
backend-api  |     await route.handle(scope, receive, send)
backend-api  |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 295, in handle
backend-api  |     await self.app(scope, receive, send)
backend-api  |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 77, in app
backend-api  |     await wrap_app_handling_exceptions(app, request)(scope, receive, send)
backend-api  |   File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
backend-api  |     raise exc
backend-api  |   File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
backend-api  |     await app(scope, receive, sender)
backend-api  |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 74, in app
backend-api  |     response = await f(request)
backend-api  |                ^^^^^^^^^^^^^^^^
backend-api  |   File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 278, in app
backend-api  |     raw_response = await run_endpoint_function(
backend-api  |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
backend-api  |   File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 193, in run_endpoint_function
backend-api  |     return await run_in_threadpool(dependant.call, **values)
backend-api  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
backend-api  |   File "/usr/local/lib/python3.11/site-packages/starlette/concurrency.py", line 42, in run_in_threadpool
backend-api  |     return await anyio.to_thread.run_sync(func, *args)
backend-api  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
backend-api  |   File "/usr/local/lib/python3.11/site-packages/anyio/to_thread.py", line 56, in run_sync
backend-api  |     return await get_async_backend().run_sync_in_worker_thread(
backend-api  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
backend-api  |   File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 2177, in run_sync_in_worker_thread
backend-api  |     return await future
backend-api  |            ^^^^^^^^^^^^
backend-api  |   File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 859, in run
backend-api  |     result = context.run(func, *args)
backend-api  |              ^^^^^^^^^^^^^^^^^^^^^^^^
backend-api  |   File "/app/api/quivr_api/modules/sync/controller/google_sync_routes.py", line 152, in oauth2callback_google
backend-api  |     flow.fetch_token(authorization_response=str(request.url))
backend-api  |   File "/usr/local/lib/python3.11/site-packages/google_auth_oauthlib/flow.py", line 285, in fetch_token
backend-api  |     return self.oauth2session.fetch_token(self.client_config["token_uri"], **kwargs)
backend-api  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
backend-api  |   File "/usr/local/lib/python3.11/site-packages/requests_oauthlib/oauth2_session.py", line 406, in fetch_token
backend-api  |     self._client.parse_request_body_response(r.text, scope=self.scope)
backend-api  |   File "/usr/local/lib/python3.11/site-packages/oauthlib/oauth2/rfc6749/clients/base.py", line 427, in parse_request_body_response
backend-api  |     self.token = parse_token_response(body, scope=scope)
backend-api  |                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
backend-api  |   File "/usr/local/lib/python3.11/site-packages/oauthlib/oauth2/rfc6749/parameters.py", line 441, in parse_token_response
backend-api  |     validate_token_parameters(params)
backend-api  |   File "/usr/local/lib/python3.11/site-packages/oauthlib/oauth2/rfc6749/parameters.py", line 471, in validate_token_parameters
backend-api  |     raise w
backend-api  | Warning: Scope has changed from "https://www.googleapis.com/auth/drive.metadata.readonly https://www.googleapis.com/auth/userinfo.email openid https://www.googleapis.com/auth/drive.readonly" to "https://www.googleapis.com/auth/drive.metadata.readonly https://www.googleapis.com/auth/userinfo.email openid https://www.googleapis.com/auth/drive https://www.googleapis.com/auth/drive.photos.readonly https://www.googleapis.com/auth/drive.scripts https://www.googleapis.com/auth/drive.file https://www.googleapis.com/auth/drive.readonly https://www.googleapis.com/auth/drive.apps.readonly https://www.googleapis.com/auth/drive.metadata https://www.googleapis.com/auth/drive.appdata".

linear[bot] commented 1 week ago

CORE-222 [Bug]: Connect Google Drive doesn't work (self-hosted Quivr)

dosubot[bot] commented 1 week ago

Hey @hex! I'm here to help you with any bugs, questions, or becoming a contributor. Let's squash those bugs together! 🤖

The "Internal Server Error" when connecting Google Drive on a self-hosted Quivr could be due to several reasons:

  1. Environment Variables: Ensure that all necessary environment variables for Google OAuth, such as GOOGLE_CLIENT_ID, GOOGLE_CLIENT_SECRET, and others, are correctly set.

  2. State Parameter: The state parameter must be correctly managed and passed between requests. Ensure that it is valid and matches the expected state.

  3. User Validation: Check that the user_id from the state matches the current_user. Any mismatch will cause an error.

  4. Token Fetching: The flow.fetch_token method requires a valid authorization response. Ensure that the authorization response is correct and that the token fetching process is successful.

  5. Service Initialization: Verify that the credentials are valid and that the service can be initialized without issues.
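
Points 2 and 3 can be sketched as follows. This is only an illustration that assumes the state string keeps the `user_id=..., name=...` form visible in the log; `parse_state` and `state_matches_user` are hypothetical helpers, not functions from the Quivr codebase.

```python
from urllib.parse import unquote


def parse_state(state: str) -> dict[str, str]:
    """Split a state string like 'user_id=<uuid>, name=<name>' into a dict.

    Hypothetical helper: the format is inferred from the log, not from Quivr's code.
    """
    result: dict[str, str] = {}
    # The callback URL carries the state percent-encoded, so decode it first.
    for item in unquote(state).split(","):
        key, sep, value = item.strip().partition("=")
        if sep:  # skip fragments that are not key=value pairs
            result[key] = value
    return result


def state_matches_user(state: str, current_user_id: str) -> bool:
    """Return True when the user_id embedded in the state is the logged-in user."""
    return parse_state(state).get("user_id") == current_user_id


# State value as it appears in the callback request in the log.
raw_state = "user_id%3D06bc76d6-e496-488d-83cf-490628cd02ab,%20name%3D2kgyd6ez9ykme68vej7m3"
parsed = parse_state(raw_state)
```

In the posted log the sync user is found by state and the traceback only starts at `flow.fetch_token`, so the state handling itself appears to succeed here.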

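Additionally, the traceback shows that the 500 originates in flow.fetch_token: oauthlib's validate_token_parameters raises a Warning ("Scope has changed") because the scopes Google granted differ from the scopes the application requested. The granted set adds drive, drive.file, drive.metadata, drive.appdata, and others. The authorization URL is generated with include_granted_scopes=true, so scopes previously granted to this client may be returned along with the requested ones. Ensure that the requested scopes are correctly configured in the Google Cloud Console and match the scopes used in your application [1].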
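
To see exactly which scopes differ, the two scope strings from the warning can be compared directly. The sketch below is an illustration, not Quivr code: `extra_scopes` and `relax_scope_check` are hypothetical helper names, and the scope lists are copied from the log above. It also shows one possible workaround: oauthlib does not raise the "Scope has changed" warning when the `OAUTHLIB_RELAX_TOKEN_SCOPE` environment variable is set before the token is fetched. Whether relaxing the check is acceptable for your deployment is a judgment call.

```python
import os

BASE = "https://www.googleapis.com/auth/"

# Scopes the application requested (first string in the warning).
REQUESTED = " ".join([
    BASE + "drive.metadata.readonly",
    BASE + "userinfo.email",
    "openid",
    BASE + "drive.readonly",
])

# Scopes Google actually granted (second string in the warning).
GRANTED = " ".join([
    BASE + "drive.metadata.readonly",
    BASE + "userinfo.email",
    "openid",
    BASE + "drive",
    BASE + "drive.photos.readonly",
    BASE + "drive.scripts",
    BASE + "drive.file",
    BASE + "drive.readonly",
    BASE + "drive.apps.readonly",
    BASE + "drive.metadata",
    BASE + "drive.appdata",
])


def extra_scopes(requested: str, granted: str) -> set[str]:
    """Scopes present in the token response but absent from the request."""
    return set(granted.split()) - set(requested.split())


def relax_scope_check() -> None:
    """Ask oauthlib not to raise when granted scopes differ from requested ones.

    Must run before flow.fetch_token(...) is called; setting the variable in
    the backend container's environment would have the same effect.
    """
    os.environ["OAUTHLIB_RELAX_TOKEN_SCOPE"] = "1"


extras = extra_scopes(REQUESTED, GRANTED)
missing = extra_scopes(GRANTED, REQUESTED)  # requested but not granted
relax_scope_check()
```

Here every requested scope was granted and only additional Drive scopes were returned, which is consistent with the same Google account having granted broader Drive access to this client ID earlier.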
