QuivrHQ / quivr

Open-source RAG framework for building GenAI Second Brains 🧠 Build a productivity assistant (RAG) ⚡️🤖 Chat with your docs (PDF, CSV, ...) & apps using Langchain, GPT-3.5 / 4-turbo, Private, Anthropic, VertexAI, Ollama, Groq, and other LLMs that you can share with users! An efficient retrieval-augmented generation framework.
https://quivr.com

[Bug]: #2366

Closed wangshuai-wuhan closed 1 month ago

wangshuai-wuhan commented 5 months ago

What happened?

A bug happened!

Relevant log output

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/celery/app/trace.py", line 477, in trace_task
    R = retval = fun(*args, **kwargs)
                 ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/celery/app/trace.py", line 760, in __protected_call__
    return self.run(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/code/celery_worker.py", line 181, in process_integration_brain_sync_user_brain
    notion_connector.poll()
  File "/code/modules/brain/integrations/Notion/Notion_connector.py", line 357, in poll
    db_res = self._search_notion(query_dict)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/code/modules/brain/integrations/Notion/Notion_connector.py", line 99, in _search_notion
    res.raise_for_status()
  File "/usr/local/lib/python3.11/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://api.notion.com/v1/search
Clock   64
Root    <Task: celery_worker.process_integration_brain_sync(830f71d0-d755-47a7-a1b0-9c65796c7549) SUCCESS clock:60>

Twitter / LinkedIn details

No response

dosubot[bot] commented 5 months ago

Hey @wangshuai-wuhan, nice to see you around here again! Hope you're doing well.

The "401 Client Error: Unauthorized" error you're encountering when the Notion connector in quivr tries to access the Notion API typically indicates that the API request was made with invalid or expired credentials. In the context of the provided Notion_connector.py script, this error is likely due to an issue with the notion_integration_token used in the API request headers.

The _headers method constructs the headers for the API request, including the Authorization header, which uses the notion_integration_token:

def _headers(self) -> dict[str, str]:
    """Get the headers for the Notion API"""
    return {
        "Authorization": f'Bearer {self.credentials["notion_integration_token"]}',
        "Content-Type": "application/json",
        "Notion-Version": "2022-06-28",
    }

If you're receiving a 401 error, it suggests that the notion_integration_token either hasn't been set correctly, has expired, or does not have the necessary permissions to perform the requested operations on the Notion API.
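To rule out the token quickly, you can call the same endpoint the connector hits (https://api.notion.com/v1/search) directly with requests. This is a minimal standalone sketch, not code from the repository; the token value is a placeholder for whatever you entered in the integration settings:

import requests

# Placeholder: paste the integration token from https://www.notion.so/my-integrations
NOTION_TOKEN = "secret_..."

res = requests.post(
    "https://api.notion.com/v1/search",
    headers={
        "Authorization": f"Bearer {NOTION_TOKEN}",
        "Content-Type": "application/json",
        "Notion-Version": "2022-06-28",
    },
    json={"page_size": 1},
)
# 401 here means the token itself is rejected; 200 with an empty "results"
# list usually means no pages/databases have been shared with the integration.
print(res.status_code)
print(res.text[:500])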

To resolve this issue, ensure that:

  1. The notion_integration_token is correctly set in the integration details settings and is being correctly loaded by the _load_credentials method.
  2. The notion_integration_token is valid and has not expired. You may need to generate a new integration token from the Notion Integration settings page.
  3. The Notion integration (associated with the notion_integration_token) has been granted access to the relevant Notion pages or databases that your application is attempting to access.

Based on the Notion_connector.py file, the notion_integration_token is set in the integration details settings in Quivr through the following steps:

  1. Initialization of NotionConnector: When an instance of NotionConnector is created, it calls its __init__ method with brain_id and user_id as parameters. This method, in turn, calls the _load_credentials method.

  2. Loading Credentials: The _load_credentials method is responsible for loading the Notion credentials. It retrieves the integration details for the given brain_id and user_id by calling self.get_integration_brain(self.brain_id, self.user_id). This retrieves an IntegrationEntity object which contains the integration details, including settings.

  3. Setting the Notion Integration Token: Within the _load_credentials method, if the credentials attribute of the NotionConnector instance is None, the method proceeds to set the credentials dictionary. It does this by accessing the settings attribute of the integration_details object and retrieving the value associated with the key notion_integration_token. If the token is not found in the settings, an empty string is used as a default value. This token is then stored in the credentials dictionary under the key notion_integration_token.

  4. Updating Integration Details: After setting the credentials dictionary, the _load_credentials method calls self.update_integration_brain(self.brain_id, self.user_id, self.integration_details) to update the integration details in the database or wherever the integration details are stored. This ensures that the notion_integration_token is saved as part of the integration settings.

  5. Using the Token: The notion_integration_token is then used in the _headers method to create the authorization header required for making API requests to Notion. The header is constructed as Authorization: Bearer {notion_integration_token}, where {notion_integration_token} is replaced with the actual token value.

In summary, the notion_integration_token is set in the integration details settings by loading the integration details for a specific brain_id and user_id, retrieving or setting the token in the settings, and then updating the integration details to save the changes. This token is then used to authenticate API requests to Notion.
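To see that flow end to end, here is a small self-contained toy mirroring steps 1-4. IntegrationEntity and the two repository functions below are stand-ins written for this example only; quivr's actual implementations live in its IntegrationBrain/repository classes and may differ:

from dataclasses import dataclass, field

@dataclass
class IntegrationEntity:
    settings: dict = field(default_factory=dict)

def get_integration_brain(brain_id: str, user_id: str) -> IntegrationEntity:
    # Stand-in for the repository lookup of stored integration details
    return IntegrationEntity(settings={"notion_integration_token": "secret_..."})

def update_integration_brain(brain_id: str, user_id: str, details: IntegrationEntity) -> None:
    # Stand-in for persisting the (possibly updated) integration details
    pass

def load_credentials(brain_id: str, user_id: str, credentials: dict | None = None) -> dict:
    # Mirrors the steps above: fetch the details, fall back to "" when the token
    # was never set (which is what later produces the 401), then save the details.
    details = get_integration_brain(brain_id, user_id)
    if credentials is None:
        credentials = {
            "notion_integration_token": details.settings.get("notion_integration_token", "")
        }
    update_integration_brain(brain_id, user_id, details)
    return credentials

print(load_credentials("my-brain-id", "my-user-id"))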

Unfortunately, I wasn't able to find the exact process of generating a new notion_integration_token from the Notion Integration settings page within the repository. You may need to refer to Notion's official documentation or support for this.

As for the get_integration_brain and update_integration_brain methods, they are not defined in Notion_connector.py itself; they are most likely inherited from the IntegrationBrain or Integration classes that NotionConnector extends. get_integration_brain presumably fetches the stored integration details, while update_integration_brain updates the integration_details object with new credentials and persists them to the database or other storage. Without the implementation of those parent classes, a more detailed explanation isn't possible.
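For orientation only, their expected shape might look roughly like this; this is an assumption sketched from the description above, not the repository's definitions:

from typing import Optional, Protocol

class IntegrationBrainRepository(Protocol):
    # Hypothetical interface; the actual methods live on quivr's IntegrationBrain
    # classes and their real signatures and return types may differ.
    def get_integration_brain(self, brain_id: str, user_id: str) -> Optional["IntegrationEntity"]:
        """Fetch the stored integration details (settings, token, sync state) for a brain/user pair."""
        ...

    def update_integration_brain(self, brain_id: str, user_id: str, integration_details: "IntegrationEntity") -> None:
        """Persist updated integration details back to storage."""
        ...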

You can find more details in the Notion_connector.py file in the repository.

About Dosu

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant. If you want to continue the conversation, start your reply with @dosu-bot.
JonWhiteRA commented 5 months ago

Related to this?

worker | [2024-03-27 20:45:00,043: ERROR/ForkPoolWorker-2] Task celery_worker.process_integration_brain_sync[7998e013-f15b-4a8b-9b0c-77e724659f88] raised unexpected: ConnectError('[Errno -2] Name or service not known')
worker | Traceback (most recent call last):
worker |   File "/usr/local/lib/python3.11/site-packages/httpx/_transports/default.py", line 66, in map_httpcore_exceptions
worker |     yield
worker |   File "/usr/local/lib/python3.11/site-packages/httpx/_transports/default.py", line 228, in handle_request
worker |     resp = self._pool.handle_request(req)
worker |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker |   File "/usr/local/lib/python3.11/site-packages/httpcore/_sync/connection_pool.py", line 216, in handle_request
worker |     raise exc from None
worker |   File "/usr/local/lib/python3.11/site-packages/httpcore/_sync/connection_pool.py", line 196, in handle_request
worker |     response = connection.handle_request(
worker |                ^^^^^^^^^^^^^^^^^^^^^^^^^^
worker |   File "/usr/local/lib/python3.11/site-packages/httpcore/_sync/connection.py", line 99, in handle_request
worker |     raise exc
worker |   File "/usr/local/lib/python3.11/site-packages/httpcore/_sync/connection.py", line 76, in handle_request
worker |     stream = self._connect(request)
worker |              ^^^^^^^^^^^^^^^^^^^^^^
worker |   File "/usr/local/lib/python3.11/site-packages/httpcore/_sync/connection.py", line 122, in _connect
worker |     stream = self._network_backend.connect_tcp(**kwargs)
worker |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker |   File "/usr/local/lib/python3.11/site-packages/httpcore/_backends/sync.py", line 205, in connect_tcp
worker |     with map_exceptions(exc_map):
worker |   File "/usr/local/lib/python3.11/contextlib.py", line 155, in __exit__
worker |     self.gen.throw(typ, value, traceback)
worker |   File "/usr/local/lib/python3.11/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
worker |     raise to_exc(exc) from exc
worker | httpcore.ConnectError: [Errno -2] Name or service not known
worker |
worker | The above exception was the direct cause of the following exception:
worker |
worker | Traceback (most recent call last):
worker |   File "/usr/local/lib/python3.11/site-packages/celery/app/trace.py", line 477, in trace_task
worker |     R = retval = fun(*args, **kwargs)
worker |                  ^^^^^^^^^^^^^^^^^^^^
worker |   File "/usr/local/lib/python3.11/site-packages/celery/app/trace.py", line 760, in __protected_call__
worker |     return self.run(*args, **kwargs)
worker |            ^^^^^^^^^^^^^^^^^^^^^^^^^
worker |   File "/code/celery_worker.py", line 187, in process_integration_brain_sync
worker |     integrations = integration.get_integration_brain_by_type_integration("notion")
worker |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker |   File "/code/modules/brain/repository/integration_brains.py", line 110, in get_integration_brain_by_type_integration
worker |     .execute()
worker |     ^^^^^^^^^
worker |   File "/usr/local/lib/python3.11/site-packages/postgrest/_sync/request_builder.py", line 58, in execute
worker |     r = self.session.request(
worker |         ^^^^^^^^^^^^^^^^^^^^^
worker |   File "/usr/local/lib/python3.11/site-packages/httpx/_client.py", line 814, in request
worker |     return self.send(request, auth=auth, follow_redirects=follow_redirects)
worker |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker |   File "/usr/local/lib/python3.11/site-packages/httpx/_client.py", line 901, in send
worker |     response = self._send_handling_auth(
worker |                ^^^^^^^^^^^^^^^^^^^^^^^^^
worker |   File "/usr/local/lib/python3.11/site-packages/httpx/_client.py", line 929, in _send_handling_auth
worker |     response = self._send_handling_redirects(
worker |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker |   File "/usr/local/lib/python3.11/site-packages/httpx/_client.py", line 966, in _send_handling_redirects
worker |     response = self._send_single_request(request)
worker |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker |   File "/usr/local/lib/python3.11/site-packages/httpx/_client.py", line 1002, in _send_single_request
worker |     response = transport.handle_request(request)
worker |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker |   File "/usr/local/lib/python3.11/site-packages/httpx/_transports/default.py", line 227, in handle_request
worker |     with map_httpcore_exceptions():
worker |   File "/usr/local/lib/python3.11/contextlib.py", line 155, in __exit__
worker |     self.gen.throw(typ, value, traceback)
worker |   File "/usr/local/lib/python3.11/site-packages/httpx/_transports/default.py", line 83, in map_httpcore_exceptions
worker |     raise mapped_exc(message) from exc
worker | httpx.ConnectError: [Errno -2] Name or service not known

github-actions[bot] commented 2 months ago

Thanks for your contributions, we'll be closing this issue as it has gone stale. Feel free to reopen if you'd like to continue the discussion.