Closed Ashish5869 closed 4 months ago
You can check this PR https://github.com/jerryjliu/llama_index/pull/6721
Add a connection with your own token on your Notion page.
After adding the connection, it still doesn't work. I'm getting the same error.
KeyError                                  Traceback (most recent call last)
Cell In[5], line 3
      1 integration_token = os.getenv("NOTION_INTEGRATION_TOKEN")
      2 page_ids = ["All-In-Capital-512240051c104b048b3f768c71a2709b"]
----> 3 documents = NotionPageReader(integration_token=integration_token).load_data(
      4     page_ids=page_ids
      5 )

File /opt/homebrew/lib/python3.9/site-packages/llama_index/readers/notion.py:161, in NotionPageReader.load_data(self, page_ids, database_id)
    159 else:
    160     for page_id in page_ids:
--> 161         page_text = self.read_page(page_id)
    162         docs.append(Document(text=page_text, metadata={"page_id": page_id}))
    164 return docs

File /opt/homebrew/lib/python3.9/site-packages/llama_index/readers/notion.py:95, in NotionPageReader.read_page(self, page_id)
     93 def read_page(self, page_id: str) -> str:
     94     """Read a page."""
---> 95     return self._read_block(page_id)

File /opt/homebrew/lib/python3.9/site-packages/llama_index/readers/notion.py:60, in NotionPageReader._read_block(self, block_id, num_tabs)
     55 res = requests.request(
     56     "GET", block_url, headers=self.headers, json=query_dict
     57 )
...
---> 60 for result in data["results"]:
     61     result_type = result["type"]
     62     result_obj = result[result_type]

KeyError: 'results'
@livelikeabel Notion is working now, but Google Drive is giving a token refresh error.
Google docs is working
from llama_index import GoogleDocsReader
document_ids = ["google_doc_id"]
documents = GoogleDocsReader().load_data(document_ids=document_ids)
Google drive
from llama_index import download_loader
GoogleDriveReader = download_loader('GoogleDriveReader')
loader = GoogleDriveReader()
documents = loader.load_data(folder_id=None, file_ids=['google_doc_id'], mime_types=['application/vnd.google-apps.document'])
OUTPUT ERROR:
. . .
~/anaconda3/lib/python3.10/site-packages/oauth2client/client.py in _do_refresh_request(self, http)
820 pass
--> 821 raise HttpAccessTokenRefreshError(error_msg, status=resp.status)
822
HttpAccessTokenRefreshError: invalid_grant
During handling of the above exception, another exception occurred:
. . .
~/anaconda3/lib/python3.10/site-packages/pydrive/auth.py in Refresh(self)
475 self.credentials.refresh(self.http)
476 except AccessTokenRefreshError as error:
--> 477 raise RefreshError('Access token refresh failed: %s' % error)
478
479 def GetAuthUrl(self):
RefreshError: Access token refresh failed: invalid_grant
I am using the same credentials for google docs and google drive
Adding the connection worked for me
Hi, @Ashish5869! I'm Dosu, and I'm helping the LlamaIndex team manage their backlog. I wanted to let you know that we are marking this issue as stale.
Based on my understanding, the issue is related to the Google Drive Reader and Notion Reader in the llama_index library not working properly. The Google Drive Reader is giving a TypeError and the Notion Reader is giving a KeyError. You mentioned that you received a suggestion to check a pull request for help, which indicates that there may be a solution available. Another user also suggested adding a connection with a token for the Notion Reader, which could potentially resolve the KeyError issue. However, it seems that the error still persists for the Google Drive Reader, specifically with a token refresh error.
If this issue is still relevant to the latest version of the LlamaIndex repository, please let the LlamaIndex team know by commenting on this issue. Otherwise, feel free to close the issue yourself or it will be automatically closed in 7 days. Thank you for your understanding and contribution to the LlamaIndex project!
I have the same problem, but only with the NotionPageReader. I followed the instructions by "livelikeabel" but am still getting the KeyError: 'results'. I changed the capabilities of the integration in the Notion settings, but it is still not working. @sidhartha-roy Could you specify which capabilities your integration had when it worked for you?
@Disiok Could you please help @amindadgar with the issue related to the NotionPageReader in the LlamaIndex library? They are experiencing a KeyError: 'results' even after following the instructions provided by another user. It seems that changing the capabilities of the integration in Notion settings did not resolve the issue. Thank you for your assistance!
@livelikeabel could you please explain how you solved the Notion issue?
Hi, for me it worked after the step above: adding the connection. But one important hint: use only the "hash" as the page-id. In my example the page URL is exported as follows:
https://www.notion.so/SuccessFactors-To-P-I-Loga-Replication-74c9b1046e614e78870b23f8f6b4df6d?pvs=4
So the page-id to set for the notion_loader is only "74c9b1046e614e78870b23f8f6b4df6d". I got the same error as you did when I used it with the page title included.
regards
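The hint above can be sketched as a small helper that strips the title and query string from a Notion URL or slug and keeps only the trailing 32-character ID. The function name `extract_page_id` is hypothetical and is not part of llama_index; the sketch assumes the ID always appears at the end of the URL path, as it does in the two examples in this thread.

```python
import re

# A Notion page ID is 32 hexadecimal characters, sometimes written
# with dashes in the 8-4-4-4-12 UUID layout. Anchored at end of path.
_PAGE_ID_RE = re.compile(
    r"([0-9a-f]{32}|[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})$",
    re.IGNORECASE,
)


def extract_page_id(url_or_slug: str) -> str:
    """Return the bare page ID from a Notion URL or 'Title-<id>' slug.

    Hypothetical helper for illustration; not part of llama_index.
    """
    # Drop the query string or fragment (e.g. "?pvs=4") and any trailing slash.
    path = re.split(r"[?#]", url_or_slug, maxsplit=1)[0].rstrip("/")
    match = _PAGE_ID_RE.search(path)
    if match is None:
        raise ValueError(f"No Notion page ID found in {url_or_slug!r}")
    # Normalize to the undashed, lowercase form used in this thread.
    return match.group(1).replace("-", "").lower()
```

With this, `page_ids = [extract_page_id(url)]` could be passed to `NotionPageReader.load_data` instead of the full slug.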
Thank you @sebi5000. This got me functional.
The KeyError: 'results' you're encountering in the NotionPageReader class from the LlamaIndex library suggests that the response from the Notion API does not contain a 'results' key. This error typically occurs when the API response structure is different from what is expected or when the request fails for some reason (e.g., due to incorrect permissions, invalid request parameters, or an outage on the Notion API side).
Given the context and the provided code, here are a few steps to resolve the issue:
Check API Response Status Code: Before accessing data['results'], ensure that the request was successful by checking the response's status code. A successful request to the Notion API should return a status code of 200. If the status code indicates an error (e.g., 400, 401, 403, 404, etc.), it can help identify the issue.
Error Handling: Implement error handling to catch cases where the 'results' key might be missing in the response. This can be done by using the .get() method, which returns None if the specified key does not exist, instead of directly accessing the key with data['results'].
Logging: Add logging to print out the API response. This can help you understand the structure of the response and why the 'results' key might be missing. It's possible that the API is returning an error message or a different structure than expected.
Here's an example of how you might modify the _read_block method to include these suggestions:
def _read_block(self, block_id: str, num_tabs: int = 0) -> str:
    """Read a block."""
    done = False
    result_lines_arr = []
    cur_block_id = block_id
    while not done:
        block_url = BLOCK_CHILD_URL_TMPL.format(block_id=cur_block_id)
        query_dict: Dict[str, Any] = {}
        res = requests.request(
            "GET", block_url, headers=self.headers, json=query_dict
        )
        if res.status_code != 200:
            print(f"Failed to read block with status code: {res.status_code}")
            print("Response:", res.text)
            break  # or handle the error as appropriate
        data = res.json()
        # Use .get() to avoid KeyError if 'results' key is missing
        for result in data.get("results", []):
            # Existing processing logic here...
            ...
        if data.get("next_cursor") is None:
            done = True
        else:
            cur_block_id = data["next_cursor"]
    return "\n".join(result_lines_arr)
By implementing these changes, you can avoid the KeyError and gain insight into why the error might be occurring, which can help in further troubleshooting and resolving the issue.
For more information, you can refer to the NotionPageReader class in the LlamaIndex library.
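As an alternative to printing and breaking out of the loop, the check can raise a descriptive error, so a failed request is not silently turned into an empty document. The sketch below is a standalone helper, not llama_index code: `extract_results` and `NotionResponseError` are hypothetical names, and it assumes that a failed Notion request returns a JSON body with `code` and `message` fields.

```python
from typing import Any, Dict, List


class NotionResponseError(RuntimeError):
    """Hypothetical exception raised when a Notion response has no usable results."""


def extract_results(status_code: int, data: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return data['results'], or raise an error that explains why it is missing.

    Hypothetical helper; assumes error bodies carry 'code' and 'message'.
    """
    if status_code != 200 or "results" not in data:
        # Fall back to placeholders if the body has no 'code' or 'message'.
        code = data.get("code", "unknown")
        message = data.get("message", "no message in response")
        raise NotionResponseError(
            f"Notion request failed (HTTP {status_code}, code={code}): {message}"
        )
    return data["results"]
```

Inside `_read_block` this could be called as `extract_results(res.status_code, res.json())`. A wrong page ID or a missing connection would then surface as a readable message rather than as KeyError: 'results'.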
Google docs is working but google drive reader is not working
Notion reader is also giving an error.
How to resolve this error?