Open Prem-Nitin opened 3 months ago
Hey @Prem-Nitin, great to see you back! Hope you’re doing fantastic. 😄 Let’s dive into this SharePoint puzzle together.
The error "Expecting value: line 1 column 1 (char 0)" typically occurs when trying to decode or load JSON data from an empty string or improperly formatted JSON. In the context of accessing documents in SharePoint using the SharePointReader in LlamaHub, this error can arise from several points where JSON data is expected from the SharePoint API responses.
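To see where the message comes from, here is a minimal sketch using only the standard library. The helper name describe_json_error is hypothetical and exists only for this illustration. It shows that both an empty body and a non-JSON body (for example an HTML error page) produce exactly this message. In recent versions of requests, response.json() raises an exception that wraps the same decoder error, so the text should match.

```python
import json


def describe_json_error(text: str) -> str:
    """Return the decoder's error message for `text`, or "ok" if it parses."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return str(exc)
    return "ok"


# An empty body fails at the very first character.
print(describe_json_error(""))          # Expecting value: line 1 column 1 (char 0)
# An HTML page returned in place of JSON fails at the same position.
print(describe_json_error("<html>"))    # Expecting value: line 1 column 1 (char 0)
# A valid JSON body parses without error.
print(describe_json_error('{"a": 1}'))  # ok
```

So the message alone does not tell you whether the body was empty or simply not JSON; looking at the status code and the raw response text is what distinguishes the two.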
Given the provided code, the error could be happening in any of several places where response.json() is called without first checking whether the response body is empty. This can occur if the SharePoint API returns an empty response or if a network issue results in an empty response body. Here are the potential points in the code where this issue might arise:
- _get_access_token method, when attempting to decode the JSON response from the token endpoint.
- _get_site_id_with_host_name method, when trying to decode the JSON response to find the site ID.
- _get_drive_id method, when decoding the JSON response to get the drive ID.
- _get_sharepoint_folder_id method, when trying to decode the JSON response to find the folder ID.
- _download_files_and_extract_metadata method, when decoding the JSON response to iterate over files and folders.
- _download_file_by_url method. It primarily deals with downloading the file content, but if there is an attempt to parse JSON metadata from the response, it could also fail when the response is empty.

To address this issue, you should ensure that the response body is not empty before attempting to parse it as JSON. This can be done by checking response.content or response.text before calling response.json(). Additionally, it's important to check the status code of the response to ensure that the request was successful before attempting to parse the response body.
Here's an example of how you might modify the _get_access_token method to include such a check:
    def _get_access_token(self) -> str:
        authority = f"https://login.microsoftonline.com/{self.tenant_id}/oauth2/token"
        payload = {
            "grant_type": "client_credentials",
            "client_id": self.client_id,
            "client_secret": self.client_secret,
            "resource": "https://graph.microsoft.com/",
        }
        response = requests.post(
            url=authority,
            data=payload,
        )
        # Only parse the body when the request succeeded and the body is non-empty.
        if response.status_code == 200 and response.content:
            json_response = response.json()
            if "access_token" in json_response:
                return json_response["access_token"]
            else:
                logger.error(json_response.get("error", "Unknown error"))
                raise ValueError(json_response.get("error_description", "No error description provided."))
        else:
            # Report the status code instead of failing inside response.json().
            logger.error("Failed to obtain access token. Status code: %s", response.status_code)
            raise ValueError("Failed to obtain access token.")
This modification checks the status code and response.content before parsing, and provides more detailed error logging and handling. Similar checks should be added wherever response.json() is called.
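Rather than repeating the check in each method, the same guard could be factored into one helper. The following is only a sketch under stated assumptions: parse_json_response and its context parameter are hypothetical names, not part of SharePointReader or LlamaIndex, and the response argument is assumed to behave like a requests.Response (status_code, text, and json()).

```python
from typing import Any


def parse_json_response(response: Any, context: str) -> Any:
    """Decode a JSON body, raising ValueError with a descriptive message on failure.

    Hypothetical helper: `response` is assumed to expose `status_code`, `text`
    and `json()` like a requests.Response; `context` names the calling step.
    """
    status = response.status_code
    body = response.text or ""
    # Reject non-2xx responses and show the start of the body for debugging.
    if not 200 <= status < 300:
        raise ValueError(f"{context}: HTTP {status}; body starts with {body[:200]!r}")
    # A successful status with no body is what triggers "Expecting value ... (char 0)".
    if not body.strip():
        raise ValueError(f"{context}: HTTP {status} with an empty body")
    try:
        return response.json()
    except ValueError as exc:
        # JSON decode errors are ValueError subclasses; add context and re-raise.
        raise ValueError(
            f"{context}: body is not valid JSON ({exc}); starts with {body[:200]!r}"
        ) from exc
```

Each response.json() call in the reader could then become something like parse_json_response(response, "drive ID lookup"), so the resulting error names the failing step and shows what the server actually returned.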
You can find the source code for the SharePointReader class in the LlamaIndex repository here.
Please note that modifying the source code of the LlamaIndex library is not recommended unless you are familiar with its structure and functionality. If you are not comfortable making these changes, I would recommend reaching out to the LlamaIndex maintainers or the community for further assistance.
If you have any further questions or need additional clarification, please let me know.
Question
    loader = SharePointReader(
        client_id=client_id,
        client_secret=client_secret,
        tenant_id=tenant_id,
    )
    documents = loader.load_data(
        sharepoint_site_name=site,
        sharepoint_folder_path=folder_url,
        recursive=True,
    )

I implemented the SharePoint reader as shown on LlamaHub and got the error below:
An error occurred while accessing SharePoint: Expecting value: line 1 column 1 (char 0)