run-llama / llama-hub

A library of data loaders for LLMs made by the community -- to be used with LlamaIndex and/or LangChain
https://llamahub.ai/
MIT License
3.44k stars, 729 forks

Microsoft SharePoint Data Loader #745

Closed arun-soliton closed 9 months ago

arun-soliton commented 9 months ago

Description

Microsoft SharePoint Data Loader for llama-hub

Fixes https://github.com/run-llama/llama-hub/issues/674


anoopshrma commented 9 months ago

Awesome! Thanks for the contribution. Could you resolve the merge conflict and look into the checks that failed?

Thanks!

arun-soliton commented 9 months ago

@anoopshrma Updated. Kindly check!

arun-soliton commented 9 months ago

@anoopshrma @EmanuelCampos Thank you for accepting the PR and approving it :slightly_smiling_face:!

lilz12 commented 7 months ago

I am trying to use the llama_index SharePointLoader and am getting an error saying that my site does not exist, but the link printed in the error message works and brings me to my site. Any ideas?

arun04cbe commented 7 months ago

@lilz12 https://github.com/run-llama/llama-hub/issues/901

lilz12 commented 7 months ago

@arun04cbe Thank you! I wasn't sure if that was the same issue. Do you have an example of the string to pass for the SharePoint site? This is what I am trying:

loader = SharePointLoader(
    client_id="app_client_id",
    client_secret="app_cert_secrete_value",
    tenant_id="azure_tenant_id",
)

documents = loader.load_data(
    sharepoint_site_name="https://wmsafety.sharepoint.com/sites/WMSC",
    sharepoint_folder_path="Reference%20Codes",
    recursive=True,
)

ERROR:custom_module:An error occurred while accessing SharePoint: The specified sharepoint site https://wmsafety.sharepoint.com/sites/WMSC is not found.

arun04cbe commented 7 months ago

@lilz12 Just pass the SharePoint site name, not the full URL. In your case, pass WMSC. And if the folder in the documents library is Data/Folder, pass Data/Folder as the folder path.
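The advice above can be sketched as a small input-normalization step. This is only an illustration: `site_name_from_url` and `normalize_folder_path` are hypothetical helpers, not part of the loader, and whether the loader expects percent-escapes such as `%20` to be decoded is an assumption that this thread does not confirm.

```python
from urllib.parse import unquote, urlparse


def site_name_from_url(site_url: str) -> str:
    """Return the last path segment of a SharePoint site URL (hypothetical helper)."""
    path = urlparse(site_url).path
    segments = [s for s in path.split("/") if s]
    if not segments:
        raise ValueError(f"no site name found in {site_url!r}")
    return unquote(segments[-1])


def normalize_folder_path(folder_path: str) -> str:
    """Decode percent-escapes and trim surrounding slashes (assumed to be what the loader wants)."""
    return unquote(folder_path).strip("/")


site = site_name_from_url("https://wmsafety.sharepoint.com/sites/WMSC")
folder = normalize_folder_path("Reference%20Codes")
print(site, "|", folder)
```

The resulting values could then be passed as `sharepoint_site_name` and `sharepoint_folder_path` in the `load_data` call shown earlier in the thread.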

lilz12 commented 7 months ago

@arun04cbe Within the SharePoint site, it looks like you can have additional sites. So once I'm at my SharePoint site (WMSC), I navigate to the site "Technical Writers", and then the folder is "Reference Codes". Could the problem be that there are multiple sites under WMSC?

arun04cbe commented 7 months ago

I am not aware of additional sites within a SharePoint site. The current implementation works for the parent site. There is also an issue reported at #901. Kindly take a look at it as well.
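For readers debugging the sub-site case: the `itemNotFound` errors quoted in this thread resemble Microsoft Graph error responses. Assuming Graph is the underlying API (the thread does not confirm how the loader resolves sites), Graph can address a site by hostname and server-relative path. The sketch below only builds such a URL; `graph_site_url` is a hypothetical helper, and the sub-site segment `TechnicalWriters` is a made-up example, since the real URL segment is not given here.

```python
from urllib.parse import quote

GRAPH_ROOT = "https://graph.microsoft.com/v1.0"


def graph_site_url(hostname: str, server_relative_path: str) -> str:
    """Build a Graph 'site by path' URL: /sites/{hostname}:/{relative-path}."""
    path = "/" + server_relative_path.strip("/")
    # quote() leaves "/" intact and escapes spaces as %20
    return f"{GRAPH_ROOT}/sites/{hostname}:{quote(path)}"


# Parent site from the thread
print(graph_site_url("wmsafety.sharepoint.com", "/sites/WMSC"))
# Hypothetical sub-site path, for illustration only
print(graph_site_url("wmsafety.sharepoint.com", "/sites/WMSC/TechnicalWriters"))
```

Requesting such a URL with a valid access token would show whether the sub-site resolves independently of the loader.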

lilz12 commented 7 months ago

Hi! Are you saying that what I'm experiencing is already a known bug?

documents = loader.load_data(
    sharepoint_site_name="WMSC",
    sharepoint_folder_path="Reference%20Codes",
    recursive=True,
)

ERROR:custom_module:An error occurred while accessing SharePoint: {'code': 'itemNotFound', 'message': 'The resource could not be found.'}

Screenshot of site below

[image: image.png]

On Wed, Jan 31, 2024 at 9:21 AM arun_n @.***> wrote:

I am not aware of additional sites within SharePoint. The current implementation works for the parent site.


arun04cbe commented 7 months ago

@lilz12 Could you try giving the folder path as Reference/Codes?

lilz12 commented 7 months ago

ERROR:custom_module:An error occurred while accessing SharePoint: {'code': 'itemNotFound', 'message': 'The resource could not be found.'}


On Wed, Jan 31, 2024 at 9:33 AM arun_n @.***> wrote:

@lilz12 could you try giving Folder path as Reference/Codes
