microsoft / PubSec-Info-Assistant

Information Assistant, built with Azure OpenAI Service, Industry Accelerator
MIT License

Sharepoint ingest missing some files under subfolder #770

Closed DevPaulLiu closed 3 days ago

DevPaulLiu commented 1 week ago

Bug Details

In the config file, I have defined "SharepointSites": [ { "url": "https://microsoft.sharepoint.com/teams/ExampleTeam", "folder": "/Shared Documents/General/InformationAssistantTestData"} ]

After I run the logic app, only one subfolder's files are uploaded; the files in the other two subfolders are missing. The PDF files under ClimateChange and DominicanRepublic were not uploaded.

See the container files.

Is this by design, or is only one subfolder supported?

DevPaulLiu commented 1 week ago

Looking at the logic app execution history, it appears that the Microsoft folder is the last value of UncrawledFolders, and this value is not reset under the root folder. It is only reset at the site folder level.

There are 6 items under the site, 4 of them folders, so the app adds those folders to UncrawledFolders and runs the loop five times, including the first pass for the root folder. However, UncrawledFolders does not seem to be reset during those 5 iterations. For each FileFolder.txt
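The expected behavior described above, where every visited folder contributes its own subfolders to the pending list, can be sketched as a queue-based crawl. This is only an illustration in Python, not the Logic App's actual workflow: `TREE`, `list_folder`, `crawl`, the `/root` path, and the file names are all hypothetical stand-ins, and only the subfolder names come from this report.

```python
from collections import deque

# Hypothetical in-memory stand-in for a SharePoint folder listing.
# Keys are folder paths; values are (subfolders, files) in that folder.
TREE = {
    "/root": (
        ["/root/ClimateChange", "/root/DominicanRepublic", "/root/Microsoft"],
        [],
    ),
    "/root/ClimateChange": ([], ["climate.pdf"]),
    "/root/DominicanRepublic": ([], ["dr.pdf"]),
    "/root/Microsoft": ([], ["ms.pdf"]),
}


def list_folder(path):
    """Return (subfolders, files) for a folder; hypothetical helper."""
    return TREE.get(path, ([], []))


def crawl(root):
    """Visit every folder under root and collect all file paths."""
    uncrawled = deque([root])  # plays the role of UncrawledFolders
    found = []
    while uncrawled:
        folder = uncrawled.popleft()
        subfolders, files = list_folder(folder)
        # Queue this folder's subfolders so none of them is skipped.
        uncrawled.extend(subfolders)
        found.extend(f"{folder}/{name}" for name in files)
    return found
```

Calling `crawl("/root")` on this sample tree collects the files from all three subfolders. If the pending list were refreshed only once at the top level, some subfolders could be missed, which would be consistent with the symptom reported here.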

KronemeyerJoshua commented 1 week ago

Hey DevPaulLiu!

We appreciate you bringing this to our attention. It seems in tweaking our logic app, some variables may have gotten mixed up. Sorry about that!

A PR has been created to address these issues in our next update, which I have attached to this issue.

Please let me know if you need any more assistance.

DevPaulLiu commented 5 days ago

Thanks for the quick turnaround!