koltyakov / gosip

⚡️ SharePoint SDK for Go
https://go.spflow.com
MIT License

Question: How to filter or limit Files/Folders view? #52

Closed gkoh closed 2 years ago

gkoh commented 2 years ago

Hello, I followed the excellent explanation here: https://github.com/koltyakov/gosip/issues/36#issuecomment-742368632

and am now running into the 5000 item limit.

From what I could find online, adding a Filter() call should restrict the returned files. However, this does not work. Even just adding a Top() call still results in hitting the limit.

I've tried the following:

files, err := folder.Files().
    Select("*").
    Filter("TimeLastModified gt datetime'<some_date>'").
    Get()

and

files, err := folder.Files().
    Select("*").
    Top(100).
    Get()

both fail by hitting the 5000 view limit.

Is what I'm doing actually possible? Should I really be using the Items API instead?

koltyakov commented 2 years ago

Hi @gkoh,

Thanks for using the library.

When it comes to SharePoint throttling, there is no universal solution that fits every architecture.

I'd suggest getting the document library's items with a filter on an indexed field, so that the result set stays below the threshold. Two conditions matter: the filter must be based on indexed field(s), and the number of matching items must be less than 5000. Unfortunately, .Top() does not help with throttling: the API applies it after the filter, so the exception is still thrown.

Filtering files directly should also work, but the same rule applies to the filter.

Another workaround is the Search API, if the data is not "online" data that must be shown instantly in an app.

For scheduled processing, it could be an iterator over the items without a filter or any sorting. Such queries work no matter the size.

If it's a sync process based on new items or delta changes, the Changes API might be handy. Please check this sample if you need sync.

koltyakov commented 2 years ago

It's also pretty common to end up re-architecting how files are structured and stored across a Library, Libraries, or Sites to suit the process and request expectations.

gkoh commented 2 years ago

Hello @koltyakov,

Thank you for the library and the quick response!

Also, thank you for the detailed response; it helps my understanding, as I have never interacted with SharePoint before. In particular, I was attempting to filter files based on TimeLastModified, so I presume this field is either not indexed or the filter is applied after the list limit.

I was eventually able to limit the results using the list items API and the Modified column.

I am in fact building a sync application, but one that syncs both into and from SharePoint, so your spsync library solves half the problem. I will look into it further.