Closed sjors101 closed 6 months ago
@sjors101 thanks for filing the enhancement!
At which data source are you looking? You might want to consider running an incremental sync on your content, which would update only the changes that occurred since the last content sync.
Mainly the Sharepoint server data source. I wrote some code that ingests the allow_access_control fields (I'm considering pushing it back to the community once it is mature enough). I had a look at incremental sync; however, Sharepoint on-prem does not seem to have a timestamp or other indicator for when the access control of an object changes.
I'd definitely be curious to see the changes you're making, even if they're still in-progress. My initial reaction is that this wouldn't be possible to do in most cases, because Full syncs and Access Control syncs are often fetching two disparate sets of data from the 3rd party. In order to make Access Control syncs update all the documents impacted by permissions, you'd be significantly increasing the scope of your Access Control sync, and then you're back in the place you started, where "full content sync are quite costly."
I'm inclined to agree with Dana, that the "right" solution to this problem is Incremental syncs. Which, as you say, are not in a perfect spot with the Sharepoint Server today. But that's where I'd direct investment to solve this problem.
However, if you've spotted an approach I'm not thinking of, I'd love to understand better. Don't hesitate to put up a Draft PR. :)
My project leans heavily on Sharepoint server as a data source, and I made quite a few changes to make sure the connector fits our needs. I will try to push some stuff your way soon :)
The main bottleneck is Sharepoint not tracking a modified timestamp when permissions are changed. It seems this can be enabled through an audit log, but that's not something I would like to use. I guess I will see if I can build something like a lightweight full sync to get just the permissions synced up. Thanks Sean and Dana for your responses.
Problem Description
Access control fields of objects change a lot, which requires us to crawl frequently to keep up with the latest changes. However, full content syncs are quite costly.
We would like to have the option, when we run the Access control sync, to also update the documents' access control fields in the main index search-<INDEX-NAME>. Currently, the framework is designed so that an Access control sync only ingests the DLS profile into the ACL index .search-acl-filter-<INDEX-NAME>.
Proposed Solution
It would be nice to have an option in the get_access_control function in the BaseDataSource to also ingest to the main index (search-<INDEX-NAME>) of a connector.
Alternatives
Add a variable to switch between two paths in the Full content sync: one for the full content, and one for just the access control fields.
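A rough sketch of what such a switch could look like, independent of the connector framework's real code. Everything here is hypothetical: the build_actions function, the acl_only flag, and the index name in the usage note are made up for illustration, and the field name _allow_access_control is an assumption based on the "allow_access_control" field mentioned in the thread. The dictionaries follow the action shape accepted by the elasticsearch-py bulk helpers, but nothing is sent to Elasticsearch.

```python
from typing import Any, Dict, Iterable, Iterator

# Assumed field name; the thread refers to it as "allow_access_control".
ACCESS_CONTROL_FIELD = "_allow_access_control"


def build_actions(
    docs: Iterable[Dict[str, Any]],
    index_name: str,
    acl_only: bool = False,
) -> Iterator[Dict[str, Any]]:
    """Yield bulk actions for a full content pass or an ACL-only pass.

    Hypothetical helper: not part of the connector framework.
    """
    for doc in docs:
        if acl_only:
            # Partial update: only the access control field is touched,
            # so the rest of the indexed document stays as it is.
            yield {
                "_op_type": "update",
                "_index": index_name,
                "_id": doc["_id"],
                "doc": {ACCESS_CONTROL_FIELD: list(doc.get(ACCESS_CONTROL_FIELD, []))},
            }
        else:
            # Full content path: index the whole document body.
            source = {k: v for k, v in doc.items() if k != "_id"}
            yield {
                "_op_type": "index",
                "_index": index_name,
                "_id": doc["_id"],
                "_source": source,
            }
```

With acl_only=True, a document such as {"_id": "1", "title": "Report", "_allow_access_control": ["user:a"]} yields a single update action carrying only the access control field. Note that the data source would still have to enumerate every object and fetch its permissions, so this only saves the cost of downloading and re-indexing content, not the crawl itself.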