elastic / connectors-ruby

Official Connector Clients for Elastic Elasticsearch, Enterprise Search, App Search and Workplace Search
https://www.elastic.co/guide/en/enterprise-search/master/index.html

[SharePoint] Only traverse groups that user has access to and handle GraphAPI request errors #544

Open izmaxxsun opened 1 year ago

izmaxxsun commented 1 year ago

Bug Description

As part of an enhancement to sync private Teams sites without having to wait for a propagation delay in Microsoft, there is logic which loops through sites that are associated with groups returned by the Graph API "groups" endpoint.

The "groups" endpoint lists every group in the organization, including groups that the connecting user cannot access and groups that have no associated SharePoint site. As a result, the sync encounters 403 and 404 responses and does not continue.
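One possible way to avoid the organization-wide listing is to ask Graph only for the groups the connecting account belongs to. The sketch below is not the connector's code: it assumes a delegated token (so `/me` resolves to the connecting account), and `member_groups_url` and `accessible_groups` are hypothetical helper names. The `transitiveMemberOf` path with the `microsoft.graph.group` cast is a documented Graph endpoint.

```ruby
require 'set'
require 'uri'

GRAPH_BASE = 'https://graph.microsoft.com/v1.0'

# Hypothetical helper: builds the Graph URL that lists only the groups the
# signed-in account is a direct or transitive member of.
def member_groups_url(select: %w[id])
  query = URI.encode_www_form('$select' => select.join(','))
  "#{GRAPH_BASE}/me/transitiveMemberOf/microsoft.graph.group?#{query}"
end

# Hypothetical helper: keeps only the groups whose id is in the membership list.
# all_groups is an array of hashes shaped like Graph group objects ({ 'id' => ... }).
def accessible_groups(all_groups, member_group_ids)
  allowed = member_group_ids.to_set
  all_groups.select { |group| allowed.include?(group['id']) }
end
```

Filtering by membership does not guarantee that each remaining group has a SharePoint site, so requests for a group's root site can still return 404 and need to be handled separately.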

To Reproduce

Steps to reproduce the behavior:

  1. Create groups that the connecting Azure service account does not have access to (the account selected in Step 3 of the setup).
  2. Follow the documentation to set up the SharePoint Online connector package.
  3. Perform a sync.
  4. Observe that SharePoint documents are not synced.

Expected behavior

The SharePoint connector should not request information from groups that the connecting account is not a direct or indirect member of. When a request returns an unsuccessful response (e.g. 403, 404), the sync should log it and continue.
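The "log and continue" behavior could look roughly like the sketch below. It is an illustration under assumptions, not the SDK's implementation: `GraphClientError` is a hypothetical stand-in for `ConnectorsSdk::Office365::CustomClient::ClientError` (whose real interface may differ), and `root_sites_for` is a hypothetical wrapper around whatever call fetches a group's root site.

```ruby
require 'logger'

# Hypothetical stand-in for the SDK's ClientError; it carries the HTTP status
# so callers can decide whether the failure is safe to skip.
class GraphClientError < StandardError
  attr_reader :status_code

  def initialize(status_code, endpoint)
    @status_code = status_code
    super("got a #{status_code} from #{endpoint}")
  end
end

# Statuses that mean "no access" or "no site" for a single group.
SKIPPABLE_STATUSES = [403, 404].freeze

# Calls the block once per group id and collects the results. Groups that
# fail with 403 or 404 are logged and skipped; any other error is re-raised.
def root_sites_for(group_ids, logger: Logger.new($stderr))
  group_ids.each_with_object([]) do |group_id, sites|
    begin
      sites << yield(group_id)
    rescue GraphClientError => e
      raise unless SKIPPABLE_STATUSES.include?(e.status_code)

      logger.warn("Skipping group #{group_id}: #{e.message}")
    end
  end
end
```

A caller would pass the per-group request as the block, for example `root_sites_for(ids) { |id| client.group_root_site(id) }`, where `client` is assumed to expose the `group_root_site` method seen in the stack trace below.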

Environment

Additional context

This is an example 404 error from the diagnostics bundle:

          "createdAt": "2023-02-16T17:03:30Z",
          "status": "error",
          "fatalException": {
            "friendly_message": "Source returned an unexpected error during synchronization.",

            "stack_trace": "/home/username/enterprise-search-8.2.0/lib/war/gems/gems/connectors_sdk-8.2.0.0/lib/connectors_sdk/office365/custom_client.rb:316:in `raise_any_errors\u0027: got a 404 from https://graph.microsoft.com/v1.0/groups/01973b3f-dc6b-415b-b0ca-8f5e3c02c63c/sites/root with query {:$select\u003d\u003e\"id\"}\n\tfrom /home/username/enterprise-search-8.2.0/lib/war/gems/gems/connectors_sdk-8.2.0.0/lib/connectors_sdk/office365/custom_client.rb:331:in `request\u0027\n\tfrom /home/username/enterprise-search-8.2.0/lib/war/gems/gems/connectors_sdk-8.2.0.0/lib/connectors_sdk/office365/custom_client.rb:326:in `request_json\u0027\n\tfrom /home/username/enterprise-search-8.2.0/lib/war/gems/gems/connectors_sdk-8.2.0.0/lib/connectors_sdk/office365/custom_client.rb:322:in `request_endpoint\u0027\n\tfrom /home/username/enterprise-search-8.2.0/lib/war/gems/gems/connectors_sdk-8.2.0.0/lib/connectors_sdk/office365/custom_client.rb:95:in `group_root_site\u0027\n\tfrom /home/username/enterprise-search-8.2.0/lib/war/gems/gems/connectors_sdk-8.2.0.0/lib/connectors_sdk/office365/custom_client.rb:236:in `block in recent_share_point_group_sites\u0027\n\tfrom org/jruby/RubyArray.java:2642:in `map\u0027\n\tfrom /home/username/enterprise-search-8.2.0/lib/war/gems/gems/connectors_sdk-8.2.0.0/lib/connectors_sdk/office365/custom_client.rb:236:in `recent_share_point_group_sites\u0027\n\tfrom /home/username/enterprise-search-8.2.0/lib/war/gems/gems/connectors_sdk-8.2.0.0/lib/connectors_sdk/office365/custom_client.rb:78:in `share_point_drives\u0027\n\tfrom /home/username/enterprise-search-8.2.0/lib/war/gems/gems/connectors_sdk-8.2.0.0/lib/connectors_sdk/share_point/extractor.rb:27:in `drives\u0027\n\tfrom /home/username/enterprise-search-8.2.0/lib/war/gems/gems/connectors_sdk-8.2.0.0/lib/connectors_sdk/office365/extractor.rb:120:in `drives_to_index\u0027\n\tfrom 
/home/username/enterprise-search-8.2.0/lib/war/gems/gems/connectors_sdk-8.2.0.0/lib/connectors_sdk/office365/extractor.rb:75:in `retrieve_latest_cursors\u0027\n\tfrom /home/username/enterprise-search-8.2.0/lib/war/connectors/lib/connectors/work/purge.class:60:in `complete_run!\u0027\n\tfrom /home/username/enterprise-search-8.2.0/lib/war/connectors/lib/connectors/work/purge.class:35:in `block in run\u0027\n\tfrom /home/username/enterprise-search-8.2.0/lib/war/connectors/lib/connectors/work/abstract_extractor_work.class:100:in `run_with_suspension\u0027\n\tfrom /home/username/enterprise-search-8.2.0/lib/war/connectors/lib/connectors/work/purge.class:7:in `run\u0027\n\tfrom /home/username/enterprise-search-8.2.0/lib/war/connectors/lib/connectors/work/abstract_extractor_work.class:138:in `execute\u0027\n\tfrom /home/username/enterprise-search-8.2.0/lib/war/connectors/lib/connectors/workers/extract_worker.class:7:in `block in run\u0027\n\tfrom /home/username/enterprise-search-8.2.0/lib/war/gems/gems/concurrent-ruby-1.1.9/lib/concurrent-ruby/concurrent/executor/safe_task_executor.rb:24:in `block in execute\u0027\n\tfrom /home/username/enterprise-search-8.2.0/lib/war/gems/gems/concurrent-ruby-1.1.9/lib/concurrent-ruby/concurrent/executor/safe_task_executor.rb:19:in `execute\u0027\n\tfrom /home/username/enterprise-search-8.2.0/lib/war/connectors/config/initializers/concurrent.class:18:in `block in realize\u0027\n\tfrom /home/username/enterprise-search-8.2.0/lib/war/gems/gems/concurrent-ruby-1.1.9/lib/concurrent-ruby/concurrent/executor/java_executor_service.rb:79:in `run\u0027\n",

            "id": "63ee62976786a6bbb45b0177",
            "message": "got a 404 from https://graph.microsoft.com/v1.0/groups/01973b3f-dc6b-415b-b0ca-8f5e3c02c63c/sites/root with query {:$select\u003d\u003e\"id\"}",
            "class": "ConnectorsSdk::Office365::CustomClient::ClientError"
          },
          "errorReason": "client_error",
          "completedAt": "2023-02-16T17:06:31Z",
          "durationSeconds": 181.0,
          "documentErrorCount": 0,
          "documentErrors": []