elastic / connectors

Official Elastic connectors for third-party data sources
https://www.elastic.co/guide/en/elasticsearch/reference/master/es-connectors.html

Outlook Connector Crashes when an SMTP address does not have a mailbox associated #2931

Open josephschultz-expedient opened 2 weeks ago

josephschultz-expedient commented 2 weeks ago

Bug Description

While deploying the Outlook Connector for Exchange Online, I found that the process crashes about halfway through the sync operation with this error:

TransportError: No valid version headers found in response (ErrorNonExistentMailbox('The SMTP address has no mailbox associated with it.'))

This is a small dev environment with only about 15 users, and the only SMTP objects without mailboxes would be Microsoft 365 Group email addresses.

This environment was configured as per the Elastic documentation. The only caveat is that I had to add the Microsoft Graph User.Read.All permission in order to retrieve the user list. Without it, the request returned a 403 error:

aiohttp.client_exceptions.ClientResponseError: 403, message='Forbidden', url='https://graph.microsoft.com/v1.0/users?$top=999'

To Reproduce

Steps to reproduce the behavior:

  1. Verify that the Exchange Online environment has at least one Microsoft 365 Group Email created
  2. When creating the new App in Azure, assign both the Exchange Online full_access_as_app and Microsoft Graph User.Read.All permissions
  3. Deploy as per the documentation

Expected behavior

Unless there is a different approach to retrieving the SMTP or user listing, the ideal behavior would be to skip SMTP addresses that have no mailbox associated with them.
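
To illustrate the skipping behavior, here is a minimal sketch. It is not the connector's actual code: the two exception classes are local stand-ins for `exchangelib.errors.ErrorNonExistentMailbox` and `exchangelib.errors.TransportError`, and `open_account`, `is_missing_mailbox`, `get_user_accounts`, and `collect` are hypothetical helpers written only for this example. It assumes the wrapping shown in the traceback, where the `TransportError` is raised while the `ErrorNonExistentMailbox` is being handled, so the original error is reachable through the exception chain.

```python
import asyncio


class ErrorNonExistentMailbox(Exception):
    """Local stand-in for exchangelib.errors.ErrorNonExistentMailbox."""


class TransportError(Exception):
    """Local stand-in for exchangelib.errors.TransportError."""


def open_account(address, mailboxes):
    """Hypothetical stand-in for constructing an exchangelib Account."""
    try:
        if address not in mailboxes:
            raise ErrorNonExistentMailbox(
                "The SMTP address has no mailbox associated with it."
            )
    except ErrorNonExistentMailbox as e:
        # Mirrors the wrapping seen in the traceback: a TransportError raised
        # while the original error is being handled (implicit chaining).
        raise TransportError(f"No valid version headers found in response ({e!r})")
    return {"address": address}


def is_missing_mailbox(exc):
    """Return True if exc is, or is chained from, ErrorNonExistentMailbox."""
    seen = set()
    while exc is not None and id(exc) not in seen:
        if isinstance(exc, ErrorNonExistentMailbox):
            return True
        seen.add(id(exc))
        exc = exc.__cause__ or exc.__context__
    return False


async def get_user_accounts(addresses, mailboxes):
    """Yield accounts, skipping addresses that have no mailbox."""
    for address in addresses:
        try:
            account = open_account(address, mailboxes)
        except TransportError as exc:
            if is_missing_mailbox(exc):
                print(f"Skipping {address}: no mailbox associated")
                continue
            raise  # Any other transport failure should still surface.
        yield account


async def collect(addresses, mailboxes):
    """Gather all yielded accounts into a list (helper for the example)."""
    return [account async for account in get_user_accounts(addresses, mailboxes)]
```

With something like this, a group address in the user list would be logged and skipped rather than ending the whole sync, while unrelated transport errors would still fail the job.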

Environment

Error Logs

[FMWK][18:26:01][ERROR] [Connector id: VOzQ1JIB5RYrKcWA6hjz, index name: client-outlook-connector, Sync job id: zdas3pIBsfHVd-ovtKc9] Extractor failed with an error: No valid version headers found in response (ErrorNonExistentMailbox('The SMTP address has no mailbox associated with it.'))
[FMWK][18:26:01][CRITICAL] [Connector id: VOzQ1JIB5RYrKcWA6hjz, index name: client-outlook-connector, Sync job id: zdas3pIBsfHVd-ovtKc9] Document extractor failed
Traceback (most recent call last):
  File "/app/lib/python3.10/site-packages/exchangelib/version.py", line 202, in guess
    list(ConvertId(protocol=protocol).call([AlternateId(id="DUMMY", format=EWS_ID, mailbox="DUMMY")], ENTRY_ID))
  File "/app/lib/python3.10/site-packages/exchangelib/services/common.py", line 216, in _elems_to_objs
    for elem in elems:
  File "/app/lib/python3.10/site-packages/exchangelib/services/common.py", line 278, in _chunked_get_elements
    yield from self._get_elements(payload=payload_func(chunk, **kwargs))
  File "/app/lib/python3.10/site-packages/exchangelib/services/common.py", line 299, in _get_elements
    yield from self._response_generator(payload=payload)
  File "/app/lib/python3.10/site-packages/exchangelib/services/common.py", line 262, in _response_generator
    response = self._get_response_xml(payload=payload)
  File "/app/lib/python3.10/site-packages/exchangelib/services/common.py", line 408, in _get_response_xml
    return self._get_soap_messages(body=body, **parse_opts)
  File "/app/lib/python3.10/site-packages/exchangelib/services/common.py", line 496, in _get_soap_messages
    self._raise_soap_errors(fault=fault)  # Will throw SOAPError or custom EWS error
  File "/app/lib/python3.10/site-packages/exchangelib/services/common.py", line 536, in _raise_soap_errors
    raise vars(errors)[code](msg)
exchangelib.errors.ErrorNonExistentMailbox: The SMTP address has no mailbox associated with it.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/connectors/es/sink.py", line 492, in run
    await self.get_docs(generator, skip_unchanged_documents=True)
  File "/app/connectors/es/sink.py", line 541, in get_docs
    async for count, doc in aenumerate(generator):
  File "/app/connectors/utils.py", line 856, in aenumerate
    async for elem in asequence:
  File "/app/connectors/logger.py", line 247, in __anext__
    return await self.gen.__anext__()
  File "/app/connectors/es/sink.py", line 523, in _decorate_with_metrics_span
    async for doc in generator:
  File "/app/connectors/sync_job_runner.py", line 454, in prepare_docs
    async for doc, lazy_download, operation in self.generator():
  File "/app/connectors/sync_job_runner.py", line 505, in generator
    async for doc, lazy_download in self.data_provider.get_docs(
  File "/app/connectors/sources/outlook.py", line 1057, in get_docs
    async for account in self.client._get_user_instance.get_user_accounts():
  File "/app/connectors/sources/outlook.py", line 450, in get_user_accounts
    user_account = Account(
  File "/app/lib/python3.10/site-packages/exchangelib/account.py", line 205, in __init__
    self.version = self.protocol.version.copy()
  File "/app/lib/python3.10/site-packages/exchangelib/protocol.py", line 480, in version
    self.config.version = Version.guess(self, api_version_hint=self.api_version_hint)
  File "/app/lib/python3.10/site-packages/exchangelib/version.py", line 206, in guess
    raise TransportError(f"No valid version headers found in response ({e!r})")
exchangelib.errors.TransportError: No valid version headers found in response (ErrorNonExistentMailbox('The SMTP address has no mailbox associated with it.'))

seanstory commented 11 hours ago

Hi, @josephschultz-expedient! Thanks for filing.

I think https://github.com/elastic/connectors/pull/2967 will fix this. Can you give that branch/diff a try, and see if it resolves your issue?