ecederstrand / exchangelib

Python client for Microsoft Exchange Web Services (EWS)
BSD 2-Clause "Simplified" License

Possible to access Personal Archive on Exchange 2010? #458

Closed slash5k1 closed 6 years ago

slash5k1 commented 6 years ago

Howdy,

My company is running Exchange2010_SP2 and has implemented personal archiving. I would like to access the emails in the personal archive, but I can't find the folder when I run: account.root.walk()

When I look at folders.py, I can see:

class ArchiveDeletedItems(WellknownFolder):
    DISTINGUISHED_FOLDER_ID = 'archivedeleteditems'
    supported_from = EXCHANGE_2010_SP1

class ArchiveInbox(WellknownFolder):
    DISTINGUISHED_FOLDER_ID = 'archiveinbox'
    supported_from = EXCHANGE_2013_SP1

class ArchiveMsgFolderRoot(WellknownFolder):
    DISTINGUISHED_FOLDER_ID = 'archivemsgfolderroot'
    supported_from = EXCHANGE_2010_SP1

class ArchiveRecoverableItemsDeletions(WellknownFolder):
    DISTINGUISHED_FOLDER_ID = 'archiverecoverableitemsdeletions'
    supported_from = EXCHANGE_2010_SP1

class ArchiveRecoverableItemsPurges(WellknownFolder):
    DISTINGUISHED_FOLDER_ID = 'archiverecoverableitemspurges'
    supported_from = EXCHANGE_2010_SP1

class ArchiveRecoverableItemsRoot(WellknownFolder):
    DISTINGUISHED_FOLDER_ID = 'archiverecoverableitemsroot'
    supported_from = EXCHANGE_2010_SP1

class ArchiveRecoverableItemsVersions(WellknownFolder):
    DISTINGUISHED_FOLDER_ID = 'archiverecoverableitemsversions'
    supported_from = EXCHANGE_2010_SP1

class ArchiveRoot(WellknownFolder):
    DISTINGUISHED_FOLDER_ID = 'archiveroot'
    supported_from = EXCHANGE_2010_SP1

So I thought I would try

mailbox = account.archive_inbox
query = mailbox.all()

but I got the following exception:

Exception in _get_elements: Traceback (most recent call last):
  File "/Users/canders/Code/python3-testing/lib/python3.6/site-packages/cached_property.py", line 69, in __get__
    return obj_dict[name]
KeyError: 'archive_inbox'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/canders/Code/python3-testing/lib/python3.6/site-packages/exchangelib/services.py", line 88, in _get_elements
    response = self._get_response_xml(payload=payload)
  File "/Users/canders/Code/python3-testing/lib/python3.6/site-packages/exchangelib/services.py", line 189, in _get_response_xml
    raise rme
  File "/Users/canders/Code/python3-testing/lib/python3.6/site-packages/exchangelib/services.py", line 171, in _get_response_xml
    res = self._get_soap_payload(soap_response=soap_response_payload)
  File "/Users/canders/Code/python3-testing/lib/python3.6/site-packages/exchangelib/services.py", line 227, in _get_soap_payload
    cls._raise_soap_errors(fault=fault)  # Will throw SOAPError or custom EWS error
  File "/Users/canders/Code/python3-testing/lib/python3.6/site-packages/exchangelib/services.py", line 261, in _raise_soap_errors
    raise vars(errors)[code](msg)
exchangelib.errors.ErrorSchemaValidation: The request failed schema validation.

Traceback (most recent call last):
  File "/Users/canders/Code/python3-testing/lib/python3.6/site-packages/cached_property.py", line 69, in __get__
    return obj_dict[name]
KeyError: 'archive_inbox'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/canders/Code/prod/email_walk.py", line 105, in <module>
    mailbox = account.archive_inbox
  File "/Users/canders/Code/python3-testing/lib/python3.6/site-packages/cached_property.py", line 73, in __get__
    return obj_dict.setdefault(name, self.func(obj))
  File "/Users/canders/Code/python3-testing/lib/python3.6/site-packages/exchangelib/account.py", line 130, in archive_inbox
    return self.root.get_default_folder(ArchiveInbox)
  File "/Users/canders/Code/python3-testing/lib/python3.6/site-packages/exchangelib/folders.py", line 965, in get_default_folder
    for f in self._folders_map.values():
  File "/Users/canders/Code/python3-testing/lib/python3.6/site-packages/exchangelib/folders.py", line 928, in _folders_map
    for f in FolderCollection(account=self.account, folders=distinguished_folders).get_folders():
  File "/Users/canders/Code/python3-testing/lib/python3.6/site-packages/exchangelib/services.py", line 1053, in call
    shape=shape,
  File "/Users/canders/Code/python3-testing/lib/python3.6/site-packages/exchangelib/services.py", line 88, in _get_elements
    response = self._get_response_xml(payload=payload)
  File "/Users/canders/Code/python3-testing/lib/python3.6/site-packages/exchangelib/services.py", line 189, in _get_response_xml
    raise rme
  File "/Users/canders/Code/python3-testing/lib/python3.6/site-packages/exchangelib/services.py", line 171, in _get_response_xml
    res = self._get_soap_payload(soap_response=soap_response_payload)
  File "/Users/canders/Code/python3-testing/lib/python3.6/site-packages/exchangelib/services.py", line 227, in _get_soap_payload
    cls._raise_soap_errors(fault=fault)  # Will throw SOAPError or custom EWS error
  File "/Users/canders/Code/python3-testing/lib/python3.6/site-packages/exchangelib/services.py", line 261, in _raise_soap_errors
    raise vars(errors)[code](msg)
exchangelib.errors.ErrorSchemaValidation: The request failed schema validation.

What am I missing? :)

Thank you!

ecederstrand commented 6 years ago

Which version of exchangelib is this?

ErrorSchemaValidation is an exception thrown by the server. Apparently, it does not accept the XML request we send.

The first step in debugging this is to capture the request and response XML. See https://github.com/ecederstrand/exchangelib#troubleshooting on how to do that.

ecederstrand commented 6 years ago

@slash5k1 Did you find a solution to this?

slash5k1 commented 6 years ago

Hi @ecederstrand, I didn't see the notification that you had responded.

I am using

version = '1.11.4'

Let me try the troubleshooting steps and report what I find. I can access the archive from OWA, so I assumed it would also be accessible via EWS?

Cheers :)

ecederstrand commented 6 years ago

You are supposed to be able to do that, but you're assuming exchangelib doesn't have bugs :-)

ecederstrand commented 6 years ago

@slash5k1 Did you find a solution to this?

slash5k1 commented 6 years ago

Hi @ecederstrand, I haven't been able to find a solution, but I am also not sure our infrastructure is 100% right, due to a mixture of 2010 (SP2) frontend servers and 2013 backend servers.

What I have been able to do is make use of

mailbox = account.archive_root.walk()

This has pulled out a tiny subset of messages (70 out of 10K) from the inbox of my archive, but I can't see any folders under archive_root, and archive_inbox returns the error from the post above.

So I'm not sure what's going on... is there something I should be aware of with the walk method?

Making use of: logging.basicConfig(level=logging.DEBUG, handlers=[PrettyXmlHandler()])

I can see:

DEBUG:exchangelib.folders:Testing default <class 'exchangelib.folders.ArchiveRoot'> folder with GetFolder
DEBUG:exchangelib.services:Getting folder ArchiveDeletedItems (archivedeleteditems)
DEBUG:exchangelib.services:Getting folder ArchiveMsgFolderRoot (archivemsgfolderroot)
DEBUG:exchangelib.services:Getting folder ArchiveRecoverableItemsDeletions (archiverecoverableitemsdeletions)
DEBUG:exchangelib.services:Getting folder ArchiveRecoverableItemsPurges (archiverecoverableitemspurges)
DEBUG:exchangelib.services:Getting folder ArchiveRecoverableItemsRoot (archiverecoverableitemsroot)
DEBUG:exchangelib.services:Getting folder ArchiveRecoverableItemsVersions (archiverecoverableitemsversions)
DEBUG:exchangelib.services:Getting folder ArchiveRoot (archiveroot)

🤷‍♂️

slash5k1 commented 6 years ago

Oh and here is a debug from when I try mailbox = account.archive_inbox

Request headers:
{'User-Agent': 'python-requests/2.18.4', 'Accept-Encoding': 'compress, gzip', 'Accept': '*/*',
 'Connection': 'Keep-Alive', 'Content-Type': 'text/xml; charset=utf-8',
 'Cookie': 'exchangecookie=f38c9553ffec4935b46f00531065579c', 'Content-Length': '1149'}
Response headers:
{'Cache-Control': 'private', 'Transfer-Encoding': 'chunked', 'Content-Type': 'text/xml; charset=utf-8',
 'Server': 'Microsoft-IIS/7.5', 'X-AspNet-Version': '2.0.50727', 'X-Powered-By': 'ASP.NET',
 'Date': 'Thu, 06 Sep 2018 07:47:38 GMT'}
Request data:
<?xml version="1.0" encoding="utf-8"?>
<s:Envelope xmlns:m="http://schemas.microsoft.com/exchange/services/2006/messages"
            xmlns:s="http://schemas.xmlsoap.org/soap/envelope/"
            xmlns:t="http://schemas.microsoft.com/exchange/services/2006/types">
    <s:Header>
        <t:RequestServerVersion Version="Exchange2010_SP2"/>
        <t:TimeZoneContext>
            <t:TimeZoneDefinition Id="Cen. Australia Standard Time"/>
        </t:TimeZoneContext>
    </s:Header>
    <s:Body>
        <m:GetFolder>
            <m:FolderShape>
                <t:BaseShape>IdOnly</t:BaseShape>
                <t:AdditionalProperties>
                    <t:FieldURI FieldURI="folder:ChildFolderCount"/>
                    <t:FieldURI FieldURI="folder:EffectiveRights"/>
                    <t:FieldURI FieldURI="folder:FolderClass"/>
                    <t:FieldURI FieldURI="folder:DisplayName"/>
                    <t:FieldURI FieldURI="folder:ParentFolderId"/>
                    <t:FieldURI FieldURI="folder:TotalCount"/>
                    <t:FieldURI FieldURI="folder:UnreadCount"/>
                </t:AdditionalProperties>
            </m:FolderShape>
            <m:FolderIds>
                <t:DistinguishedFolderId Id="archiveinbox">
                    <t:Mailbox>
                        <t:EmailAddress>chris.anders@xxxxx.xxxx.xxxx</t:EmailAddress>
                        <t:RoutingType>SMTP</t:RoutingType>
                        <t:MailboxType>Mailbox</t:MailboxType>
                    </t:Mailbox>
                </t:DistinguishedFolderId>
            </m:FolderIds>
        </m:GetFolder>
    </s:Body>
</s:Envelope>
Response data:
<?xml version="1.0" encoding="utf-8"?>
<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
    <s:Body>
        <s:Fault>
            <faultcode xmlns:a="http://schemas.microsoft.com/exchange/services/2006/types">a:ErrorSchemaValidation
            </faultcode>
            <faultstring xml:lang="en-US">The request failed schema validation: The 'Id' attribute is invalid - The
                value 'archiveinbox' is invalid according to its datatype
                'http://schemas.microsoft.com/exchange/services/2006/types:DistinguishedFolderIdNameType' - The
                Enumeration constraint failed.
            </faultstring>
            <detail>
                <e:ResponseCode xmlns:e="http://schemas.microsoft.com/exchange/services/2006/errors">
                    ErrorSchemaValidation
                </e:ResponseCode>
                <e:Message xmlns:e="http://schemas.microsoft.com/exchange/services/2006/errors">The request failed
                    schema validation.
                </e:Message>
                <t:MessageXml xmlns:t="http://schemas.microsoft.com/exchange/services/2006/types">
                    <t:LineNumber>1</t:LineNumber>
                    <t:LinePosition>904</t:LinePosition>
                    <t:Violation>The 'Id' attribute is invalid - The value 'archiveinbox' is invalid according to
                        its datatype
                        'http://schemas.microsoft.com/exchange/services/2006/types:DistinguishedFolderIdNameType' -
                        The Enumeration constraint failed.
                    </t:Violation>
                </t:MessageXml>
            </detail>
        </s:Fault>
    </s:Body>
</s:Envelope>

ecederstrand commented 6 years ago

I formatted the response XML to make it readable. It reveals that your Exchange version does not support the archiveinbox distinguished folder. That is expected: the server in question is Exchange2010_SP2, but ArchiveInbox was introduced in Exchange 2013 SP1. exchangelib even knows this, because the class is marked with supported_from = EXCHANGE_2013_SP1 (see your initial post).

So the problem here is that you're calling account.archive_inbox on an account handled by a server that doesn't support this folder. We may want to provide a warning up-front in this situation.
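
One way such an up-front check could look is sketched below. This is only an illustration of the idea, not how exchangelib implements it: the `VERSION_ORDER` list, the `SUPPORTED_FROM` table and both helper functions are hypothetical names, and the table only mirrors a few of the `supported_from` values quoted in the first post.

```python
import warnings

# Assumed ordering of EWS schema version strings, oldest first (illustrative).
VERSION_ORDER = [
    "Exchange2007",
    "Exchange2007_SP1",
    "Exchange2010",
    "Exchange2010_SP1",
    "Exchange2010_SP2",
    "Exchange2013",
    "Exchange2013_SP1",
]

# Hypothetical lookup table mirroring supported_from values from the first post.
SUPPORTED_FROM = {
    "archiveroot": "Exchange2010_SP1",
    "archivemsgfolderroot": "Exchange2010_SP1",
    "archivedeleteditems": "Exchange2010_SP1",
    "archiveinbox": "Exchange2013_SP1",
}


def is_supported(folder_id, server_version):
    """Return True if server_version is at or after the folder's supported_from."""
    required = SUPPORTED_FROM[folder_id]
    return VERSION_ORDER.index(server_version) >= VERSION_ORDER.index(required)


def warn_if_unsupported(folder_id, server_version):
    """Warn locally instead of sending a request the server would reject."""
    if is_supported(folder_id, server_version):
        return True
    warnings.warn(
        f"{folder_id!r} requires {SUPPORTED_FROM[folder_id]}, "
        f"but the server reports {server_version}"
    )
    return False
```

With these assumptions, `warn_if_unsupported("archiveinbox", "Exchange2010_SP2")` warns and returns `False`, while the same call for `"archiveroot"` returns `True`.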

Also, it surprises me that the ErrorSchemaValidation exception instance contained only "The request failed schema validation" and not the full error message, which would have revealed the issue much earlier.
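
Until the exception carries the full text, the detail can be pulled out of the captured response by hand. The snippet below is a rough sketch using only the standard library; `fault_details` is a hypothetical helper (not part of exchangelib), and `FAULT_XML` is a shortened copy of the fault posted earlier in this thread.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
TYPES_NS = "http://schemas.microsoft.com/exchange/services/2006/types"

# Shortened copy of the SOAP fault from the debug output earlier in the thread.
FAULT_XML = """<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
  <s:Body>
    <s:Fault>
      <faultcode>a:ErrorSchemaValidation</faultcode>
      <faultstring xml:lang="en-US">The request failed schema validation: The 'Id'
        attribute is invalid - The value 'archiveinbox' is invalid.</faultstring>
      <detail>
        <t:MessageXml xmlns:t="http://schemas.microsoft.com/exchange/services/2006/types">
          <t:Violation>The 'Id' attribute is invalid - The value 'archiveinbox'
            is invalid according to its datatype.</t:Violation>
        </t:MessageXml>
      </detail>
    </s:Fault>
  </s:Body>
</s:Envelope>"""


def _clean(element):
    """Collapse whitespace in an element's text, or return None if absent."""
    if element is None or not element.text:
        return None
    return " ".join(element.text.split())


def fault_details(xml_text):
    """Return the faultstring and schema violation of a SOAP fault, if any."""
    root = ET.fromstring(xml_text)
    fault = root.find(f".//{{{SOAP_NS}}}Fault")
    if fault is None:
        return None
    return {
        "faultstring": _clean(fault.find("faultstring")),
        "violation": _clean(fault.find(f".//{{{TYPES_NS}}}Violation")),
    }
```

Calling `fault_details(FAULT_XML)` returns a dict whose `violation` entry names the rejected `archiveinbox` value, which is the part the short exception message leaves out.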

slash5k1 commented 6 years ago

Hmmm...

Well, what's interesting for me is that if I use:

mailbox = account.archive_root

I can see 1 message, which looks to be the "archive policy":

18/09/2017 3:59:23 AM Exception: Microsoft.Exchange.InfoWorker.Common.IWTransientException: Unable to find a matching retention policy tag in Active Directory for tag 'CN=XXXXXXXXX Personal - 60 days - Move to Archive,CN=Retention Policy Tag Container,CN=XXXXXXXXXA,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=XXXXXXXXXAU,DC=XXXXXXXXX,DC=COM,DC=AU'. Please recycle the MSExchangeMailboxAssistants service or wait 24 hrs for the cache to be refreshed.
   at Microsoft.Exchange.InfoWorker.Common.ELC.AdTagReader.GetTagsInPolicy(MailboxSession session, ADUser aduser, Dictionary`2 allAdTags)
   at Microsoft.Exchange.MailboxAssistants.Assistants.ELC.ElcUserTagInformation.GetAdData()
   at Microsoft.Exchange.MailboxAssistants.Assistants.ELC.ElcUserTagInformation.Build()
   at Microsoft.Exchange.MailboxAssistants.Assistants.ELC.SysCleanupSubAssistant.BuildMailboxData(MailboxSession mailboxSession)
   at Microsoft.Exchange.MailboxAssistants.Assistants.ELC.SysCleanupSubAssistant.Invoke(MailboxSession mailboxSession, MailboxDataForTags& mailboxDataForTags, ElcParameters parameters)
   at Microsoft.Exchange.MailboxAssistants.Assistants.ELC.ELCAssistant.<>c__DisplayClass6.<InvokeInternal>b__0()
   at Microsoft.Exchange.Common.IL.ILUtil.DoTryFilterCatch(TryDelegate tryDelegate, FilterDelegate filterDelegate, CatchDelegate catchDelegate)
12/02/2015 1:56:46 PM Exception: Microsoft.Exchange.InfoWorker.Common.IWTransientException: Unable to find a matching retention policy tag in Active Directory for tag 'CN=XXXXXXXXX Personal - 90 days - Move to Archive,CN=Retention Policy Tag Container,CN=XXXXXXXXXA,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=XXXXXXXXXAU,DC=XXXXXXXXX,DC=COM,DC=AU'. Please recycle the MSExchangeMailboxAssistants service or wait 24 hrs for the cache to be refreshed.
   at Microsoft.Exchange.InfoWorker.Common.ELC.AdTagReader.GetTagsInPolicy(MailboxSession session, ADUser aduser, Dictionary`2 allAdTags)
   at Microsoft.Exchange.MailboxAssistants.Assistants.ELC.ElcUserTagInformation.GetAdData()
   at Microsoft.Exchange.MailboxAssistants.Assistants.ELC.ElcUserTagInformation.Build()
   at Microsoft.Exchange.MailboxAssistants.Assistants.ELC.SysCleanupSubAssistant.BuildMailboxData(MailboxSession mailboxSession)
   at Microsoft.Exchange.MailboxAssistants.Assistants.ELC.SysCleanupSubAssistant.Invoke(MailboxSession mailboxSession, MailboxDataForTags& mailboxDataForTags, ElcParameters parameters)
   at Microsoft.Exchange.MailboxAssistants.Assistants.ELC.ELCAssistant.<>c__DisplayClass6.<InvokeInternal>b__0()
   at Microsoft.Exchange.Common.IL.ILUtil.DoTryFilterCatch(TryDelegate tryDelegate, FilterDelegate filterDelegate, CatchDelegate catchDelegate)
9/02/2015 1:57:20 PM Exception: Microsoft.Exchange.InfoWorker.Common.IWTransientException: Unable to find a matching retention policy tag in Active Directory for tag 'CN=XXXXXXXXX Personal - 180 days - Move to Archive,CN=Retention Policy Tag Container,CN=XXXXXXXXXA,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=XXXXXXXXXAU,DC=XXXXXXXXX,DC=COM,DC=AU'. Please recycle the MSExchangeMailboxAssistants service or wait 24 hrs for the cache to be refreshed.
   at Microsoft.Exchange.InfoWorker.Common.ELC.AdTagReader.GetTagsInPolicy(MailboxSession session, ADUser aduser, Dictionary`2 allAdTags)
   at Microsoft.Exchange.MailboxAssistants.Assistants.ELC.ElcUserTagInformation.GetAdData()
   at Microsoft.Exchange.MailboxAssistants.Assistants.ELC.ElcUserTagInformation.Build()
   at Microsoft.Exchange.MailboxAssistants.Assistants.ELC.SysCleanupSubAssistant.BuildMailboxData(MailboxSession mailboxSession)
   at Microsoft.Exchange.MailboxAssistants.Assistants.ELC.SysCleanupSubAssistant.Invoke(MailboxSession mailboxSession, MailboxDataForTags& mailboxDataForTags, ElcParameters parameters)
   at Microsoft.Exchange.MailboxAssistants.Assistants.ELC.ELCAssistant.<>c__DisplayClass6.<InvokeInternal>b__0()
   at Microsoft.Exchange.Common.IL.ILUtil.DoTryFilterCatch(TryDelegate tryDelegate, FilterDelegate filterDelegate, CatchDelegate catchDelegate)

But then if I use your walk method on the archive_root object

mailbox = account.archive_root.walk()

I can retrieve a small subset of the emails in the inbox of my online archive... which I thought was an awesome workaround, if only I could get it to pull out the 15K emails I have sitting there heh 😢

Happy to close this off, unless there is more to the walk method that may help pull out these pesky emails! 😄

Thankyou!

ecederstrand commented 6 years ago

It seems this may be related to the finding in #396. Can you test the workaround there for archive_root and see if it works for you?

slash5k1 commented 6 years ago

Hi @ecederstrand,

Very promising results!

When I run:

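from exchangelib.folders import FolderCollection

# List only the direct children of the archive root
archive = account.archive_root
coll = FolderCollection(account=account, folders=[archive])
for f in coll.find_folders(depth="Shallow"):
    print(f.name, f.total_count)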

I see:

AllItems 75870
Calendar 0
Common Views 0
Contacts 0
Conversation Action Settings 0
Deferred Action 0
Drafts 0
Finder 0
Freebusy Data 0
Inbox 0
Journal 0
Junk E-Mail 0
MailboxMoveHistory 2
Notes 0
Outbox 0
Recoverable Items 0
Schedule 0
Sent Items 0
Sharing 0
Shortcuts 0
Spooler Queue 0
System 0
Tasks 0
To-Do Search 379
Top of Information Store 0
Transport Queue 0
Views 0

When I then query "AllItems" which has 75870 items... with:

query = []  # stays empty if no AllItems folder is found
for f in coll.find_folders(depth="Shallow"):
    if f.name == 'AllItems':
        # The 20 oldest items by received time
        query = f.all().order_by('datetime_received')[:20]
        break

for item in query:
    print(item.datetime_received, item.subject.encode())

I can now see both Inbox and Sent Items emails from my personal archive:

2012-10-11 09:08:37+00:00 b'New Email Address' <-- Can see this in Inbox
2012-10-11 09:09:25+00:00 b'RE: New Email Address' <-- Can see this in Sent

I'm not entirely sure what this means... but it looks promising 🤣

slash5k1 commented 6 years ago

I just tried removing the depth value, and I can now see a second folder each for Inbox and Sent Items:

i.e.:

archive = account.archive_root
coll = FolderCollection(account=account, folders=[archive])
for f in coll.find_folders():
    print(type(f), f.name, f.total_count)

returns (I have trimmed out the list to only show Inbox and Sent Items):

<class 'exchangelib.folders.AllItems'> AllItems 75870
<class 'exchangelib.folders.Messages'> Inbox 0
<class 'exchangelib.folders.Messages'> Sent Items 0
<class 'exchangelib.folders.Messages'> Inbox 56625
<class 'exchangelib.folders.Messages'> Sent Items 18411

ecederstrand commented 6 years ago

Thanks for the additional tests! If you print f.folder_id you will be able to see if the Inbox folders are in fact different folders, and if they are different from account.inbox.folder_id.
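
A quick way to run that comparison is to group the IDs by folder name. The sketch below is self-contained and uses a stand-in `Folder` tuple with made-up placeholder IDs; `group_ids_by_name` and `distinct_duplicates` are hypothetical helpers, and the only thing assumed about real folder objects is that they expose `name` and `folder_id`.

```python
from collections import defaultdict, namedtuple

# Stand-in for a folder object; only the two attributes used here are modelled.
Folder = namedtuple("Folder", ["name", "folder_id"])

# Placeholder data shaped like the listing above (the IDs are invented).
SAMPLE = [
    Folder("AllItems", "id-allitems"),
    Folder("Inbox", "id-inbox-a"),
    Folder("Sent Items", "id-sent-a"),
    Folder("Inbox", "id-inbox-b"),
    Folder("Sent Items", "id-sent-b"),
]


def group_ids_by_name(folders):
    """Map each folder name to the set of folder IDs seen under that name."""
    ids = defaultdict(set)
    for folder in folders:
        ids[folder.name].add(folder.folder_id)
    return dict(ids)


def distinct_duplicates(folders):
    """Return the names that occur with more than one distinct folder ID."""
    grouped = group_ids_by_name(folders)
    return sorted(name for name, found in grouped.items() if len(found) > 1)
```

For the sample data, `distinct_duplicates(SAMPLE)` reports `Inbox` and `Sent Items`. Against a real account, the results of `coll.find_folders()` could be passed in instead, and the IDs compared with `account.inbox.folder_id`.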

ecederstrand commented 6 years ago

AllItems may be a special search folder that exposes all items from the other archive folders without itself storing any items.

I just changed account.archive_root to fetch its own folder hierarchy, so account.archive_root.walk() should now find all the archive folders. I don't have access to an account that has any archive folders, so I can't test this myself, unfortunately.

ecederstrand commented 6 years ago

Currently we only support one folder hierarchy, i.e. the account.root hierarchy. I will add transparent support for the archive and public folder hierarchies in #396.

slash5k1 commented 6 years ago

Awesome!

Can I just say that this is a fantastic library, and thank you very much for putting in the effort to create and maintain it :)

It's helped me extract all the attachments from my Exchange account, which had all been lost until now!

Thank you

ecederstrand commented 6 years ago

Thanks for the kind words :-)