ecederstrand / exchangelib

Python client for Microsoft Exchange Web Services (EWS)
BSD 2-Clause "Simplified" License

Does the request mime content not support concurrency? #739

Closed kexin9752 closed 4 years ago

kexin9752 commented 4 years ago
    # Part of the code
    qs = account.inbox.filter(is_read=False).only('attachments')
    qs.page_size = 1
    item = next(qs.iterator())
    item = item.attachments[0].item

When I run this snippet in multiple threads, the line 'item = item.attachments[0].item' takes more and more time compared to a single thread. Does this code not support concurrency? Is there locking involved? I need your help: I want to download a large number of very large emails to local storage. How should I do that? Thank you very much!

ecederstrand commented 4 years ago

I'm not sure what MIME content has to do with this?

The requests package apparently has a memory leak in multithreaded mode. There are more details in https://github.com/ecederstrand/exchangelib/issues/675. Apart from that, exchangelib should be thread safe.

If you want parallelism, I suggest using multiprocessing instead.
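As a rough illustration of that suggestion, here is a minimal sketch that splits a list of item IDs across worker processes using the standard library. The names `chunked`, `download_chunk` and `download_all` are hypothetical and not part of exchangelib; `download_chunk` is only a placeholder for the real download logic, which would presumably set up its own account connection inside each worker process.

```python
from multiprocessing import Pool


def chunked(seq, n_chunks):
    """Split seq into at most n_chunks contiguous, roughly equal parts."""
    n_chunks = max(1, min(n_chunks, len(seq)))
    size, extra = divmod(len(seq), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks get one additional element.
        end = start + size + (1 if i < extra else 0)
        chunks.append(seq[start:end])
        start = end
    return chunks


def download_chunk(item_ids):
    # Hypothetical placeholder: in real use, each worker process would create
    # its own account object here and fetch the items for these IDs.
    return [f"downloaded:{item_id}" for item_id in item_ids]


def download_all(item_ids, processes=4):
    """Run download_chunk over the IDs in parallel, one chunk per process."""
    with Pool(processes=processes) as pool:
        results = pool.map(download_chunk, chunked(item_ids, processes))
    # Flatten the per-chunk results back into one list, in the original order.
    return [result for chunk in results for result in chunk]


if __name__ == "__main__":
    print(download_all([f"id-{i}" for i in range(10)], processes=2))
```

Keeping the process count small matters here, since every worker opens its own connections to the server.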

Finally, if you just want to get one single item from the query, here's an easier approach:

    item = account.inbox.filter(is_read=False).only('attachments')[0]
    attached_item = item.attachments[0].item

kexin9752 commented 4 years ago

The problem has been solved. I captured packets and identified the EWS API call being made. When I called that API directly from just 40 threads at the same time, timeout or response errors became very likely. Maybe the API has some limit on concurrent requests. I am now using multiprocessing to download the MIME content of the emails through that API. Downloading 1,000 emails of about 300 KB each takes about 3 minutes and 30 seconds.

ecederstrand commented 4 years ago

You are probably going to be throttled by the remote server if you start 40 processes at a time. By default, each process opens 4 connections (see https://github.com/ecederstrand/exchangelib/blob/21c006421d886315740ebda667d07be1ca453fa2/exchangelib/protocol.py#L40). With 40 processes, that amounts to 40 × 4 = 160 TCP connections to the server. That's way above the throttling policy of a standard Exchange account, as far as I know.
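The connection arithmetic can be sketched as below. The default of 4 connections per process comes from this thread; the budget of 20 connections is purely an assumed example value, since the actual throttling limit depends on the server's policy and is not stated here. Both function names are hypothetical helpers, not exchangelib APIs.

```python
def total_connections(processes, connections_per_process=4):
    """Number of TCP connections opened when every process uses its full pool."""
    return processes * connections_per_process


def max_processes(connection_budget, connections_per_process=4):
    """Largest process count that stays within an assumed connection budget."""
    return max(1, connection_budget // connections_per_process)


print(total_connections(40))  # 160
# Assumed budget of 20 connections, for illustration only.
print(max_processes(20))  # 5
```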

Anyway, closing as it seems you have found a solution to your problem.