ikvk / imap_tools

Work with email by IMAP
Apache License 2.0

Limit number of ids in _fetch_in_bulk #230

Closed sapristi closed 5 months ago

sapristi commented 6 months ago

Hello. A user of mmuxer ran into an error that seems to be caused by fetch / _fetch_in_bulk issuing a single command with all of the ids (at least I think that's the issue; the response was error: UID command error: BAD [b'parse error: maximum request size exceeded'], see https://github.com/sapristi/mmuxer/issues/3).

I fixed the problem on my side by splitting uid_list into batches:

    def _fetch_in_bulk(
        self, uid_list: Sequence[str], message_parts: str, reverse: bool
    ) -> Iterator[list]:
        from mmuxer.config_state import state

        if not uid_list:
            return
        batches = batched(uid_list, state.settings.fetch_batch_size)
        for uid_batch in batches:
            fetch_result = self.client.uid("fetch", ",".join(uid_batch), message_parts)
            check_command_status(fetch_result, MailboxFetchError)
            if not fetch_result[1] or fetch_result[1][0] is None:
                return
            for built_fetch_item in chunks((reversed if reverse else iter)(fetch_result[1]), 2):
                yield built_fetch_item

Do you think you could implement such a mechanism? The most complicated part would be providing a configurable batch size, but maybe that's not so important: a fixed value such as 100 should avoid most issues without much of a performance penalty.
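The snippet above calls a `batched` helper that is not shown. A minimal sketch of what such a helper might look like is below; the implementation and the `uid_sets` name are assumptions for illustration, not code taken from mmuxer or imap_tools. On Python 3.12+, `itertools.batched` does the same job (it yields tuples rather than lists).

```python
from itertools import islice
from typing import Iterable, Iterator, List, Sequence


def batched(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` items (hypothetical helper)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def uid_sets(uid_list: Sequence[str], size: int) -> Iterator[str]:
    """Yield one comma-separated UID set per batch, i.e. one per FETCH command."""
    for batch in batched(uid_list, size):
        yield ",".join(batch)


if __name__ == "__main__":
    # Five UIDs with a batch size of 2 produce three separate UID sets.
    for uid_set in uid_sets(["1", "2", "3", "4", "5"], 2):
        print(uid_set)
```

Each string yielded by `uid_sets` stays short regardless of mailbox size, which is what keeps a single command under the server's request size limit.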

sapristi commented 6 months ago

Just made a small benchmark on my inbox, for different batch sizes (after one run to warm up server caches):

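| batch size | folder 1 (375 msgs, 140 MB) | folder 2 (593 msgs, 89 MB) |
|------------|-----------------------------|----------------------------|
| 1          | 198s                        |                            |
| 10         | 55s                         | 57s                        |
| 100        | 50s                         | 40s                        |
| 200        | 52s                         | 43s                        |
| 400        | 59s                         | 44s                        |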

So in my case there's no point in batching more than 100 messages at a time (performance actually seems to degrade somewhat when fetching too many messages at once).
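A comparison like this can be reproduced with a small timing harness. The sketch below is an assumption about how one might measure it, not the script used for the numbers above; `time_batch_sizes` and the `fetch_all` callable are hypothetical names, and `fetch_all(size)` stands for whatever function fetches a whole folder with the given batch size.

```python
import time
from typing import Callable, Dict, Iterable


def time_batch_sizes(
    fetch_all: Callable[[int], Iterable[object]],
    batch_sizes: Iterable[int],
) -> Dict[int, float]:
    """Return wall-clock seconds needed to drain fetch_all(size) for each size."""
    timings: Dict[int, float] = {}
    for size in batch_sizes:
        start = time.perf_counter()
        # Consume every message so the whole folder is actually downloaded.
        for _ in fetch_all(size):
            pass
        timings[size] = time.perf_counter() - start
    return timings


if __name__ == "__main__":
    # Stand-in fetcher so the sketch runs without a mail server.
    def fake_fetch(size: int) -> Iterable[object]:
        return iter(range(1000))

    for size, seconds in time_batch_sizes(fake_fetch, [1, 10, 100]).items():
        print(f"batch size {size}: {seconds:.4f}s")
```

As in the benchmark above, it is worth doing one untimed run first so that server-side caches are warm before comparing batch sizes.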

ikvk commented 6 months ago

Hi, I'll take a look.

ikvk commented 5 months ago

Check it out: https://github.com/ikvk/imap_tools/releases/tag/v1.6.0 Thanks! It turned out to be a great feature.

ikvk commented 5 months ago

Small example:

    from imap_tools import MailBox

    with MailBox('imap.moon').login('ikvk', '123') as mailbox:
        for m in mailbox.fetch(bulk=10, mark_seen=False):
            print(m.date, m.subject)

sapristi commented 5 months ago

Thanks, I will check it out!