afewmail / afew

an initial tagging script for notmuch mail
ISC License

kill filter could be optimized #319

Open anarcat opened 2 years ago

anarcat commented 2 years ago

Running the kill filter with --verbose --verbose, I noticed the following:

DEBUG:root:Executing query '(tag:feeds and tag:hackaday) AND (NOT tag:killed)'
DEBUG:root:Executing query 'thread:"0000000000016f33" AND tag:killed'
DEBUG:root:Executing query 'thread:"0000000000016f32" AND tag:killed'
DEBUG:root:Executing query 'thread:"0000000000016f31" AND tag:killed'
DEBUG:root:Executing query 'thread:"0000000000016f30" AND tag:killed'
DEBUG:root:Executing query 'thread:"0000000000016f2f" AND tag:killed'
[...]

... ad nauseam. That feels really slow! Here I have this simple rule:

notmuch tag +killed -- "thread:{tag:killed}" and not tag:inbox

(The not tag:inbox is your not tag:new.) It seems to me that would work as well and wouldn't need to enumerate every thread in the query?
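For anyone scripting this outside afew, here is a minimal sketch of issuing that single command from Python. It assumes the `notmuch` binary is on `PATH` and a database is configured; the helper names `build_kill_command` and `run_kill_command` are made up for illustration and are not part of afew.

```python
import subprocess
from typing import List


def build_kill_command(exclude_tag: str = "inbox") -> List[str]:
    """Return the argv for one `notmuch tag` call covering all killed threads.

    The search terms are passed as a single argument, so no shell quoting
    is needed around the braces.
    """
    query = "thread:{tag:killed} and not tag:%s" % exclude_tag
    return ["notmuch", "tag", "+killed", "--", query]


def run_kill_command(exclude_tag: str = "inbox", dry_run: bool = True) -> List[str]:
    """Run the command, or only return it when dry_run is True (the default)."""
    argv = build_kill_command(exclude_tag)
    if not dry_run:
        # Assumption: notmuch is installed; raises CalledProcessError on failure.
        subprocess.run(argv, check=True)
    return argv
```

Calling `run_kill_command("new")` returns the argv without touching the database, which makes it easy to inspect the query before running it for real with `dry_run=False`.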

GuillaumeSeren commented 2 years ago

Hey @anarcat, please give more details, because I am not sure what is wrong here.

> (The not tag:inbox is your not tag:new.) It seems to me that would work as well and wouldn't need to enumerate every thread in the query?

The way filters work in afew is to get a list of threads or mails and loop over them to push the tags / stuff you want. Feel free to detail how you would do it; maybe we can figure out something better.
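The looping pattern described here can be illustrated with a toy model that counts lookups. This is only a sketch: `CountingIndex` is a hypothetical in-memory stand-in, not the notmuch Python API, and `kill_by_looping` is not afew's actual filter code.

```python
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]


class CountingIndex:
    """Toy in-memory mail index that counts lookups. NOT the notmuch API."""

    def __init__(self, messages: List[Message]) -> None:
        self.messages = messages
        self.queries = 0

    def search(self, predicate: Callable[[Message], bool]) -> List[Message]:
        self.queries += 1
        return [m for m in self.messages if predicate(m)]


def kill_by_looping(index: CountingIndex) -> None:
    """Tag new messages as killed, using one extra lookup per candidate."""
    candidates = index.search(
        lambda m: "new" in m["tags"] and "killed" not in m["tags"]
    )
    for message in candidates:
        thread = message["thread"]
        # Analogous to the repeated 'thread:"..." AND tag:killed' log lines.
        hits = index.search(
            lambda m, t=thread: m["thread"] == t and "killed" in m["tags"]
        )
        if hits:
            message["tags"].add("killed")


index = CountingIndex([
    {"id": 1, "thread": "a", "tags": {"killed"}},
    {"id": 2, "thread": "a", "tags": {"new"}},
    {"id": 3, "thread": "b", "tags": {"new"}},
    {"id": 4, "thread": "c", "tags": {"new"}},
])
kill_by_looping(index)
print(index.queries)  # 1 initial lookup + 3 per-message lookups
```

The number of lookups grows with the number of candidate messages, which is the cost being discussed in this issue.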

anarcat commented 2 years ago

> The way filters work in afew is to get a list of threads or mails and loop over them to push the tags / stuff you want.

Yes, that is the part I am saying is slow. :)

> Feel free to detail how you would do it; maybe we can figure out something better.

This query is faster, and tags all affected threads at once, without having to iterate over all threads:

notmuch tag +killed -- "thread:{tag:killed}" and not tag:inbox

GuillaumeSeren commented 2 years ago

Hey @anarcat, yes, sure, this would be faster; maybe we could upgrade the query. I am a little afraid that this would end up tagging far more than the previous approach, but it may be the same.

Please open a pull-request with this.

somini commented 2 years ago

From the notmuch docs (https://notmuchmail.org/doc/latest/man7/notmuch-search-terms.html):

> The performance of such queries can vary wildly. To understand this, the user should think of the query thread:{<something>} as expanding to all of the thread IDs which match <something>; notmuch then performs a second search using the expanded query.

So it should be OK either way.
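The two-step expansion described in that passage can be sketched in a few lines of Python. This is a conceptual model over plain dictionaries, not how notmuch or Xapian implement it; `expand_thread_query` and the message layout are assumptions made for illustration.

```python
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]
Predicate = Callable[[Message], bool]


def expand_thread_query(
    messages: List[Message], inner: Predicate, outer: Predicate
) -> List[Message]:
    """Model `thread:{inner} and outer` as two passes over the messages."""
    # Pass 1: expand thread:{inner} into the set of matching thread IDs.
    thread_ids = {m["thread"] for m in messages if inner(m)}
    # Pass 2: search again, restricted to those threads and the outer terms.
    return [m for m in messages if m["thread"] in thread_ids and outer(m)]


messages = [
    {"id": 1, "thread": "a", "tags": {"killed"}},
    {"id": 2, "thread": "a", "tags": set()},
    {"id": 3, "thread": "a", "tags": {"inbox"}},
    {"id": 4, "thread": "b", "tags": set()},
]
matched = expand_thread_query(
    messages,
    inner=lambda m: "killed" in m["tags"],
    outer=lambda m: "inbox" not in m["tags"],
)
print([m["id"] for m in matched])  # messages in thread "a" without the inbox tag
```

In this model the whole operation is two passes regardless of how many threads match, whereas a per-thread loop issues one lookup per thread.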

mjg commented 1 year ago

> From the notmuch docs (https://notmuchmail.org/doc/latest/man7/notmuch-search-terms.html):
>
> > The performance of such queries can vary wildly. To understand this, the user should think of the query thread:{<something>} as expanding to all of the thread IDs which match <something>; notmuch then performs a second search using the expanded query.
>
> So it should be OK either way.

While that is true, Xapian queries are fast. Notmuch "expands" that query in pure C, whereas afew loops in Python and goes through the Python API to notmuch each time. That does make quite a difference: the loop should be okay for a limited number of "new" messages, but is probably quite slow on your full mail db.