djcb / mu

maildir indexer/searcher + emacs mail client + guile bindings
http://www.djcbsoftware.nl/code/mu
GNU General Public License v3.0

[mu rfe] index arbitrary headers #2705

Closed fbenkstein closed 1 month ago

fbenkstein commented 2 months ago

Is your feature request related to a problem? Please describe.

At work I get a huge number of emails per day from different internal systems. Many of them are not addressed to me directly, but I can't turn them off. I use a script that combines mu find and mu move with mbsync to categorize these, delete most of them and keep the ones that are important. It mostly works fine using from / to / subject but unfortunately, some are hard to categorize correctly with just that data. A big chunk of these emails have additional, custom headers but no other obvious keyword to look for in subject / from / to. I would like to search for these header values directly in searches. I was previously using imapfilter but unfortunately, it's too slow for my use case. It did, however, allow me to search by custom headers, which made it quite accurate.

Describe the solution you'd like

Ability to index and search for arbitrary, user-defined headers.

I'm not sure what the user surface of this feature would look like. Ideally, there would be some config file that defines the header <-> field mappings. Absent that, it could be a flag passed to mu index on every invocation. I guess the information would have to be passed to mu find as well, or could it discover the fields somehow? Or maybe there could just be two special fields that are used for all custom headers: one boolean that marks the presence of a header and one string field that contains the full header name for every record. I'm not sure whether this would scale or perform well.

Describe alternatives you've considered

I could use mu find -f l to do some pre-filtering and then use some other tool (grep?) for further filtering. I don't want to do this because it's unergonomic (hard to test rules from the command line, two different search syntaxes), difficult to make work correctly (e.g. multi-line header values, header keywords appearing elsewhere in the email) and possibly quite slow.

Alternatively, maybe there is a way to do this already or at least in a more integrated fashion with the guile API? I'd have to learn a new programming language just for that, though.


djcb commented 2 months ago

Using mu find to query arbitrary headers is not possible, and I don't expect it ever will. With that out of the way, what are the alternatives?

I wouldn't recommend Guile scripting, as it can't really help you with moving messages.

But a little shell scripting (as you mentioned) can do the trick, and tools like formail can help. You can search messages and run a script on each result; e.g., add some script move-message-maybe.sh for some special header Foo:

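#!/bin/sh
# $1 is the path of one message file, as passed by mu find --exec
msgpath=$1
# extract the value of the Foo: header; quote the path in case it contains spaces
foo=$(formail -x Foo: < "$msgpath")
# add logic to mu-move based on $foo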

and periodically do e.g.

mu find maildir:/inbox --exec move-message-maybe.sh
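
To make the placeholder logic above more concrete, here is a minimal sketch of what move-message-maybe.sh could grow into. It is an illustration under assumptions, not the maintainer's script: header_value and target_for are hypothetical helpers, the folder names are invented, and awk stands in for formail so the sketch has no dependency beyond POSIX sh and awk. The X-GitHub-Reason header is the one quoted later in this thread. The actual move is left as an echo, since the right command depends on your setup.

```shell
#!/bin/sh
# Hypothetical helper: print the value of header $1 from message file $2.
# Only the header block (everything before the first blank line) is read,
# folded continuation lines are joined with a single space, and the header
# name is compared case-insensitively. Only the first occurrence is printed.
header_value() {
    awk -v name="$1" '
        BEGIN { want = tolower(name) ":"; n = length(want) }
        { sub(/\r$/, "") }                 # tolerate CRLF line endings
        /^$/ { exit }                      # blank line: end of headers
        /^[ \t]/ {                         # continuation of previous header
            if (hit) { sub(/^[ \t]+/, ""); val = val " " $0 }
            next
        }
        hit { exit }                       # next header starts: value complete
        tolower(substr($0, 1, n)) == want {
            hit = 1
            val = substr($0, n + 1)
            sub(/^[ \t]+/, "", val)
        }
        END { if (hit) print val }
    ' "$2"
}

# Hypothetical mapping from a header value to a maildir (names are made up).
target_for() {
    case "$1" in
        author|mention) echo "/important" ;;
        "")             echo "" ;;         # header absent: leave message alone
        *)              echo "/notifications" ;;
    esac
}

# When called with a message path (as mu find --exec does), report the decision.
if [ $# -gt 0 ]; then
    reason=$(header_value X-GitHub-Reason "$1")
    target=$(target_for "$reason")
    if [ -n "$target" ]; then
        echo "would move $1 to $target"    # replace with the real move command
    fi
fi
```

Because only the header block is scanned and folded lines are joined, this avoids two of the pitfalls raised in the original request: header keywords appearing in the body, and multi-line header values.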

Alternatively, perhaps you can use procmail which can automatically move incoming (or any, really) mail to arbitrary folders; that is what I use myself (with fetchmail).

djcb commented 1 month ago

So.... closing this. Hopefully the above can help you to get what you need -- thanks!

fbenkstein commented 1 month ago

That's too bad. If I were willing to implement this myself, would you be willing to reconsider, assuming code quality is okay and it's not too hacky or intrusive?

As a data point on why this can be useful to others, the email for the above message has these headers:

X-GitHub-Sender: djcb
X-GitHub-Recipient: fbenkstein
X-GitHub-Reason: author

The Gerrit code review tool does something similar: it sets headers like Gerrit-Project or Gerrit-MessageType, which can be very useful for filtering emails from projects with lots of traffic. See ChangeEmailImpl.java.

djcb commented 1 month ago

Yeah, the set of headers to index is more-or-less set in stone, don't want to change that.

But what you could do is use some tool on your incoming mail (procmail or something else), and add e.g. an "X-Label" header with whatever you want to search; that is available through a "tags" search. Not quite what you want but perhaps it's good enough.
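
The labelling idea above could be sketched as a small filter that runs on incoming mail before it reaches the maildir. This is an illustration under assumptions: label_if is a hypothetical helper, the label name github-author is invented, and awk is used so the sketch depends only on POSIX sh and awk. Where formail is installed, its -A option for appending a header may be the simpler route.

```shell
#!/bin/sh
# Hypothetical filter: copy a message from stdin to stdout and add an
# "X-Label: LABEL" header when header HEADER has exactly the value VALUE.
# Usage: label_if HEADER VALUE LABEL < message > labelled-message
label_if() {
    awk -v name="$1" -v value="$2" -v label="$3" '
        BEGIN { want = tolower(name) ":"; n = length(want) }
        body { print; next }               # past the headers: copy verbatim
        /^\r?$/ {                          # blank line: headers end here
            if (hit) print "X-Label: " label
            body = 1
            print
            next
        }
        tolower(substr($0, 1, n)) == want {
            v = substr($0, n + 1)
            gsub(/^[ \t]+|[ \t\r]+$/, "", v)
            if (v == value) hit = 1
        }
        { print }
        END { if (!body && hit) print "X-Label: " label }   # header-only input
    '
}
```

For example, piping a message through label_if X-GitHub-Reason author github-author before delivery, and then re-running mu index, should make it findable with mu find tag:github-author.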