jonaswinkler / paperless-ng

A supercharged version of paperless: scan, index and archive all your physical documents
https://paperless-ng.readthedocs.io/en/latest/
GNU General Public License v3.0

Mail filter error when using "special characters" #305

Closed Philmo67 closed 3 years ago

Philmo67 commented 3 years ago

Hello, I get errors when the subject filter of a mail consumption rule contains special characters such as "é":

09/01/2021 12:40 ERROR Rule Compte.Facture Marchand: Error while processing rule: 'ascii' codec can't encode character '\xe9' in position 67: ordinal not in range(128)

09/01/2021 12:40 DEBUG Rule Compte.Facture Marchand: Searching folder with criteria (SINCE 25-Aug-1993 FROM "cgy-serviceclient@marchand.com" SUBJECT "Expédition de votre commande" UNFLAGGED)

09/01/2021 12:40 DEBUG Rule Compte.Facture Marchand: Selecting folder INBOX 

--> If I remove "Expédition" from the filter, everything works fine. It seems that Unicode should be used somewhere instead of ASCII.

jonaswinkler commented 3 years ago

The library defaults to ASCII for IMAP query criteria for some reason. I'll look into why that is and use UTF-8 instead.
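A minimal sketch of why the rule fails, using only the Python standard library. The criteria string is a shortened, hypothetical stand-in modelled on the DEBUG line above, and `encode_criteria` is an illustrative helper, not part of paperless-ng or imap_tools. It only shows that the same text cannot be encoded as ASCII but encodes fine as UTF-8; how the mail library applies the charset internally is not shown here.

```python
# Hypothetical criteria string, modelled on the DEBUG log line in this issue.
criteria = '(SINCE 25-Aug-1993 SUBJECT "Expédition de votre commande" UNFLAGGED)'


def encode_criteria(text: str, charset: str) -> bytes:
    """Illustrative helper: encode search criteria before sending them."""
    return text.encode(charset)


try:
    encode_criteria(criteria, "ascii")
    ascii_ok = True
except UnicodeEncodeError as exc:
    # Same class of error as the one reported in the rule log.
    ascii_ok = False
    print(f"ascii failed: {exc}")

# UTF-8 can represent "é" (as the two bytes 0xC3 0xA9), so this succeeds.
utf8_bytes = encode_criteria(criteria, "utf-8")
print(utf8_bytes)
```

As far as I know, imap_tools exposes this choice through a `charset` argument on `MailBox.fetch`; treat that as an assumption and check the library documentation for the installed version.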

jonaswinkler commented 3 years ago

Alright, without doing too much more digging, this is the current state:

The fix will stay, since it does not break anything. However, I'm unable to provide any further fixes for this.

jonaswinkler commented 3 years ago

See https://github.com/ikvk/imap_tools/issues/88

jonaswinkler commented 3 years ago

This is definitely an issue with the Gmail servers not fully adhering to the IMAP specification regarding the SEARCH command.
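For reference, a rough sketch of what a SEARCH with a non-ASCII subject looks like on the wire according to RFC 3501: the charset is announced with `CHARSET`, and the non-ASCII string is sent as a literal (`{n}` followed by n octets). The helper name `build_search` and the tag `A001` are made up for illustration, and the sketch says nothing about which part of this Gmail handles differently, since the thread does not state it.

```python
def build_search(tag: str, key: str, value: str, charset: str = "UTF-8"):
    """Illustrative helper: build an RFC 3501 SEARCH using a literal.

    Returns (command_line, literal). A client sends command_line, waits for
    the server's "+" continuation response, then sends literal plus CRLF.
    """
    literal = value.encode(charset)
    # {n} announces that n octets of literal data follow the CRLF.
    command_line = (
        f"{tag} SEARCH CHARSET {charset} {key} {{{len(literal)}}}\r\n"
    ).encode("ascii")
    return command_line, literal


command_line, literal = build_search("A001", "SUBJECT", "Expédition")
print(command_line)
print(literal)
```

The command line itself stays pure ASCII; only the literal carries the UTF-8 bytes, which is why the octet count (not the character count) goes inside the braces.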