RealRaven2000 / FiltaQuilla

Adds many new mail filter actions to Thunderbird
http://quickfilters.quickfolders.org/filtaquilla.html
GNU General Public License v3.0

Save email as file: Incorrect parsing of subject creates garbled file name #53

Open LittleAlf opened 4 years ago

LittleAlf commented 4 years ago

First of all, thanks for your great work of transferring this wonderful addon to Thunderbird 68+! I really appreciate your work of keeping this addon alive, because I use it regularly on quite a number of emails.

Here is what I found: I have several regular emails with logging data and standard subjects, and for the first time I would like to save them automatically via FiltaQuilla. I noticed that the file name of the saved .eml file differs depending on whether I save it manually with "Save as ..." or use the "Save as file" filter in FiltaQuilla!

The reason is that FiltaQuilla applies the function _sanitizeName(aName), which basically removes all special characters and is not aware of UTF-8 encoding.

An example:

The reason for this drastic difference is that the subject line is partly UTF-8 encoded: Subject: Solar.web Report =?utf-8?b?ZsO8cg==?= PV-Anlage Muxx (18.01.2020)

Is it possible to use a less drastic sanitizing method for the file name of the saved email message? Something along the lines of what manual saving does?

Best regards LittleAlf

raspopov commented 3 years ago

Indeed. "Save Message As File" is almost unusable in any language other than English, because subjects in other languages (Russian included) are commonly UTF-8 encoded. For example:

Subject: =?utf-8?b?0JrQsNGB0YHQvtCy0YvQuSDRh9C10Log0L7RgiDQntCe0J4=?=
 =?utf-8?b?ICLQm9Cw0LHQuNGA0LjQvdGCLtCg0KMi?= 07.08.2021

Should be decoded as:

Кассовый чек от ООО "Лабиринт.РУ" 07.08.2021

but the current version produces:

utf-8b0jrqsngb0yhqvtcy0yvqusdrh9c10log0l7rgidqntce0j4-utf-8b

i.e. not decoded, and without the second line.
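
To make the expected result concrete, here is a minimal sketch in plain JavaScript of how such subjects decode. It unfolds the continuation line, drops the whitespace between adjacent encoded words, and decodes each `=?charset?b?...?=` or `=?charset?q?...?=` token. The function name `decodeEncodedWords` is hypothetical and not part of FiltaQuilla or Thunderbird; it only illustrates the decoding step and assumes an environment where `atob` and `TextDecoder` are available.

```javascript
// Hypothetical helper (not part of FiltaQuilla): decodes MIME encoded-words
// of the form =?charset?B?...?= or =?charset?Q?...?= in a header value.
function decodeEncodedWords(header) {
  // Unfold: a line break followed by whitespace continues the same header.
  const unfolded = header.replace(/\r?\n(?=[ \t])/g, "");
  // Whitespace between two adjacent encoded-words is not part of the text.
  const joined = unfolded.replace(/(\?=)[ \t]+(?==\?)/g, "$1");
  return joined.replace(
    /=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=/g,
    (match, charset, encoding, text) => {
      let raw;
      if (encoding.toLowerCase() === "b") {
        // B encoding is plain base64.
        raw = atob(text);
      } else {
        // Q encoding: "_" stands for a space, "=XX" is a hex-encoded byte.
        raw = text
          .replace(/_/g, " ")
          .replace(/=([0-9A-Fa-f]{2})/g, (m, hex) =>
            String.fromCharCode(parseInt(hex, 16))
          );
      }
      // Turn the byte string into real bytes and decode with the charset.
      const bytes = Uint8Array.from(raw, (ch) => ch.charCodeAt(0));
      return new TextDecoder(charset).decode(bytes);
    }
  );
}

const folded =
  "=?utf-8?b?0JrQsNGB0YHQvtCy0YvQuSDRh9C10Log0L7RgiDQntCe0J4=?=\n" +
  " =?utf-8?b?ICLQm9Cw0LHQuNGA0LjQvdGCLtCg0KMi?= 07.08.2021";
console.log(decodeEncodedWords(folded));
```

Because the whitespace between the two encoded words is dropped before decoding, the second line is joined to the first instead of being lost.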


P.S. Great extension, I supported it by quickFilters Pro registration!

RealRaven2000 commented 3 years ago

Just working on this. The first part (decoding the MIME charset) was relatively easy using MailServices.mimeConverter.decodeMimeHeader(), but the problem is that there was a sanitization function that removed all non-arabic characters. The only built-in function I have found so far that does this is in C++ and thus cannot be scripted via JavaScript:

https://searchfox.org/comm-central/source/mailnews/base/src/nsMessenger.cpp#111

I am not too sure what kind of restrictions apply nowadays for file names, more research is necessary.

RealRaven2000 commented 3 years ago

I changed the algorithm and only filter out certain characters instead:
1) Replacing the following characters with "-": @ | / \
2) Completely removing the following characters: $ % " < > ,

I tested with the subject [filtaquillasaveas]-кассовый-чек-от-ооо-лабиринт.ру-07.08.2021. The string is truncated after 60 characters and then the extension ".eml" is appended: the file created was named [filtaquilla-saveas]-кассовый-чек-от-ооо-лабиринт.ру-07.08.2.eml
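
For readers following along, the rules described above can be sketched roughly as follows. This is only an illustration of the stated rules (replace four characters with a dash, remove six others, cut at 60 characters, append the extension). The names `buildFileName`, `REPLACE_WITH_DASH`, `REMOVE` and `MAX_LENGTH` are hypothetical and not taken from the add-on's source, and any other normalization the add-on performs is not covered.

```javascript
// Hypothetical sketch of the rules described in this comment;
// the add-on's actual code may differ.
const REPLACE_WITH_DASH = /[@|\/\\]/g; // @ | / \ become "-"
const REMOVE = /[$%"<>,]/g;            // $ % " < > , are dropped
const MAX_LENGTH = 60;                 // assumed to count characters, not bytes

function buildFileName(subject) {
  const cleaned = subject
    .replace(REPLACE_WITH_DASH, "-")
    .replace(REMOVE, "");
  // Array.from splits on code points, so a surrogate pair is never cut in half.
  const truncated = Array.from(cleaned).slice(0, MAX_LENGTH).join("");
  return truncated + ".eml";
}

console.log(buildFileName('Invoice 10/2021 from "ACME", <billing@acme.example>'));
```

Cyrillic letters pass through untouched because only the listed characters are matched, which is the point of switching from an allow-list to a short deny-list.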

Test version: filtaquilla-3.2pre72.zip

nomeata commented 3 years ago

These changes made it into the 3.2 already, right? This broke my brother’s setup, because some of the mails he is saving using your plugin have colons (:) in the subject, which Windows doesn't like at all…

Could you add : to the first list of characters (those that you replace with -) for the next release? Thanks!
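
As a quick way to check a candidate file name for this kind of problem, the sketch below lists the printable characters that Windows does not allow in file names (`< > : " / \ | ? *`) and reports which of them occur in a given name. The function name `findReservedChars` is hypothetical and only meant for illustration; it is not part of the add-on.

```javascript
// Printable characters that Windows does not allow in file names.
const WINDOWS_RESERVED = /[<>:"\/\\|?*]/g;

// Hypothetical helper: returns each offending character once,
// in order of first appearance.
function findReservedChars(fileName) {
  return [...new Set(fileName.match(WINDOWS_RESERVED) || [])];
}

console.log(findReservedChars("Re: report 12:30.eml")); // the colon is reported
console.log(findReservedChars("plain-report.eml"));     // nothing is reported
```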

RealRaven2000 commented 3 years ago

> These changes made it into the 3.2 already, right? This broke my brother’s setup, because some of the mails he is saving using your plugin have colons (:) in the subject, which Windows doesn't like at all…
>
> Could you add : to the first list of characters (those that you replace with -) for the next release? Thanks!

Yeah, that makes total sense. Somehow I overlooked these. Could you try this version, please?

filtaquilla-3.2.1pre3.zip

nomeata commented 3 years ago

That works, thanks for the quick fix! :-)