roundcube / roundcubemail

The Roundcube Webmail suite
https://roundcube.net
GNU General Public License v3.0

Roundcube displays wrong filename for attached file #9376

Closed Michal-Zacek closed 6 months ago

Michal-Zacek commented 6 months ago

Hello, I sent an email from Thunderbird with an attached rdp file. The file is plain text but encoded as UTF-16LE. Thunderbird created these headers:

Content-Type: text/plain; charset=UTF-16LE; name="G011C00546.rdp"
Content-Disposition: attachment; filename="G011C00546.rdp"
Content-Transfer-Encoding: base64

In Roundcube the name of the attached file is shown as "ぇㄱぃ㔰㘴爮灤", but in Thunderbird or Outlook it is correctly shown as G011C00546.rdp. I think Roundcube is converting the name from UTF-16LE to UTF-8 because of charset=UTF-16LE, but the charset is meant for the text in the file, not the filename. I tried reporting it as a Thunderbird bug, but they replied: "The charset is in regards to the attachment. And that is detected as utf-16." Which is true.
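To illustrate the suspected mechanism, here is a minimal Python sketch (not Roundcube's PHP code; the variable names are made up for this example). It takes the 14 ASCII bytes of the filename and decodes them as if they were UTF-16LE, which pairs the bytes up into 7 two-byte code units and yields a garbled name like the one above.

```python
# The attachment name exactly as it appears in the headers (plain ASCII).
name = "G011C00546.rdp"

# 14 ASCII bytes...
raw = name.encode("ascii")

# ...misread as UTF-16LE: every pair of bytes becomes one code unit,
# e.g. "G0" = 0x47 0x30 is read as U+3047.
mojibake = raw.decode("utf-16-le")

print(len(raw), len(mojibake))
print(mojibake)
print([hex(ord(ch)) for ch in mojibake])
```

Each resulting code point falls in the hiragana, Hangul jamo, or CJK ranges, which would explain why the name looks like East Asian text.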

I am attaching the rdp file and the whole email in a zip archive: rdp_email.zip. Thanks. Regards, Michal

alecpl commented 6 months ago

Confirmed.

The standards say that charset in Content-Type is about the body content, not the attachment name. I didn't find any RFC that would say otherwise. However, it's not uncommon to send a non-ascii message with non-ascii attachment names, and then this charset might be the only indication of a charset we have. So, I'm not sure we can just stop using it.

That said, we can do better.

  1. Check if the name is valid in the specified charset, and fall back to UTF-8 if not. It would work in this case as UTF-16 is easy to validate.
  2. Use strict Content-Disposition header rules about filename as described in RFC 2183 (where no charset param exists), where the only way to specify a filename with a charset is RFC 2231 or RFC 2047 syntax.

In other words, if a Content-Disposition filename exists, give it priority over the Content-Type name, and parse it according to RFC 2231/2047 if applicable, or as ASCII (UTF-8) otherwise.
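The rule above could be sketched roughly as follows. This is only an illustration using Python's standard `email` package, not the code that went into Roundcube (which is PHP and is not shown in this thread); `attachment_name` and `parse` are hypothetical helper names. The point is that the `charset` parameter of Content-Type is never consulted when decoding the name.

```python
from email import message_from_string
from email.header import decode_header, make_header
from email.utils import collapse_rfc2231_value


def attachment_name(part):
    """Return the attachment name without consulting the body charset."""
    # Content-Disposition filename takes priority over Content-Type name.
    value = part.get_param("filename", header="content-disposition")
    if value is None:
        value = part.get_param("name")  # looks in Content-Type by default
    if value is None:
        return None
    if isinstance(value, tuple):
        # RFC 2231 extended parameter: the charset travels with the value.
        return collapse_rfc2231_value(value)
    # RFC 2047 encoded-words carry their own charset; a plain value passes
    # through unchanged, i.e. it is treated as ASCII.
    return str(make_header(decode_header(value)))


def parse(headers):
    # Hypothetical helper: build a message object from header text only.
    return message_from_string(headers + "\n\n")


reported = parse(
    'Content-Type: text/plain; charset=UTF-16LE; name="G011C00546.rdp"\n'
    'Content-Disposition: attachment; filename="G011C00546.rdp"\n'
    'Content-Transfer-Encoding: base64'
)
print(attachment_name(reported))  # G011C00546.rdp
```

A name sent as `filename*=UTF-8''na%C3%AFve.txt` (RFC 2231) or as `filename="=?UTF-8?Q?na=C3=AFve.txt?="` (RFC 2047) would still be decoded, because in both cases the charset is part of the parameter value itself. This sketch does not try to handle raw non-ASCII bytes in an unencoded parameter.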

alecpl commented 6 months ago

I did some more investigation, and after reading RFC 2047 (and RFC 2231) I have come to the conclusion that:

  1. If the filename/name parameter does not use RFC 2047/RFC 2231 encoding, we should assume it is ASCII and not convert it.
  2. If a header uses such encoding, the charset is specified by the encoding format itself, so we know what to do.

So it looks like we shouldn't try to be smart, and should instead handle this in a stricter manner. We can potentially end up with wrong results in some rare cases: charsets that use ASCII characters but aren't ASCII, or some broken/old clients.

I think we should be stricter here and not care about the rare cases.

alecpl commented 6 months ago

Fixed. I don't plan to backport this change to 1.6.