thunderbird / thunderbird-android

Thunderbird for Android – Open Source Email App for Android (fka K-9 Mail)
https://thunderbird.net/mobile
Apache License 2.0

Header decoding invents a white space #7972

Closed: dilyanpalauzov closed this issue 4 months ago

dilyanpalauzov commented 4 months ago

App version

6.804

Where did you get the app from?

F-Droid

Android version

I do not know the version, no custom ROM

Device model

No response

Steps to reproduce

I receive two emails with these headers:

Subject: Inconsistency with SU
  =?utf-8?Q?=E2=80=9C=5B=CB=88s=CA=8Cm=C9=99=28?= =?utf-8?Q?r=29?=
  =?utf-8?Q?_ju=CB=90ni=CB=88v=C9=99?= =?utf-8?Q?=CB=90=28r=29siti?=
  =?utf-8?Q?=5D?= =?utf-8?Q?_=E2=80=94?= Deutschkurs =?utf-8?Q?2024=E2=80=9D?=

Subject: Inconsistency with SU =?utf-8?Q?=E2=80=9CDe-Tech?= 4.0 - Survive 10
  days in the greenest region of Europe without your
  =?utf-8?Q?phone!=E2=80=9D?=

Expected behavior

In both cases K-9 Mail should display a single space after SU.

Actual behavior

In the first case the whitespace displayed after SU is wider than in the second email. Evolution 3.53 displays a single space in both.

Logs

No response

cketti commented 4 months ago

I believe K-9 Mail's behavior is correct.

The header lines you provided use two spaces after a folding line break. However, RFC 5322, section 2.3.3 contains the following about unfolding header lines:

The process of moving from this folded multiple-line representation of a header field to its single line representation is called "unfolding". Unfolding is accomplished by simply removing any CRLF that is immediately followed by WSP. Each header field should be treated in its unfolded form for further syntactic and semantic evaluation. An unfolded header field has no length restriction and therefore may be indeterminately long.

So only the line break characters (CRLF) are removed, not any of the space characters.
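As a rough illustration of that unfolding rule, here is a minimal Python sketch. It is not K-9 Mail's code; the function name `unfold` and the shortened sample headers are made up for this example, and it assumes strict CRLF line endings as in the RFC text.

```python
import re


def unfold(raw: str) -> str:
    """Remove every CRLF that is immediately followed by WSP (space or tab).

    Only the CRLF is removed; the whitespace after it stays in the value.
    """
    return re.sub(r"\r\n(?=[ \t])", "", raw)


# Hypothetical, shortened versions of the reported header:
# folded with two spaces and with one space of indentation.
two_spaces = "Subject: Inconsistency with SU\r\n  =?utf-8?Q?2024?="
one_space = "Subject: Inconsistency with SU\r\n =?utf-8?Q?2024?="

print(repr(unfold(two_spaces)))
print(repr(unfold(one_space)))
```

With two spaces of indentation, both spaces survive unfolding and end up between "SU" and the encoded word; with one space, only a single space remains.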

The subjects also use encoded words (=?…?=), which get special whitespace treatment, but only when the whitespace sits between two encoded words. From RFC 2047, section 6.2:

When displaying a particular header field that contains multiple 'encoded-word's, any 'linear-white-space' that separates a pair of adjacent 'encoded-word's is ignored. (This is to allow the use of multiple 'encoded-word's to represent long strings of unencoded text, without having to separate 'encoded-word's where spaces occur in the unencoded text.)

So this only ignores (multiple) spaces between encoded words, not between unencoded text and an encoded word.
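The rule can be sketched in Python as follows. This is a simplified decoder written for this illustration, not K-9 Mail's implementation: the names `ENCODED_WORD`, `decode_word`, and `decode_header_value` are hypothetical, and the sketch assumes an already unfolded header value, a charset that Python knows, and no RFC 2231 language suffix.

```python
import base64
import re

# charset ? encoding ? payload, as in =?utf-8?Q?...?=
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([QqBb])\?([^?\s]*)\?=")


def decode_word(match: re.Match) -> str:
    """Decode a single encoded-word to text."""
    charset, encoding, payload = match.groups()
    if encoding.upper() == "Q":
        # In Q encoding "_" stands for a space and "=XX" for one byte.
        raw = payload.replace("_", " ").encode("ascii")
        data = re.sub(
            rb"=([0-9A-Fa-f]{2})",
            lambda m: bytes.fromhex(m.group(1).decode("ascii")),
            raw,
        )
    else:
        data = base64.b64decode(payload)
    return data.decode(charset, errors="replace")


def decode_header_value(value: str) -> str:
    """Decode encoded-words, dropping whitespace only between two of them."""
    out = []
    pos = 0
    seen_encoded_word = False
    for m in ENCODED_WORD.finditer(value):
        gap = value[pos:m.start()]
        only_whitespace = gap.strip(" \t") == ""
        # Whitespace separating two adjacent encoded-words is ignored;
        # anything else (including spaces next to plain text) is kept.
        if not (seen_encoded_word and only_whitespace):
            out.append(gap)
        out.append(decode_word(m))
        pos = m.end()
        seen_encoded_word = True
    out.append(value[pos:])
    return "".join(out)


# Shortened variant of the first subject, unfolded with two spaces after "SU".
subject = (
    "Inconsistency with SU  =?utf-8?Q?=E2=80=9C=5B?= "
    "=?utf-8?Q?_=E2=80=94?= Deutschkurs =?utf-8?Q?2024=E2=80=9D?="
)
print(repr(decode_header_value(subject)))
```

In this sketch the space between the first two encoded words disappears, while the two spaces between the plain text "SU" and the first encoded word are kept, as are the spaces around "Deutschkurs".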

For the subject to be displayed as desired, the header has to use only one space after each folding line break:

Subject: Inconsistency with SU
 =?utf-8?Q?=E2=80=9C=5B=CB=88s=CA=8Cm=C9=99=28?= =?utf-8?Q?r=29?=
 =?utf-8?Q?_ju=CB=90ni=CB=88v=C9=99?= =?utf-8?Q?=CB=90=28r=29siti?=
 =?utf-8?Q?=5D?= =?utf-8?Q?_=E2=80=94?= Deutschkurs =?utf-8?Q?2024=E2=80=9D?=

Subject: Inconsistency with SU =?utf-8?Q?=E2=80=9CDe-Tech?= 4.0 - Survive 10
 days in the greenest region of Europe without your
 =?utf-8?Q?phone!=E2=80=9D?=

Side note: Thunderbird currently seems to collapse spaces when displaying the subject in the message view. This is tracked as bug 1259430.

dilyanpalauzov commented 4 months ago

I agree with you. I filed an issue for Evolution: https://gitlab.gnome.org/GNOME/evolution/-/issues/2784.