npp-plugins / mimetools


Quoted-printable Encode fails #15

Closed kr580vm80a closed 6 months ago

kr580vm80a commented 2 years ago

I have the word 'банк' (bank) in Ukrainian. When I try to encode this word, I get the result '=D0=B1=D0=B0=D0=ㄽD0=BA', but the correct result is '=D0=B1=D0=B0=D0=BD=D0=BA'.

Fix please.

kr580vm80a commented 2 years ago

There is one more problem: '=D1=80=D0=B5=D0=B3=D0=B8=D1=81=D1=82=D1=80=D0=B8=D1=80=D0=BE=D0=B2=D0=B0=D0=BB=D0=B8=D1=81=D1=8C' is NOT decoded. It should decode to 'регистрировались' (Russian).

Fix please.

donho commented 6 months ago

The wrong encoded result for банк is indeed a bug.

However, the following is not a bug:

'=D1=80=D0=B5=D0=B3=D0=B8=D1=81=D1=82=D1=80=D0=B8=D1=80=D0=BE=D0=B2=D0=B0=D0=BB=D0=B8=D1=81=D1=8C' NOT decoded.

Here is the explanation of the QP 76-character line length limit from Wikipedia: "QP works by using the equals sign = as an escape character. It also limits line length to 76, as some software has limits on line length."

Lines of Quoted-Printable encoded data must not be longer than 76 characters. To satisfy this requirement without altering the encoded text, soft line breaks may be added as desired. A soft line break consists of an = at the end of an encoded line, and does not appear as a line break in the decoded text. These soft line breaks also allow encoding text without line breaks (or containing very long lines) for an environment where line size is limited, such as the 1000 characters per line limit of some SMTP software, as allowed by RFC 2821.

(ref: https://en.wikipedia.org/wiki/Quoted-printable)

So регистрировались should be encoded into:

=D1=80=D0=B5=D0=B3=D0=B8=D1=81=D1=82=D1=80=D0=B8=D1=80=D0=BE=D0=B2=D0=B0=D0=
=BB=D0=B8=D1=81=D1=8C

(with a soft line break, i.e. = followed by EOL, splitting the line) instead of

=D1=80=D0=B5=D0=B3=D0=B8=D1=81=D1=82=D1=80=D0=B8=D1=80=D0=BE=D0=B2=D0=B0=D0=BB=D0=B8=D1=81=D1=8C

(on one line)
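To make the 76-character rule concrete, here is a minimal Python sketch of the wrapping described above. It is not the MIME Tools plugin code: `qp_encode_all` and `qp_decode` are hypothetical helper names, and the encoder makes the simplifying assumption that every byte is escaped as `=XX` (a real Quoted-Printable encoder leaves most printable ASCII unescaped). For the two words in this thread, which contain only non-ASCII characters, that simplification does not change the result.

```python
MAX_LINE = 76  # maximum encoded line length, including the "=" of a soft break


def qp_encode_all(data: bytes) -> str:
    """Escape every byte as =XX and wrap with soft line breaks.

    Illustrative assumption: all bytes are escaped, which is only
    what a real encoder does for non-printable or non-ASCII bytes.
    """
    lines = []
    current = ""
    for byte in data:
        token = f"={byte:02X}"
        # Keep one column free for the trailing "=" of a soft line break.
        if len(current) + len(token) > MAX_LINE - 1:
            lines.append(current + "=")
            current = ""
        current += token
    lines.append(current)
    return "\n".join(lines)


def qp_decode(text: str) -> bytes:
    """Decode ASCII-only QP text, removing soft line breaks first."""
    joined = text.replace("=\r\n", "").replace("=\n", "")
    out = bytearray()
    i = 0
    while i < len(joined):
        if joined[i] == "=":
            # "=XX" is one byte written as two hexadecimal digits.
            out.append(int(joined[i + 1:i + 3], 16))
            i += 3
        else:
            out.append(ord(joined[i]))
            i += 1
    return bytes(out)


short = qp_encode_all("банк".encode("utf-8"))
wrapped = qp_encode_all("регистрировались".encode("utf-8"))
print(short)
print(wrapped)
print(qp_decode(wrapped).decode("utf-8"))
```

Under these assumptions the first word stays on one line, while the second is split after 25 escaped bytes (75 characters plus the trailing `=`), matching the two-line form shown above. Python's standard `quopri` module is the place to look for a complete implementation.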