jens-maus / yam

YAM (short for 'Yet Another Mailer') is a MIME-compliant open-source Internet email client written for Amiga-based computer systems (AmigaOS4, AmigaOS3, MorphOS, AROS). It supports POP3, SMTP, TLSv1/SSLv3 connection security, multiple users, multiple identities, PGPv2/v5 encryption, unlimited hierarchical folders, an ARexx interface, and more.
https://yam.ch
GNU General Public License v2.0

In HTML files sent as attachments, non-breaking spaces (typed in with Alt+Space) are transformed #600

Open jens-maus opened 8 years ago

jens-maus commented 8 years ago

Originally by JDuch@fulladsl.be on 2015-05-12 22:09:09 +0200


Summary

Whereas SimpleMail sends the same file containing non-breaking spaces undistorted, YAM 2.10 Dev does not.

Steps to reproduce

1. Insert the line "NBSstart             NBSstop" in an HTML file.
2. Attach the file to a mail addressed to yourself.
3. Inspect the file attached to the received mail and look at the inserted line.

Expected results

Inserted line should still be the same: "NBSstart             NBSstop"

Actual results

"NBSstart             NBSstop" when looked at with CED (in CED, the blanks in the preceding line appear as squares). In IBrowse, the line is presented as: "NBSstartÃfÂ,Ã, ÃfÂ,Ã, ÃfÂ,Ã, ÃfÂ,Ã, ÃfÂ,Ã, ÃfÂ,Ã, ÃfÂ,Ã, ÃfÂ,Ã, ÃfÂ,Ã, ÃfÂ,Ã, ÃfÂ,Ã, ÃfÂ,Ã, ÃfÂ,Ã, NBSstop"
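For illustration only: one mechanism that can produce this kind of growing garbage is a single NBSP byte (0xA0) being repeatedly read as ISO-8859-1 and written back as UTF-8. This is an assumption about how such output can arise in general, not a diagnosis confirmed in this ticket, and the helper name `misdecode_rounds` is hypothetical. A minimal Python sketch:

```python
def misdecode_rounds(data: bytes, rounds: int) -> bytes:
    """Treat the bytes as ISO-8859-1 text and re-save them as UTF-8, `rounds` times."""
    for _ in range(rounds):
        data = data.decode("iso-8859-1").encode("utf-8")
    return data


nbsp = b"\xa0"  # one non-breaking space in ISO-8859-1 / ISO-8859-15
for n in range(4):
    out = misdecode_rounds(nbsp, n)
    # Every round doubles the length: 1, 2, 4, 8 bytes for a single NBSP.
    print(n, len(out), out.hex(" "))
```

Each wrong round turns every byte at or above 0x80 into two bytes, so one blank ends up displayed as a run of several accented characters.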

Regression

Notes

tboeckel commented 8 years ago

Originally on 2015-05-13 13:15:20 +0200


Are you sure you really attached a file containing NBSP characters (0xa0)?

I just created an example text file and verified that it really contains NBSP characters. For me, YAM correctly encodes it as quoted-printable and restores the file byte for byte after receiving it again.

tboeckel commented 8 years ago

Originally on 2015-05-13 13:16:45 +0200


Attachment added: nbsp_test.lha (0.1 KiB) text file with NBSP characters

jens-maus commented 8 years ago

Originally by JDuch@fulladsl.be on 2015-05-14 18:55:48 +0200


I used your file as is and also inserted its contents into an HTML file, nbsp_test.html. I attached both to a YAM mail addressed to myself and saved the results as nbsp_test_returned.txt and nbsp_test_returned.html.

Those files are uploaded in nbsp_.lha.

jens-maus commented 8 years ago

Originally by JDuch@fulladsl.be on 2015-05-14 18:56:56 +0200


Attachment added: nbsp_.lha (0.3 KiB)

tboeckel commented 8 years ago

Originally on 2015-05-18 08:54:26 +0200


No problem here. Your HTML document is encoded as quoted-printable by YAM and as base64 by Thunderbird. YAM then correctly saves both its own attachment and Thunderbird's, each byte for byte identical to the original file.

jens-maus commented 8 years ago

Originally by JDuch@fulladsl.be on 2015-05-18 15:41:30 +0200


Maybe because I am using YAM in a French locale and with charset ISO-8859-15?

I noted that the raw sent message refers to ISO-8859-1, not ISO-8859-15:

    ----=_BOUNDARY.5efd37f068e04a7c.ed
    Content-Type: text/plain; charset=ISO-8859-1; name="nbsp_test.txt"
    Content-Disposition: attachment; filename="nbsp_test.txt"; size=17
    Content-Transfer-Encoding: quoted-printable

    start=A0=A0=A0=A0=A0=A0=A0=A0stop
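The quoted-printable body above can be checked independently of any mail client. The following is a small sketch using Python's standard `quopri` module (not YAM's own decoder); it decodes the body line and compares the result with the `size=17` parameter from the Content-Disposition header:

```python
import quopri

# Body line copied from the raw message part.
body = b"start=A0=A0=A0=A0=A0=A0=A0=A0stop"

raw = quopri.decodestring(body)   # each "=A0" becomes the single byte 0xA0
print(len(raw))                   # 17, matching size=17 in the header
print(raw.count(b"\xa0"))         # 8 NBSP bytes

# 0xA0 maps to U+00A0 (NO-BREAK SPACE) under the declared charset.
text = raw.decode("iso-8859-1")
print(text.count("\u00a0"))       # 8
```

If the decoded bytes match the original file, the transfer encoding itself has not altered the NBSP characters.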

tboeckel commented 8 years ago

Originally on 2015-05-19 08:32:54 +0200


Replying to JosDuchIt:

> I noted that the raw sent message refers to ISO-8859-1, not ISO-8859-15

You can define different charsets for GUI and for writing mails. Usually the GUI charset matches your system charset and YAM will warn you otherwise. The charset for writing mails also defaults to the system charset, but this one can be adjusted without any restrictions.

All text attachments (e.g. .txt, .html, mails, etc.) will get the "write mail charset" included in the Content-Type header. Unfortunately it is impossible to correctly detect the encoding of a text file, because nobody can tell you whether a character beyond 0x80 exists because it is a German umlaut, or whether it exists because it introduces a UTF8 sequence. That's why YAM must rely on the user settings. No matter which charset you are using, YAM will never reencode the attached file. It will just declare the file to be encoded in the selected write charset.
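To make the ambiguity concrete, here is a short Python sketch (an illustration only, not YAM code). It shows that the same bytes decode without error under more than one charset, and that the NBSP byte 0xA0 means the same character in ISO-8859-1 and ISO-8859-15:

```python
# The two bytes C3 A4 are a valid UTF-8 "ä" and also two valid ISO-8859-1 characters.
umlaut_utf8 = "ä".encode("utf-8")
as_utf8 = umlaut_utf8.decode("utf-8")          # "ä"
as_latin1 = umlaut_utf8.decode("iso-8859-1")   # "Ã¤"
print(as_utf8, as_latin1)

# The byte A4 is a different character in ISO-8859-1 and ISO-8859-15 ...
print(b"\xa4".decode("iso-8859-1"), b"\xa4".decode("iso-8859-15"))  # ¤ €

# ... but A0 is NO-BREAK SPACE (U+00A0) in both of them.
nbsp_latin1 = b"\xa0".decode("iso-8859-1")
nbsp_latin9 = b"\xa0".decode("iso-8859-15")
print(nbsp_latin1 == nbsp_latin9 == "\u00a0")  # True
```

Since neither decode fails, the bytes alone cannot tell a receiver which interpretation was intended; only the declared charset can.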

But as you can see, YAM definitely attached the file correctly without changing any of the NBSP characters:

start=A0=A0=A0=A0=A0=A0=A0=A0stop

So the final question is whether we really have a problem/bug in YAM here, or whether it is just a matter of misunderstanding and possibly wrong handling of the original file by several text editors?

tboeckel commented 8 years ago

Originally on 2015-05-19 16:07:12 +0200


In (053ca93):

jens-maus commented 8 years ago

Originally by JDuch@fulladsl.be on 2015-05-19 18:17:57 +0200


Replying to tboeckel:

In (053ca93):

> * YAM_UT.c, WriteWindow.c: implemented a function to check whether a string is correctly UTF8 encoded and contains at least one UTF8 character. Based on this check the charset of text attachments is forced to either UTF8 or ISO-8859-1 instead of the configured write mail charset. This refs #600. Please note that YAM does NO reencoding, it just gives the receiver a hint how to handle the attached file.
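The check described in the commit message can be sketched roughly as follows. This is a Python illustration of the stated rule only; the actual implementation lives in YAM's C sources (YAM_UT.c, WriteWindow.c) and may differ, and the function name `guess_attachment_charset` is hypothetical:

```python
def guess_attachment_charset(data: bytes) -> str:
    """Return the charset hint to declare for a text attachment.

    Rule as described in the commit message: declare UTF-8 only if the data
    is valid UTF-8 AND contains at least one non-ASCII character; otherwise
    declare ISO-8859-1. The data itself is never reencoded.
    """
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return "ISO-8859-1"  # not well-formed UTF-8
    has_non_ascii = any(ord(ch) > 0x7F for ch in text)
    return "UTF-8" if has_non_ascii else "ISO-8859-1"


print(guess_attachment_charset(b"start\xa0\xa0stop"))                      # ISO-8859-1
print(guess_attachment_charset("start\u00a0\u00a0stop".encode("utf-8")))   # UTF-8
print(guess_attachment_charset(b"plain ascii"))                            # ISO-8859-1
```

A lone 0xA0 byte is not valid UTF-8, so a file with single-byte NBSP characters gets the ISO-8859-1 hint under this rule.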
jens-maus commented 8 years ago

Originally by JDuch@fulladsl.be on 2015-05-19 18:29:26 +0200


I think I understand now:

> You can define different charsets for GUI and for writing mails. Usually the GUI charset matches your system charset and YAM will warn you otherwise. The charset for writing mails also defaults to the system charset, but this one can be adjusted without any restrictions.

I changed it to ISO-8859-15 too. I'll report on the results. Thanks for the help.

jens-maus commented 8 years ago

Originally by JDuch@fulladsl.be on 2015-05-19 20:01:52 +0200


All I can tell is that the attached files now show the new YAM write charset, ISO-8859-15. When saved and viewed in a viewer, editor or browser, the result is unchanged. I guess ticket #600 will take care of it.

    ----=_BOUNDARY.5e3e94806051e251.e2
    Content-Type: text/plain; charset=ISO-8859-15; name="nbsp_test.txt"
    Content-Disposition: attachment; filename="nbsp_test.txt"; size=17
    Content-Transfer-Encoding: quoted-printable

    start=A0=A0=A0=A0=A0=A0=A0=A0stop

tboeckel commented 8 years ago

Originally on 2015-05-21 07:28:17 +0200


I really have to ask again: do we still have a problem here or is this issue invalid? You never provided your original files, but just the received version of my files. Reading your latest answer makes me think that the issue is solved. It would be nice if you would either confirm this or provide all necessary information to let me reproduce the problems you are facing.

jens-maus commented 8 years ago

Originally by JDuch@fulladsl.be on 2015-05-21 15:13:47 +0200


We really still have the same problem. Even after synchronising all charset settings to ISO-8859-15, I reported "same result". I tested using your original files and assumed, of course, that only how YAM returned and saved the returned file was of interest to you.

In fact the test is still simpler:

In my last reaction, I interpreted this text

> YAM_UT.c, WriteWindow.c: implemented a function to check whether a string is correctly UTF8 encoded and contains at least one UTF8 character. Based on this check the charset of text attachments is forced to either UTF8 or ISO-8859-1 instead of the configured write mail charset. This refs #600. Please note that YAM does NO reencoding, it just gives the receiver a hint how to handle the attached file.

in the following way: I concluded that you had identified the origin of the problem, namely that presently "the configured write mail charset" is not respected (ISO-8859-1 instead of ISO-8859-15), and I expressed the hope that you (ticket #600) would fix it.

Of course, I cannot be sure I have given you everything you need to reproduce the problem. As far as I am concerned, I don't know how to avoid it in these circumstances:

In fact, I just changed all of these settings to ISO-8859-1 and redid the test described in this last post (steps 1 to 3): same result.

Visually, the files are changed and contain the line "start        stop"

I would be very puzzled if I were the only one experiencing this.

Action?