jens-maus / yam

YAM (short for 'Yet Another Mailer') is a MIME-compliant open-source Internet email client written for Amiga-based computer systems (AmigaOS4, AmigaOS3, MorphOS, AROS). It supports POP3, SMTP, TLSv1/SSLv3 connection security, multiple users, multiple identities, PGPv2/v5 encryption, unlimited hierarchical folders, an ARexx interface, and more.
https://yam.ch
GNU General Public License v2.0

Handling of charsets (e.g. UTF-8) when using external editor. #7

Closed by jens-maus 8 years ago

jens-maus commented 8 years ago

Originally by emptystate@yahoo.co.uk on 2010-03-03 15:10:34 +0100


When receiving an email with a charset that is not supported by the font system (I think; I don't know how YAM detects this charset), YAM shows a question mark for every character that is not standard ASCII. When YAM passes this mail to an external program or saves it to a file, it still replaces the characters with question marks. YAM should not replace these characters in these situations.

For example, my friend recently sent me an email in UTF-8 with an attachment. Their MUA decided to package the text as base64. When I read this mail in YAM, I get "????? ?? ????" etc. in the message window. If I save the displayed message I get the same "????? ?? ????" etc.; if I save the raw message I get base64. If I try to reply and call an external editor that can handle the character coding, YAM passes it the "????? ?? ????" text instead of the original text.
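The symptom described above can be reproduced outside YAM. The following is a minimal Python sketch, not YAM's actual code: the sample text and the choice of ISO-8859-1 as the local charset are assumptions made purely for illustration. It shows that decoding the Base64 layer recovers the original UTF-8 bytes, and that the information is only lost in the later conversion to a local 8-bit charset.

```python
import base64

# Hypothetical message body: UTF-8 text, transported as Base64.
original = "Привет, мир"
transported = base64.b64encode(original.encode("utf-8"))

# Step 1: undo the transfer encoding. This yields the original UTF-8 bytes.
utf8_bytes = base64.b64decode(transported)

# Step 2: convert to a local 8-bit charset that lacks these characters.
# errors="replace" substitutes "?" for every character that cannot be encoded.
displayed = utf8_bytes.decode("utf-8").encode("iso-8859-1", errors="replace")

print(displayed)  # b'??????, ???'
```

The bytes produced by step 1 are what an external UTF-8 capable editor would need; the output of step 2 is what is reportedly being passed on instead.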

Just FYI, this did not occur in YAM 2.4.

jens-maus commented 8 years ago

Originally by hill28 on 2010-03-03 15:11:19 +0100


I think I see what's happening: codesets.library sees that the text cannot be converted into something compatible with my locale.

If that's the case then this is a big problem. I don't want to switch my system locale every time I get an email in a different language. Also, I can't see MorphOS or AmigaOS supporting all languages any time soon, so the only way I can access the text will be to save it as raw from YAM, base64 decode it and then load it into an appropriate viewer.

Is it possible to ask codesets.library whether it can convert the text or not and, if not, give the user the option of saving the mail in the original coding (i.e. base64 decoded, headers removed etc.)?
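The "ask first, fall back to the original coding" idea can be sketched as follows. This is Python using its standard codecs as a stand-in; it does not use or reflect the codesets.library API, and the function names `can_convert` and `bytes_to_save` are hypothetical. For brevity the sketch falls back automatically rather than prompting the user.

```python
def can_convert(text: str, local_charset: str) -> bool:
    """Return True if every character of text exists in local_charset."""
    try:
        text.encode(local_charset)  # strict mode: raises on the first failure
        return True
    except UnicodeEncodeError:
        return False


def bytes_to_save(utf8_bytes: bytes, local_charset: str) -> bytes:
    """Convert to the local charset if lossless, else keep the UTF-8 bytes."""
    text = utf8_bytes.decode("utf-8")
    if can_convert(text, local_charset):
        return text.encode(local_charset)
    # Conversion would be lossy, so keep the original coding untouched.
    return utf8_bytes
```

With ISO-8859-1 as the local charset, "café" would be converted, while Japanese or Cyrillic text would be saved unchanged as UTF-8.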

jens-maus commented 8 years ago

Originally on 2010-03-07 14:05:08 +0100


Can you please supply the mail in question that is base64 encoded and causes the issue to appear so that we can reproduce the issue on our own machine and think about a possible solution?

jens-maus commented 8 years ago

Originally by emptystate@yahoo.co.uk on 2010-03-18 08:08:55 +0100


I was the reporter of this bug.

I just checked the behaviour of this problem in YAM 2.6 and I need to revise the description a little.

  1. I receive an email in UTF-8, encoded as Base64.
  2. YAM displays ????? for glyphs in the message (glyphs for this coding are not available so this is OK).
  3a. Saving raw text results in Base64 plus headers, as expected.
  3b. Saving the displayed text results in a "???? ????" type message (not UTF-8).
  3c. If I hit reply and then call an external editor, the text passed is the "???? ????" type message.

When I receive a message coded in this style I have to save in raw format, edit to remove the email headers and decode the Base64. This is functionality already existing in YAM.

I think the correct behaviour should be:

3b. YAM decodes the message from Base64 and saves UTF-8.
3c. YAM passes the UTF-8 message to the external editor instead of the processed version.
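The requested behaviour for 3b can be illustrated with Python's standard `email` package. This is only a sketch of the idea, not how YAM implements it; the test message, the sender address and the Greek sample text are made up for the example.

```python
import base64
from email import message_from_bytes

# Hypothetical test message: UTF-8 body, Base64 transfer encoding.
body = "Γειά σου Κόσμε\n".encode("utf-8")
raw_mail = (
    b"From: friend@example.org\r\n"
    b"Subject: test\r\n"
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: text/plain; charset=UTF-8\r\n"
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n" + base64.encodebytes(body)
)

msg = message_from_bytes(raw_mail)

# decode=True undoes only the transfer encoding (Base64 here) and returns
# bytes, so the text is still in its original charset, with headers removed.
decoded = msg.get_payload(decode=True)
charset = msg.get_content_charset()  # charset declared in Content-Type

print(charset, decoded == body)
```

Writing `decoded` to a file, or handing it to an editor, gives the UTF-8 text without the manual "save raw, strip headers, decode Base64" steps.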

I am sure you can easily manufacture your own test message, but let me know if you still need test data.

jens-maus commented 8 years ago

Originally on 2010-03-18 08:16:32 +0100


Ah, OK. Now I get it. So the problem is that YAM does not simply pass the UTF-8 mail through to an external editor. However, we have to find a general solution for that, because there might be users without a UTF-8 capable editor and thus they don't want to have 3-byte characters in their external editors. So we have to think about it once more. But I admit that the current solution is suboptimal.

jens-maus commented 8 years ago

Originally by emptystate@yahoo.co.uk on 2010-03-18 10:50:15 +0100


> thus they don't want to have 3-byte characters in their external editors

UTF-8 encoded text may have 1, 2, 3 or 4 bytes per character.
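A quick way to check this, as a small Python sketch (the four sample characters are arbitrary picks, one for each length):

```python
# One sample character for each possible UTF-8 length.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]  # A, e-acute, euro sign, emoji

for ch in samples:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X} -> {len(encoded)} byte(s): {encoded.hex()}")

byte_lengths = [len(ch.encode("utf-8")) for ch in samples]
# byte_lengths is [1, 2, 3, 4]
```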

> there might be users without a UTF-8 capable editor

How can YAM know what software I have and what it is capable of? Surely external software will have to decide for itself if it can handle the input?

jens-maus commented 8 years ago

Originally on 2010-03-18 10:52:29 +0100


Replying to emptystate@…:

> > there might be users without a UTF-8 capable editor
>
> How can YAM know what software I have and what it is capable of? Surely external software will have to decide for itself if it can handle the input?

YAM cannot know that. That's why we have to think about it carefully and possibly introduce an option to enable or disable the new functionality once we have implemented it.
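Such a switch could look roughly like the following Python sketch. The function name `text_for_editor`, the `pass_utf8` flag and the ISO-8859-1 default are all hypothetical; nothing here reflects an actual YAM configuration option.

```python
def text_for_editor(
    utf8_bytes: bytes,
    pass_utf8: bool,
    local_charset: str = "iso-8859-1",
) -> bytes:
    """Choose what to hand to the external editor, based on a user option."""
    if pass_utf8:
        # The user has declared the editor UTF-8 capable: pass text unchanged.
        return utf8_bytes
    # Otherwise convert to the local charset; unmappable characters become "?".
    return utf8_bytes.decode("utf-8").encode(local_charset, errors="replace")
```

Leaving the decision to the user sidesteps the problem that the mail client cannot detect what the external editor is capable of.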

jens-maus commented 8 years ago

Originally on 2013-08-19 22:33:41 +0200


In (282f4f9):

jens-maus commented 8 years ago

Originally on 2013-08-24 00:40:51 +0200


In (5c05993):

jens-maus commented 8 years ago

Originally on 2013-08-26 01:37:59 +0200


In (43172a9):

jens-maus commented 8 years ago

Originally on 2013-12-22 18:55:18 +0100


Release Note: improved charset/codeset handling support when using external editors aware of UTF-8