CyberShadow / DFeed

D news aggregator, newsgroup client, web newsreader and IRC bot
http://forum.dlang.org/help#about

Incorrect RFC 1522/2047 encoding of quoted text #44

Closed: schuetzm closed this issue 9 years ago

schuetzm commented 9 years ago

Example: http://forum.dlang.org/post/movlj4$29cb$1@digitalmars.com

CyberShadow commented 9 years ago

I don't understand: what is the desired action or change in DFeed here? As far as I can tell, DFeed encodes and decodes the Unicode characters correctly.

CyberShadow commented 9 years ago

If you mean that you see =?UTF-8?B?R3LDuHN0YWQi?= in Walter's message, that's because Walter's UA doesn't understand Unicode encoding of headers.

This encoding style is used ONLY in headers. If it is encountered in the message body, it is treated like any other sequence of characters.
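For reference, the encoded-word quoted above can be decoded by hand to see what it actually carries. The following is a minimal Python sketch, not DFeed code; the helper name `decode_b_word` is made up for illustration, and it handles only the "B" (Base64) form of RFC 2047 encoded-words.

```python
import base64
import re

# Shape of an RFC 2047 encoded-word: =?charset?encoding?payload?=
ENCODED_WORD = re.compile(r"=\?([^?]+)\?([BbQq])\?([^?]*)\?=")


def decode_b_word(word: str) -> str:
    """Decode a single Base64 ("B") encoded-word. Hypothetical helper."""
    match = ENCODED_WORD.fullmatch(word)
    if match is None or match.group(2).upper() != "B":
        raise ValueError("not a Base64 encoded-word: " + word)
    charset, _, payload = match.groups()
    # Base64 payload -> raw bytes -> text in the declared charset
    return base64.b64decode(payload).decode(charset)


decoded = decode_b_word("=?UTF-8?B?R3LDuHN0YWQi?=")
print(decoded)
```

The decoded text ends in a double quote, which means the closing quote of the display name was encoded together with the last word.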

CyberShadow commented 9 years ago

BTW I see no relation to the message's subject here. The subject of the message you linked is "Re: Rant after trying Rust a bit".

schuetzm commented 9 years ago

Sorry, I thought it was somehow caused by DFeed inserting the original sender into the reply without unescaping it. But you're right, Walter's using a different user agent which seems to have problems. Or, as someone else has since pointed out, Ola's newsreader wrongly uses quotes and encoded atoms at the same time. Sorry for the noise.

d-random-contributor commented 9 years ago

The original message was sent from DFeed, so the headers come from DFeed, don't they? The original diagnosis was that the escaped name is incorrectly quoted, which causes Thunderbird to choke on it. In case you don't want to read the Rust thread, here is the relevant post, copied:

I'd say it is a problem with the way the web interface encodes the sender name, and especially the fact that it starts with a double quote. In the message source, it looks like:

From: "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= <ola.fosheim.grostad+dlang@gmail.com>

According to RFC 2047 [1]: "An 'encoded-word' MUST NOT appear within a 'quoted-string'." (top of page 7), so this should be written as:

From: Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad+dlang@gmail.com>

    Jerome

[1] https://tools.ietf.org/html/rfc2047
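One way to produce a header of the corrected shape is to leave the display name unquoted and encode only the part that contains non-ASCII characters. The Python sketch below illustrates that idea under stated assumptions: `encode_display_name` and `format_from` are hypothetical names, not functions from DFeed or ae, and the sketch ignores the 75-character limit on encoded-words as well as ASCII names containing special characters such as commas.

```python
import base64


def encode_display_name(name: str) -> str:
    """Encode the non-ASCII part of a display name as one encoded-word.

    Everything from the first to the last non-ASCII word goes into a single
    encoded-word, because whitespace between two adjacent encoded-words is
    dropped when a reader decodes them.
    """
    words = name.split(" ")
    non_ascii = [i for i, w in enumerate(words) if not w.isascii()]
    if not non_ascii:
        return name  # pure ASCII: nothing to encode
    first, last = non_ascii[0], non_ascii[-1]
    span = " ".join(words[first:last + 1])
    payload = base64.b64encode(span.encode("utf-8")).decode("ascii")
    encoded = "=?UTF-8?B?" + payload + "?="
    return " ".join(words[:first] + [encoded] + words[last + 1:])


def format_from(author: str, email: str) -> str:
    # No double quotes around the name: an encoded-word must not
    # appear inside a quoted-string.
    return encode_display_name(author) + " <" + email + ">"


print(format_from("Ola Fosheim Grøstad", "ola.fosheim.grostad+dlang@gmail.com"))
```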

CyberShadow commented 9 years ago

I am away from D stuff so haven't read the entire thread. Can you propose a pull request? The relevant code is in ae.net.ietf.message, encodeRfc1522 function.

d-random-contributor commented 9 years ago

As I understand it, it's currently called like this: encodeRfc1522("\"Ola Fosheim Grøstad\" (email)"). It should be called like this: encodeRfc1522("Ola Fosheim Grøstad (email)").

d-random-contributor commented 9 years ago

- headers["From"] = format(`"%s" <%s>`, author, authorEmail);
+ headers["From"] = format(`%s <%s>`, author, authorEmail);

Something like this.
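To see why dropping the quotes from the format string matters, here is a small Python sketch of a word-by-word encoder. It assumes that the encoder splits on spaces and Base64-encodes each non-ASCII word; `naive_encode` is a hypothetical stand-in written for this illustration, not the actual encodeRfc1522 from ae, whose implementation is not shown in this thread.

```python
import base64


def naive_encode(header_value: str) -> str:
    """Hypothetical word-by-word encoder (assumption, not the ae code)."""
    out = []
    for word in header_value.split(" "):
        if word.isascii():
            out.append(word)  # ASCII words pass through unchanged
        else:
            payload = base64.b64encode(word.encode("utf-8")).decode("ascii")
            out.append("=?UTF-8?B?" + payload + "?=")
    return " ".join(out)


email = "ola.fosheim.grostad+dlang@gmail.com"

# Old format string: the name is wrapped in double quotes before encoding,
# so the closing quote is swallowed into the encoded-word.
quoted = naive_encode('"Ola Fosheim Grøstad" <' + email + ">")

# New format string: no quotes, so only the name itself is encoded.
unquoted = naive_encode("Ola Fosheim Grøstad <" + email + ">")

print(quoted)
print(unquoted)
```

Under that assumption, the quoted input reproduces the malformed header from the message source (an opening quote with no visible closing quote), while the unquoted input yields the form suggested by RFC 2047.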

CyberShadow commented 9 years ago

Let's try it:

https://github.com/CyberShadow/ae/commit/17bf03e6121a3338c4310ee5ad9728d60b25947d

Hopefully this will not cause issues.