jstedfast / gmime

A C/C++ MIME creation and parser library with support for S/MIME, PGP, and Unix mbox spools.
GNU Lesser General Public License v2.1

Don't canonicalise host names in Content-Id values #38

Closed. mjog closed this issue 6 years ago.

mjog commented 6 years ago

Sorry to basically re-open Bug 762782, but there's a pretty good argument for not canonicalising the URI in a Content-Id. The message body itself refers to the CID URLs as-is in IMG SRC values, so canonicalising breaks not only displaying the message but also forwarding it. The application has no way to work around this other than parsing the HTML parts, guessing what GMime has done to the CID, and altering the message body to fix it.

It's kind of like a web browser canonicalising outgoing requests for URLs with trailing slashes by removing them, e.g. sending a GET for http://foo/bar when the user clicks on a link with http://foo/bar/ as the HREF. For the HTTP URL scheme these refer to the same resource, and it's shorter and arguably "more canonical" to remove the trailing slash, but it would break a whole lot of web sites.

I'm all for canonicalising the Content-ID header value, but leaving the URL alone so it matches any CID URLs in the message body is definitely the way to go.
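To make the mismatch concrete, here is a minimal Python sketch of the lookup an application has to do. It is not GMime code: `cid_url_to_content_id` and `strip_trailing_dot` are hypothetical helpers, the sample Content-Id is made up, and the canonicalisation step (dropping a trailing "." from the host part) is only an assumption about the kind of rewrite being discussed. The cid: URL mapping follows RFC 2392, where the URL carries the percent-encoded Content-ID without its angle brackets.

```python
from urllib.parse import unquote


def cid_url_to_content_id(url: str) -> str:
    """Map a cid: URL to the Content-ID header value it refers to (RFC 2392)."""
    scheme, sep, rest = url.partition(":")
    if not sep or scheme.lower() != "cid":
        raise ValueError(f"not a cid: URL: {url!r}")
    # The URL holds the percent-encoded id without the angle brackets.
    return "<" + unquote(rest) + ">"


def strip_trailing_dot(content_id: str) -> str:
    """Hypothetical canonicaliser: drop a trailing '.' from the host part."""
    inner = content_id.strip()
    if inner.startswith("<") and inner.endswith(">"):
        inner = inner[1:-1]
    local, at, host = inner.rpartition("@")
    return "<" + local + at + host.rstrip(".") + ">"


# Made-up sample data: the header as sent, and the matching IMG SRC value.
raw_header = "<image001@host.example.>"
img_src = "cid:image001@host.example."

# Index the MIME parts by the raw id and by the canonicalised id.
parts_by_raw_id = {raw_header: "image/png part"}
parts_by_canonical_id = {strip_trailing_dot(raw_header): "image/png part"}

wanted = cid_url_to_content_id(img_src)
print(parts_by_raw_id.get(wanted))        # image/png part
print(parts_by_canonical_id.get(wanted))  # None
```

With only the canonicalised id available, the exact-match lookup fails, even though the HTML body and the original header agree with each other.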

jstedfast commented 6 years ago

I'm sorry but you are wrong.

Thunderbird does things the way you suggest and they are broken: https://bugzilla.mozilla.org/show_bug.cgi?id=1149663

You HAVE TO canonicalize the CIDs or you are bound to fail.

jstedfast commented 6 years ago

FWIW, you can get the raw Content-Id value by requesting the raw header in GMime.

mjog commented 6 years ago

I'm not saying don't canonicalise the CIDs, I'm saying don't canonicalise the CID URLs. Big difference.

mjog commented 6 years ago

Eh, to be more specific: "Don't canonicalise the RFC 5322 obs-id-right part of a Message ID part of a Content ID header value".

The trailing "." is allowed by the text of RFC 5322 3.4.1, so since it is found in the wild and breaks API users, it should be left alone despite being obsolete.