jstedfast / gmime

A C/C++ MIME creation and parser library with support for S/MIME, PGP, and Unix mbox spools.
GNU Lesser General Public License v2.1

Should "From " lines be escaped by MIME structure? #116

Closed bremner closed 2 years ago

bremner commented 2 years ago

Transcribing id:87h79dnwok.fsf@tethera.net, which seems to have been caught in the list's spam filter.

In notmuch we are using g_mime_parser_set_format (parser, GMIME_FORMAT_MBOX)
to try to distinguish mboxes containing multiple messages from those containing
only a single message [1]. The attached message breaks this, because it has
an unescaped "From " inside a (text/plain) attachment.

My question is if you consider this a gmime bug, or should mbox really
take precedence over mime structure? I guess I can see both points of
view, although it would be convenient for notmuch if gmime would
consider wrapping in mime structure as a kind of escaping.

cheers,

David

[1]: this is one of those regrettable backwards compatibility things. We
tried to get rid of it about 7 years ago, and failed.

foo.txt

jstedfast commented 2 years ago

Mbox has to take precedence over MIME structure, I think.

There are some GMimeParser hooks that can warn about various things. Maybe a warning could be added for this case and that could be used by NotMuch to decide if something truly is an mbox or not.

bremner commented 2 years ago

Jeffrey Stedfast writes:

> Mbox has to take precedence over MIME structure, I think.
>
> There are some GMimeParser hooks that can warn about various things. Maybe a warning could be added for this case and that could be used by NotMuch to decide if something truly is an mbox or not.

That could potentially be useful. What would the warning detect? "^From " at the "top level" of mime structure? Is that well defined?

d

jstedfast commented 2 years ago

I had a "shower thought" this morning and just confirmed with foo.txt, which was that if you read the first line of the input file and it matches "From " for the first 5 characters, then it's probably an mbox - otherwise it's definitely not.

That might be the best way.
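A minimal sketch of that first-line check, assuming plain stdio rather than any GMime API. The helper names `starts_with_from` and `file_looks_like_mbox` are hypothetical and do not come from GMime or notmuch:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: true if the buffer begins with the 5 bytes "From ". */
bool starts_with_from(const char *buf, size_t len)
{
    return len >= 5 && memcmp(buf, "From ", 5) == 0;
}

/* Hypothetical helper: peek at the first 5 bytes of a file.
 * A true result means "probably an mbox"; false means "not an mbox"
 * (or the file could not be opened). */
bool file_looks_like_mbox(const char *path)
{
    char head[5];
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return false;
    size_t n = fread(head, 1, sizeof head, fp);
    fclose(fp);
    return starts_with_from(head, n);
}
```

Note that a message whose first line is a `From:` header does not match, because the fifth byte is `:` rather than a space.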

bremner commented 2 years ago

Jeffrey Stedfast writes:

> I had a "shower thought" this morning and just confirmed with foo.txt, which was that if you read the first line of the input file and it matches "From " for the first 5 characters, then it's probably an mbox - otherwise it's definitely not.
>
> That might be the best way.

Yes, that's a good thought, and roughly what we do already. Unfortunately notmuch is trying to distinguish between mboxes with 1 message and those with multiple messages, and the first line is not enough information for that.
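To illustrate the ambiguity, here is a rough sketch of a naive separator count. The function `count_from_lines` is a hypothetical helper, not part of notmuch or GMime. It counts every line that begins with "From ", so it cannot tell a two-message mbox from a single message whose body contains an unescaped "From " line:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the lines in a NUL-terminated buffer that
 * begin with "From ". This looks only at line starts and knows nothing
 * about MIME structure. */
size_t count_from_lines(const char *data)
{
    size_t count = 0;
    int at_line_start = 1;

    for (const char *p = data; *p != '\0'; p++) {
        if (at_line_start && strncmp(p, "From ", 5) == 0)
            count++;
        at_line_start = (*p == '\n');
    }
    return count;
}
```

Both kinds of input give the same count, which is why the first line alone, or a plain line scan, does not settle the question.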

d

jstedfast commented 2 years ago

The code that saved the message to an mbox file should have munged the From to be >From.

I don't think there's an easy way to get the GMimeParser to handle this :(
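For reference, the munging described above could look roughly like this on the writing side. This is a sketch of the common "mboxo"-style escaping, where only lines starting with "From " are prefixed. The function name `mbox_escape` and its buffer interface are assumptions for illustration, not GMime API:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy msg into out, inserting '>' before every line
 * that begins with "From ". Returns the number of bytes written (excluding
 * the terminating NUL), or (size_t)-1 if out is too small. */
size_t mbox_escape(const char *msg, char *out, size_t out_size)
{
    size_t o = 0;
    int at_line_start = 1;

    for (const char *p = msg; *p != '\0'; p++) {
        if (at_line_start && strncmp(p, "From ", 5) == 0) {
            if (o + 1 >= out_size)
                return (size_t)-1;
            out[o++] = '>';
        }
        if (o + 1 >= out_size)
            return (size_t)-1;
        out[o++] = *p;
        at_line_start = (*p == '\n');
    }
    if (o >= out_size)
        return (size_t)-1;
    out[o] = '\0';
    return o;
}
```

The writer would apply this to the message content only, then emit its own "From " separator line ahead of it. Lines that already start with ">From " are left alone here, so the escaping is not reversible in every case.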