jstedfast / gmime

A C/C++ MIME creation and parser library with support for S/MIME, PGP, and Unix mbox spools.
GNU Lesser General Public License v2.1
113 stars 36 forks source link

Headers parser error corner case #78

Closed albrechtd closed 4 years ago

albrechtd commented 4 years ago

Hi,

I ran into a strange parser issue which can be reproduced using the attached minimal sample (which is clean and RFC-compliant IMHO), e.g. by running the msgcheck example application:

albrecht@deneb:~/OSS/gmime$ examples/msgcheck sample_parser.mbx 

(process:1907): gmime-WARNING **: 19:04:18.557: Invalid header at 4095: '
'
sample_parser.mbx: message contains 2 RFC violations:
offset 4095: [9] invalid header name, parser may skip the message or parts of it: ''
offset 4097: [9] invalid header name, parser may skip the message or parts of it: 'c29tZSBvdGhlciBkYXRhLi4uCg=='

The issue seems to be reproducible if

Apparently, the warning is produced in gmime/gmime-parser.c, function header_parse(), line 888.

Simply shifting the 2nd message part a byte forward or backward (by adding or removing a char in the 1st part) “fixes” the issue.

It might be possible that the call to _g_mime_parser_options_warn() (IIRC, it was me who added it there…😕️) in line 887 is just wrong, but at first glance this looks more like a corner case where the buffer is not filled sufficiently for parsing the headers.

Any idea what goes wrong here?

Thanks in advance, Albrecht.

P.S.: BTW, it is not possible to build gmime with the --enable-warnings configure option due to a compiler error in internet-address.c, so I had to enable the warning manually in gmime-parser.c.

sample_parser.zip

albrechtd commented 4 years ago

Unfortunately, this bug

Two POC messages are attached:

The bug is visible in Balsa, both in the master branch using GMime 2.6 (2.6.23 on my box) as well as in the gmime3 branch using ver. 3.2.4.

Looking closer at the statistics of the message scanning tool (using GMime 3.2.4) where I discovered the “invalid header name” issue it appears that the combination of the warning at a 4k boundary minus 1 byte occurs quite often.

Thus, IMHO this behaviour of the parser is a real bug which needs to be addressed in both GMime 2.6 and 3.2.

poc-missing-body-line.zip

jstedfast commented 4 years ago

Thanks for the bug report!

I think I've fixed this issue, now.

Can you check if you are getting warnings for the stuff you expect to get warnings for? That's the only piece I'm not confident in.

albrechtd commented 4 years ago

Hi Jeff, thanks a lot for fixing the parser!

I can confirm that the version as of commit 6a2de210e06bce8bafb6a00bc7e60cc697087152

I still see a warning when compiling:

gmime-parser.c: In function ‘parser_scan_mime_part_content’:
gmime-parser.c:1620:20: warning: variable ‘content_type’ set but not used [-Wunused-but-set-variable]
  GMimeContentType *content_type;
                    ^~~~~~~~~~~~
gmime-utils.c: In function ‘g_mime_utils_generate_message_id’:
gmime-utils.c:774:8: warning: unused variable ‘ascii’ [-Wunused-variable]
  char *ascii;
        ^~~~~

Compilation with --enable-warnings produces additional warnings, and then fails in internet-address.c:

In file included from /usr/include/glib-2.0/glib.h:62:0,
                 from gtrie.h:25,
                 from gtrie.c:29:
gtrie.c: In function ‘g_trie_quick_search’:
gtrie.c:358:15: warning: field precision specifier ‘.*’ expects argument of type ‘int’, but argument 4 has type ‘size_t {aka long unsigned int}’ [-Wformat=]
    g_warning ("Invalid UTF-8 in buffer '%.*s' at %.*s",
               ^
/usr/include/glib-2.0/glib/gmessages.h:336:32: note: in definition of macro ‘g_warning’
                                __VA_ARGS__)
                                ^~~~~~~~~~~
gtrie.c:358:15: warning: field precision specifier ‘.*’ expects argument of type ‘int’, but argument 6 has type ‘long int’ [-Wformat=]
    g_warning ("Invalid UTF-8 in buffer '%.*s' at %.*s",
               ^
/usr/include/glib-2.0/glib/gmessages.h:336:32: note: in definition of macro ‘g_warning’
                                __VA_ARGS__)
                                ^~~~~~~~~~~
gtrie.c: In function ‘g_trie_search’:
gtrie.c:419:15: warning: field precision specifier ‘.*’ expects argument of type ‘int’, but argument 4 has type ‘size_t {aka long unsigned int}’ [-Wformat=]
    g_warning ("Invalid UTF-8 in buffer '%.*s' at %.*s",
               ^
/usr/include/glib-2.0/glib/gmessages.h:336:32: note: in definition of macro ‘g_warning’
                                __VA_ARGS__)
                                ^~~~~~~~~~~
gtrie.c:419:15: warning: field precision specifier ‘.*’ expects argument of type ‘int’, but argument 6 has type ‘long int’ [-Wformat=]
    g_warning ("Invalid UTF-8 in buffer '%.*s' at %.*s",
               ^
/usr/include/glib-2.0/glib/gmessages.h:336:32: note: in definition of macro ‘g_warning’
                                __VA_ARGS__)
                                ^~~~~~~~~~~
[…]
gmime-utils.c: In function ‘g_mime_utils_generate_message_id’:
gmime-utils.c:774:8: warning: unused variable ‘ascii’ [-Wunused-variable]
  char *ascii;
        ^~~~~
In file included from /usr/include/glib-2.0/glib.h:62:0,
                 from gmime-utils.c:53:
gmime-utils.c: In function ‘rfc2047_decode_tokens’:
gmime-utils.c:1870:17: warning: field precision specifier ‘.*’ expects argument of type ‘int’, but argument 4 has type ‘size_t {aka long unsigned int}’ [-Wformat=]
      g_warning ("Failed to completely convert \"%.*s\" to UTF-8, display may be "
                 ^
/usr/include/glib-2.0/glib/gmessages.h:336:32: note: in definition of macro ‘g_warning’
                                __VA_ARGS__)
                                ^~~~~~~~~~~
internet-address.c: In function ‘internet_address_mailbox_get_idn_addr’:
internet-address.c:462:8: warning: unused variable ‘ascii’ [-Wunused-variable]
  char *ascii;
        ^~~~~
internet-address.c:461:11: warning: unused variable ‘encoded’ [-Wunused-variable]
  GString *encoded;
           ^~~~~~~
internet-address.c: In function ‘decode_route’:
internet-address.c:1441:72: error: ‘start’ undeclared (first use in this function)
   w(g_warning ("Invalid route domain-list, missing ':': %.*s", inptr - start, start));
                                                                        ^
internet-address.c:48:14: note: in definition of macro ‘w’
 #define w(x) x
              ^
internet-address.c:1441:5: note: in expansion of macro ‘g_warning’
   w(g_warning ("Invalid route domain-list, missing ':': %.*s", inptr - start, start));
     ^~~~~~~~~
internet-address.c:1441:72: note: each undeclared identifier is reported only once for each function it appears in
   w(g_warning ("Invalid route domain-list, missing ':': %.*s", inptr - start, start));
                                                                        ^
internet-address.c:48:14: note: in definition of macro ‘w’
 #define w(x) x
              ^
internet-address.c:1441:5: note: in expansion of macro ‘g_warning’
   w(g_warning ("Invalid route domain-list, missing ':': %.*s", inptr - start, start));
     ^~~~~~~~~
jstedfast commented 4 years ago

Thanks, I'll look into fixing the compile warnings.