Closed DeepakBisht94 closed 1 month ago
The problem is an improperly formatted mbox file. It can't be parsed the way you would expect because there are From lines in the middle of the message bodies. This isn't solvable by any mbox parser.
A few of the comments that others posted are correct: the From lines in the message body need to be escaped or encoded, for example:
>From topics you know about
-or-
=46rom topics you know about
If you exported the entire mbox from Thunderbird (as opposed to creating the mbox file yourself by concatenating multiple messages with your own From lines), then you should file a bug report against Thunderbird.
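To illustrate the escaping described above, here is a minimal sketch in Python. It is not MimeKit code and not how Thunderbird writes mbox files; the helper names `escape_from_lines` and `count_separators` and the sample message are made up for this example. It uses the `>From ` convention, extended to also quote lines that are already quoted (`>From ` becomes `>>From `) so that the escaping can be undone later.

```python
import re

# Body lines that an mbox reader could mistake for a message separator,
# including ones that are already quoted with one or more '>'.
_FROM_LINE = re.compile(r"^(>*From )", re.MULTILINE)


def escape_from_lines(body: str) -> str:
    """Prefix '>' to every body line that starts with 'From ' or '>From '."""
    return _FROM_LINE.sub(r">\1", body)


def count_separators(mbox_text: str) -> int:
    """Count lines starting with 'From ', the way a naive mbox reader splits messages."""
    return len(re.findall(r"^From ", mbox_text, re.MULTILINE))


# Hypothetical single message whose body contains a line starting with 'From '.
separator = "From sender@example.com Mon Jan  1 00:00:00 2024\n"
headers = "Subject: Newsletter\n\n"
body = "Hello,\nFrom topics you know about\nBye\n"

broken = separator + headers + body
fixed = separator + headers + escape_from_lines(body)

print(count_separators(broken))  # 2: the body line looks like a new message
print(count_separators(fixed))   # 1: only the real separator remains
```

The unescaped version is what produces the "one email counted as two" symptom: the second From line is indistinguishable from a real separator, and whatever follows it is then parsed as headers, which may also explain the header parse failures.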
Describe the bug
I'm using MimeKit's MimeParser in C# to count the number of emails in an MBOX file. While the parser works well for most emails, I've encountered issues where certain emails are counted multiple times or split into two or three parts. Additionally, some emails trigger a "Failed to parse message headers" error, although the primary concern is the inaccurate counting and message splitting.
Here's the relevant portion of my code:
Platform (please complete the following information):
To Reproduce
Steps to reproduce the behavior: Download the attached MBOX file and run the following code.
Expected behavior
Below is a sample email from the problematic MBOX file that the parser counts as two messages; it should be counted as one. Some emails also produce a "Failed to parse message headers" error.
Code Snippets
Issue Files.zip