Open giachello opened 2 months ago
Hi Giovanni,
Although adding transparent support for \r\n (CRLF) end-of-line sequences seems like a simple solution, the consequences could be non-trivial.
First, it would be a violation of RFC 4155, but this is the lesser concern.
Second, suppose we search mailbox A with Windows-style \r\n end-of-line sequences, and want to append the output to a non-empty mailbox B with Unix-style \n end-of-line sequences (or vice versa). This will corrupt mailbox B.
So I think this question requires a bit more careful consideration. We shouldn't make assumptions based only on the platform here.
May I ask which piece of software created those mailboxes you were testing with?
Best regards, Daniel
Hi there, this is interesting. These mbox files were created by Google Gmail's Takeout process. I checked by unzipping with unzip -b, and the files already use \r\n at the origin! My main use for mboxgrep is to split Takeout files into categories.
Maybe we can turn this into an option, similar to unzip -a.
Hi Giovanni,
I think your proposal makes sense. We can add an option to tolerate \r\n in the input, but force the correct format in the output.
/Daniel
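To make the "tolerate \r\n in the input, force the correct format in the output" idea concrete, here is a minimal sketch in C. The helper name `normalize_eol` is hypothetical and is not taken from the mboxgrep sources; it only assumes that each line is available as a NUL-terminated, writable buffer before it is written out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of mboxgrep.
 * If the line ends in "\r\n", rewrite it in place so that it ends in "\n".
 * Lines that already end in "\n" (or have no terminator) are left unchanged.
 * Returns the length of the line after normalization. */
size_t normalize_eol(char *line)
{
    size_t len = strlen(line);

    if (len >= 2 && line[len - 2] == '\r' && line[len - 1] == '\n') {
        line[len - 2] = '\n';   /* overwrite the CR with the LF */
        line[len - 1] = '\0';   /* drop the now-redundant last byte */
        len--;
    }
    return len;
}
```

Applied to every line just before output, something like this would keep a Unix-style mailbox B consistent even when mailbox A uses CRLF. Whether it should be enabled by default or only behind an option is a separate question.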
In mbox.c:302, the code fails on Windows text files that use \r\n as the end-of-line sequence.
Just adding a condition that tests for either \n or \r\n fixes the issue.
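For illustration, such a condition might look roughly like the sketch below. The actual code at mbox.c:302 is not quoted in this thread, so this assumes the check in question compares a line against a bare newline; the function name `is_blank_line` is hypothetical.

```c
#include <string.h>

/* Hypothetical helper, not the actual code at mbox.c:302.
 * Returns 1 if the line is empty apart from its terminator,
 * accepting both Unix ("\n") and Windows ("\r\n") endings. */
int is_blank_line(const char *line)
{
    return strcmp(line, "\n") == 0 || strcmp(line, "\r\n") == 0;
}
```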