Sicos1977 / MSGReader

C# Outlook MSG file reader without the need for Outlook
http://sicos1977.github.io/MSGReader
MIT License

"Received" header not parsed correctly #321

Closed. nylu closed this issue 1 year ago.

nylu commented 1 year ago

MessageHeader.Received is a list: https://github.com/Sicos1977/MSGReader/blob/95ef1738bbcfa942e6177180f855bb438dadab5c/MsgReaderCore/Mime/Header/MessageHeader.cs#L80

MessageHeader.Received is filled like this: https://github.com/Sicos1977/MSGReader/blob/d457023a2e40542d672f15c27b04b42ba218ba4b/MsgReaderCore/Mime/Header/MessageHeader.cs#L363-L367

But when the transport headers are parsed, multiple occurrences of the same header are joined into a single value with a comma (,): https://github.com/Sicos1977/MSGReader/blob/d457023a2e40542d672f15c27b04b42ba218ba4b/MsgReaderCore/Mime/Header/HeaderExtractor.cs#L160-L165

As a result, only one entry is added to MessageHeader.Received, and that entry is an incorrect, comma-delimited concatenation of all the Received values.
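To illustrate why the comma-joined value cannot simply be split again, here is a minimal sketch. It does not use MSGReader at all; the class name ReceivedJoinDemo, its methods, and the sample header values are hypothetical and only model the behaviour described above. The point is that a Received value normally contains a comma of its own (in the date), so the comma is ambiguous once the values have been joined.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class, not part of MSGReader.
public static class ReceivedJoinDemo
{
    // Two made-up Received values; each contains one comma inside its date.
    public static readonly string[] SampleReceived =
    {
        "from mail.example.org by mx.example.com; Mon, 1 Jan 2024 10:00:00 +0000",
        "from client.example.net by mail.example.org; Mon, 1 Jan 2024 09:59:58 +0000"
    };

    // Models the reported behaviour: repeated headers collapse into one string.
    public static string JoinWithComma(IEnumerable<string> values)
    {
        return string.Join(",", values);
    }

    // A naive attempt to undo the join by splitting on every comma.
    public static string[] NaiveSplit(string joined)
    {
        return joined.Split(',');
    }
}
```

With these sample values, ReceivedJoinDemo.NaiveSplit(ReceivedJoinDemo.JoinWithComma(ReceivedJoinDemo.SampleReceived)) returns four pieces rather than the original two, because the commas inside the dates are split as well.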

I am currently fixing this myself in this way:

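// Needs: using System.Collections.Generic; using System.Linq; using System.Text.RegularExpressions;
if (msg.Headers is not null)
{
    // Split each comma-joined value wherever a comma is followed by "from",
    // then build one Received object per piece.
    List<Received> fixedReceived = msg.Headers.Received.SelectMany(
            brokenReceived => Regex.Split(brokenReceived.Raw, @"\s*,\s*(?=from)", RegexOptions.IgnoreCase))
        .Select(receivedStr => new Received(receivedStr))
        .ToList();
    // Replace the contents of the exposed list in place.
    msg.Headers.Received.Clear();
    msg.Headers.Received.AddRange(fixedReceived);
}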

This should not be possible in the first place, because your API should not allow callers to manipulate a list it provides.
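One possible way to expose such a list without letting callers change it is a read-only view. The sketch below is only an illustration: HeaderSketch is a hypothetical class, not MSGReader's MessageHeader, and it stores plain strings instead of Received objects to stay self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical class, not part of MSGReader.
public sealed class HeaderSketch
{
    // The mutable list stays private to the class.
    private readonly List<string> _received = new List<string>();

    // Read-only wrapper around the private list, created once.
    private readonly ReadOnlyCollection<string> _receivedView;

    public HeaderSketch(IEnumerable<string> received)
    {
        _received.AddRange(received);
        _receivedView = _received.AsReadOnly();
    }

    // Callers can enumerate and index the values, but not add or clear them.
    public IReadOnlyList<string> Received
    {
        get { return _receivedView; }
    }
}
```

Even if a caller casts Received back to IList<string>, calling Add or Clear on it throws NotSupportedException. Note that changing the type of an existing public property in this way would be a breaking change for code that currently modifies the list, including the workaround above.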

My solution is not production-ready: it fails when a Received header does not start with "from", and it will also fail if you change the way headers are concatenated with a comma.
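The first limitation can be reproduced with the same regular expression as in the workaround. This is a minimal sketch: ReceivedSplitDemo and the sample header values used with it are hypothetical, and only Regex.Split from the .NET base class library is used.

```csharp
using System.Text.RegularExpressions;

// Hypothetical demo class, not part of MSGReader.
public static class ReceivedSplitDemo
{
    // Same pattern as the workaround: split on a comma that is
    // followed by "from" (case-insensitive); the lookahead keeps "from".
    public static string[] SplitOnFrom(string joined)
    {
        return Regex.Split(joined, @"\s*,\s*(?=from)", RegexOptions.IgnoreCase);
    }
}
```

For a joined value such as "from a.example by b.example; Mon, 1 Jan 2024 10:00:00 +0000,from c.example by a.example; Mon, 1 Jan 2024 09:59:58 +0000", SplitOnFrom returns two pieces as intended. If the second header instead starts with "by" (for example "by b.example with SMTP id x; Mon, 1 Jan 2024 09:59:58 +0000"), no comma is followed by "from", so the whole string comes back as a single piece.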