Sicos1977 / MSGReader

C# Outlook MSG file reader without the need for Outlook
http://sicos1977.github.io/MSGReader
MIT License

Large memory allocated when handling relatively small MSG file #349

Closed mathy-plutoflume closed 1 year ago

mathy-plutoflume commented 1 year ago

Describe the bug

I was trying to extract BodyText and BodyHtml from an MSG file and noticed that the memory usage was very high.

To Reproduce

I used EmailWith2Attachments.msg from the sample test files for my analysis.

Code I used to benchmark:

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using MsgReader.Outlook;
using System.Text;

public class Program
{
    public static void Main(string[] args)
    {
        BenchmarkRunner.Run<ExtractMsg>();
    }
}

[MemoryDiagnoser]
public class ExtractMsg
{
    [Params(@"C:\EmailWith2Attachments.msg")]
    public string FileName { get; set; }

    public ExtractMsg() => Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

    [Benchmark]
    public void Extract()
    {
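        // Opening the message parses the MSG file; disposing releases it.
        using (var msg = new Storage.Message(FileName))
        {
            // Only the HTML body is read in this run; the plain-text body is left commented out.
            //var txtBody = msg.BodyText;
            var htmlBody = msg.BodyHtml;
        }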
    }
}

Expected behavior

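Actual behavior

The input MSG file is 267 KB, and the memory allocated for it is around 55 MB.

| Method     | FileName                          | Mean       | Error     | StdDev    | Gen0      | Gen1    | Gen2 | Allocated   |
|------------|-----------------------------------|------------|-----------|-----------|-----------|---------|------|-------------|
| ExtractMsg | C:\EmailWith2Attachments.msg [74] | 9,315.1 us | 184.83 us | 417.19 us | 6078.1250 | 15.6250 | -    | 55879.93 KB |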


Additional context

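| Method  | FileName                  | Mean     | Error     | StdDev    | Gen0      | Gen1     | Gen2     | Allocated |
|---------|---------------------------|----------|-----------|-----------|-----------|----------|----------|-----------|
| Extract | C:\LargeHtmlbody.msg [22] | 6.161 ms | 0.1224 ms | 0.2581 ms | 4781.2500 | 250.0000 | 234.3750 | 43.8 MB   |

Sicos1977 commented 1 year ago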

Did you find out what part is consuming the memory?

dhilmathy commented 1 year ago

> Did you find out what part is consuming the memory?

@Sicos1977 Not exactly. I tried removing some of the parts (especially the attachment part) inside LoadStorage to see whether that might be the cause, but I could not find the exact place.
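
One way to narrow down which step allocates the memory is to measure allocated bytes around each step separately instead of around the whole Extract method. The sketch below is only an illustration, not something from MsgReader: `AllocationProbe` and `ProbeDemo` are hypothetical names, and the 1 MB array stands in for a real step. It relies on `GC.GetAllocatedBytesForCurrentThread()`, which is available on .NET Core 3.0+ and .NET Framework 4.8.

```csharp
using System;

// Hypothetical helper (not part of MsgReader): reports how many bytes
// a single step allocates on the calling thread.
public static class AllocationProbe
{
    public static long Measure(Action step)
    {
        // Snapshot the per-thread allocation counter before and after the step.
        long before = GC.GetAllocatedBytesForCurrentThread();
        step();
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }
}

public static class ProbeDemo
{
    public static void Main()
    {
        // Stand-in workload: allocate a 1,000,000-byte array.
        // In the scenario from this issue, the lambda would instead wrap one step,
        // for example only opening the message, or only reading msg.BodyHtml.
        long bytes = AllocationProbe.Measure(() =>
        {
            var buffer = new byte[1_000_000];
            GC.KeepAlive(buffer); // keep the allocation observable
        });

        // The result is at least the size of the array, plus a small object header.
        Console.WriteLine($"Allocated: {bytes / 1024.0:N2} KB");
    }
}
```

Wrapping `new Storage.Message(FileName)`, `msg.BodyText` and `msg.BodyHtml` in separate `Measure` calls would show whether the bulk of the roughly 55 MB comes from loading the storage or from reading a body. The counter only covers the calling thread, so work done on other threads would not be included.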