Sicos1977 / MSGReader

C# Outlook MSG file reader without the need for Outlook
http://sicos1977.github.io/MSGReader
MIT License
478 stars, 168 forks

Problem with extracting body HTML from MSG file when HTML has problems #311

Closed: oleksii-datsiuk closed this issue 1 year ago

oleksii-datsiuk commented 1 year ago

Describe the bug: I have an MSG file that contains a formatted HTML body. The body displays correctly in Outlook and also in an online mail viewer (https://www.encryptomatic.com/viewer/).

However, when I extract the body of this MSG with MSGReader, the resulting HTML file looks empty in a browser. When I check the extracted HTML code, it contains some tags but no text content.

I have also extracted the mail content using Outlook. That output does contain the text content, but the tags are mis-nested (one of the tags is closed only after the closing tag of another element).

To Reproduce: I have attached the problematic MSG file, together with the HTML file extracted by MSGReader and the HTML file extracted by Outlook.

  1. Open the MSG file with Outlook and see that it contains some formatted text.
  2. Load it with MSGReader and save HtmlBody to a file. The file contains HTML tags but no text content (or just check the attached HTML file).
  3. Extract the message source with Outlook and see that it contains HTML with text data, but that one of the tags is mis-nested (or just check the other attached HTML file).
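
Step 2 can be scripted roughly as follows. This is a minimal sketch, not the reporter's actual code: it assumes the MsgReader NuGet package is referenced and uses the `BodyHtml` property shown in the project's README (the "HtmlBody" mentioned above). The class name `HtmlBodyCheck` and the `HasVisibleText` helper are hypothetical, and the helper is a rough regex heuristic rather than a real HTML parser.

```csharp
using System.Net;
using System.Text.RegularExpressions;

public static class HtmlBodyCheck
{
    // Reads the HTML body of an MSG file through MSGReader.
    // Assumes the MsgReader NuGet package is installed.
    public static string ExtractHtmlBody(string msgPath)
    {
        using (var msg = new MsgReader.Outlook.Storage.Message(msgPath))
        {
            return msg.BodyHtml ?? string.Empty;
        }
    }

    // Heuristic (hypothetical helper): returns true if anything other than
    // whitespace is left once comments, head/script/style blocks and all
    // remaining tags are stripped. Badly broken markup can fool it.
    public static bool HasVisibleText(string html)
    {
        if (string.IsNullOrEmpty(html)) return false;

        // Remove HTML comments.
        var text = Regex.Replace(html, @"<!--.*?-->", " ", RegexOptions.Singleline);

        // Remove head, script and style blocks including their contents.
        text = Regex.Replace(
            text,
            @"<(head|script|style)\b[^>]*>.*?</\1\s*>",
            " ",
            RegexOptions.Singleline | RegexOptions.IgnoreCase);

        // Remove every remaining tag, matched or not.
        text = Regex.Replace(text, @"<[^>]+>", " ");

        // Decode entities such as &nbsp; so they count as whitespace.
        text = WebUtility.HtmlDecode(text);

        return !string.IsNullOrWhiteSpace(text);
    }
}
```

Calling `HtmlBodyCheck.HasVisibleText(HtmlBodyCheck.ExtractHtmlBody(path))` on the attached MSG file would be expected to return `false` on an affected version, matching the "tags but no text" symptom. Writing the string out with `System.IO.File.WriteAllText` gives the HTML file to open in a browser.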

Expected behavior: I would expect MSGReader to extract the HTML together with its text content even when the HTML is malformed, similar to how Outlook does it.

Screenshots: image. Attachment: data.zip

Sicos1977 commented 1 year ago

This should be fixed in NuGet version 4.4.11.