Describe the bug
I have an MSG file that contains a formatted HTML body, which can be viewed successfully in Outlook and also in an online mail viewer (https://www.encryptomatic.com/viewer/).
However, when I extract the body of this MSG with MSGReader, the resulting HTML file looks empty in a browser. When I check the extracted HTML code, it contains some tags but no text content.
I have also tried extracting the mail content using Outlook. That output does contain the text content, but there is a problem with the tags (one of
tags is closed after closing element).
To Reproduce
I attach the problematic MSG file, along with the HTML file extracted by MSGReader and the HTML file extracted by Outlook.
You can open the MSG file with Outlook and see that it contains some formatted text.
You can load it with MSGReader and save HtmlBody to a file. You will see that it contains HTML tags but no text content (or just check the attached HTML file).
You can extract the message source with Outlook and see that it contains HTML with text data, but that there is a problem with
tag (or just check the other attached HTML file).
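For anyone triaging this, the "tags but no text" observation can be checked without opening a browser. The following is only a minimal sketch using Python's standard `html.parser`, which tolerates mis-nested tags. The helper name `visible_text` and the idea of passing the two extracted HTML files on the command line are my own assumptions for illustration; none of this is part of MSGReader.

```python
import sys
from html.parser import HTMLParser


class VisibleTextCollector(HTMLParser):
    """Collect non-empty text nodes, skipping <script> and <style> contents."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only text that is outside script/style and not pure whitespace.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_text(html: str) -> str:
    """Hypothetical helper: return the text a reader would see, joined by spaces."""
    parser = VisibleTextCollector()
    parser.feed(html)
    parser.close()  # flush any text buffered after the last tag
    return " ".join(parser.chunks)


if __name__ == "__main__":
    # Pass the paths of the extracted HTML files as arguments.
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as fh:
            text = visible_text(fh.read())
        print(f"{path}: {len(text)} characters of visible text")
```

If the file saved from MSGReader reports zero characters while the one saved from Outlook does not, that would match what I describe above.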
Expected behavior
I would expect MSGReader to extract the HTML together with its text content, even if the HTML is corrupted (similar to how Outlook does it).
Screenshots
data.zip