summerkitsune closed this issue 5 months ago
Did you register the encoding providers?
This works by default on .NET Framework, but on other frameworks you have to do it yourself because they are cross-platform.
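For readers who have not done this before, registering the encoding providers usually looks like the minimal sketch below. It assumes the `System.Text.Encoding.CodePages` provider is available (it is part of the shared framework on recent .NET versions and a NuGet package elsewhere). The sample bytes and variable names are illustrative only and are not taken from MSGReader.

```csharp
using System;
using System.Text;

// Outside .NET Framework, only a few encodings (UTF-8, UTF-16, ASCII, ...) are
// available by default. Registering the code pages provider makes Windows code
// pages such as 1252 available to Encoding.GetEncoding.
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

// Illustrative bytes: "café" encoded in Windows-1252 (0xE9 is 'é').
byte[] sampleBytes = { 0x63, 0x61, 0x66, 0xE9 };

Encoding windows1252 = Encoding.GetEncoding(1252);
string decoded = windows1252.GetString(sampleBytes);

Console.WriteLine(decoded);
```

Without the `RegisterProvider` call, `Encoding.GetEncoding(1252)` throws on .NET Core and later, so the registration should run once at startup, before any message is read.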
Thanks for your prompt answer :) I didn't register the encoding providers, but I think I tried that before and it didn't work
I tried it again just now:
I then cleaned my solution and rebuilt, and then ran it again:
I have never used MSGReader on .NET 8, so I'm not sure how to solve problems on this version of .NET
You could try looking into this file --> https://github.com/Sicos1977/MSGReader/blob/master/MsgReaderCore/Rtf/Document.cs
And set a breakpoint on this line --> if (byteBuffer.Count > 0 && reader.TokenType != TokenType.EncodedChar)
And then check whether the chars get decoded correctly
Thanks, I will do that ^^
And did you solve it?
Hi again, yes I found out why I am having this issue
Encoding.Default is always UTF-8 (Encoding.UTF8) in .NET Core and .NET 5/6/7/8. In .NET Framework, Encoding.Default is the system's active code page. More info
In .NET 8, the message subject of ANSI messages was decoded incorrectly because it was no longer using the system's active code page by default, but trying to decode directly with Encoding.Unicode. Decoding ANSI bytes as Unicode produces these unrecognized characters in the string
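The effect can be reproduced outside MSGReader with a few bytes. The sketch below is only an illustration of the general problem, not the library's actual code path: it decodes the same Windows-1252 bytes once with the matching code page and once with UTF-8, which is what `Encoding.Default` resolves to on .NET Core and later. The sample bytes are made up for the example.

```csharp
using System;
using System.Text;

// Needed outside .NET Framework so that code page 1252 can be resolved.
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

// Illustrative ANSI bytes: "Été" in Windows-1252 (É = 0xC9, é = 0xE9).
byte[] ansiBytes = { 0xC9, 0x74, 0xE9 };

// Decoding with the code page the bytes were written in gives the expected text.
string right = Encoding.GetEncoding(1252).GetString(ansiBytes);

// Decoding the same bytes as UTF-8 cannot succeed: 0xC9 and 0xE9 are not valid
// UTF-8 sequences here, so the decoder substitutes U+FFFD replacement characters.
string wrong = Encoding.UTF8.GetString(ansiBytes);

Console.WriteLine(right == "\u00C9t\u00E9");   // correct decode
Console.WriteLine(wrong.Contains('\uFFFD'));   // replacement characters present
```

The replacement characters are what shows up as unrecognized symbols in the subject.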
I have forked your project and added a fix on a branch created from the 5.5.5 tag (the latest release). You can find the proposed commit here
I have added 2 unit tests that cover this, and all other tests ran successfully (on net462 and net8)
Nice that you found it... I also did not know that Encoding.Default in .NET Core defaults to UTF-8
I'm not 100% sure that the way I fixed this is the best way to do it. .NET Framework 4.x users probably don't need this fix. Also, the fix depends on UTF.Unknown to detect the encoding, so we could break the experience for .NET Framework users by decoding with that library by default
I thought that we could maybe wrap the "detection" part with a preprocessor directive (only doing it for 'netstandard' users)
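A rough sketch of that idea, assuming an SDK-style project where the `NETFRAMEWORK` symbol is defined automatically for .NET Framework targets. `FallbackEncoding` is a hypothetical helper, and the fixed code page 1252 only stands in for whatever detection step would really be used. It is not the code from the proposed commit.

```csharp
using System;
using System.Text;

// Hypothetical helper: picks the encoding used when a message does not
// otherwise tell us how its ANSI strings are encoded.
static Encoding FallbackEncoding()
{
#if NETFRAMEWORK
    // .NET Framework: Encoding.Default is already the system's active code page,
    // so the existing behaviour is kept and no detection is needed.
    return Encoding.Default;
#else
    // Other targets: Encoding.Default is UTF-8, so something else has to be chosen.
    // Code page 1252 is a placeholder for a real detection step.
    Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    return Encoding.GetEncoding(1252);
#endif
}

Console.WriteLine(FallbackEncoding().WebName);
```

Testing for `NETFRAMEWORK` and putting everything else in the `#else` branch covers both netstandard and net8 builds, which a check on `NETSTANDARD` alone would not.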
Is it possible for you to send me this msg file so that I can look into it and figure out a solution that works in both .NET Framework and newer versions? If so, then please ZIP the MSG file before sending it to sicos2002@hotmail.com
Hi, I sent you an email with what you asked
Thanks... I'll look into it and try to figure out a solution that will work on all .NET versions
Hi Kees, have you had time to figure out a solution?
Sorry, but I totally forgot this issue... too busy with another project at the moment. I'll give it a new try this weekend.
Hi Kees, have you had the occasion to give it a try?
We are facing the same issue after upgrading from .NET 6 to .NET 8. After the upgrade some words in Greek are not decoded correctly.
This should be fixed in version 5.5.8. It now uses the MessageCodePage property to determine the encoding used. It should probably always have been like this.
Sorry for the long delay, but I was too busy with other things and kept finding reasons to ignore this problem :-)
@Sicos1977 Our scenario is still not working. We have an email in msg format which contains Greek characters in the body. We are retrieving the body through the BodyHtml property and the encoding for most words is wrong. I could email you the sample msg file if you can take a look into this.
This is probably another encoding issue that has nothing to do with the issue of the original poster, because that issue is solved. I guess this is some kind of encoding problem that happens when the HTML is extracted from RTF, but to be sure I have to see this message. If possible, then ZIP the msg file and send it to sicos2002@hotmail.com
Describe the bug
When you read an .msg Subject with MSGReader in a .NET 8 project, it fails to display special characters properly. In a .NET Framework 4.7.2 project, with the same MSGReader version, it works.
To Reproduce
I created 2 repositories that contain basically the same code (and the same dependencies). The only difference between these 2 repositories is the .NET version. The version of MSGReader is the same.
Links to the repositories:
The 2 repositories use the same .msg file (same md5 checksums). You can find the .msg file in the repositories.
The .msg file has been created like this:
Expected behavior
I expected the special characters to be displayed properly, like in the .NET Framework 4.7.2 project.
Screenshots
Desktop
I have this problem on my machine but I've also tried on another machine (still Windows) with the same results.
Additional context
I encountered this problem while migrating a project from .NET Framework 4.7.2 to .NET 8.