Closed happybald closed 1 year ago
I know this issue since we work together. Base64 encoding during transfer doesn't play any role. The failing code can be simplified to this:
using var msg = new Storage.Message(@"c:\000-Temp\öäéàèü Test chars.msg");
Console.WriteLine(msg.Subject);
Prints: ?????? Test chars
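For anyone following along, here is a minimal sketch of both points, assuming .NET 6 and using stand-in bytes rather than the real .msg file. It shows that a Base64 round trip returns identical bytes, and that decoding Windows-1252 bytes with a decoder that only knows 7-bit ASCII yields the same `?` pattern. Whether MsgReader hits exactly this code path is an assumption; the names `ansiBytes`, `wrong` and `right` are only for illustration.

```csharp
using System;
using System.Linq;
using System.Text;

// Code pages such as 1252 need this provider on .NET Core / .NET 5+.
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

// Stand-in for the bytes of an ANSI subject; NOT read from a real .msg file.
byte[] ansiBytes = Encoding.GetEncoding(1252).GetBytes("öäéàèü Test chars");

// Base64 is a lossless byte-to-text transport: the round trip returns identical bytes.
string base64 = Convert.ToBase64String(ansiBytes);
byte[] roundTripped = Convert.FromBase64String(base64);
bool lossless = ansiBytes.SequenceEqual(roundTripped);

// A decoder that cannot map bytes >= 0x80 substitutes '?' for each of them.
string wrong = Encoding.ASCII.GetString(roundTripped);
// Decoding with the code page the bytes were written in restores the text.
string right = Encoding.GetEncoding(1252).GetString(roundTripped);

Console.WriteLine($"lossless: {lossless}");
Console.WriteLine(wrong);
Console.WriteLine(right);
```

So the transfer step can be ruled out; the damage happens when the bytes are turned into a string.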
Can you send me this MSG file? If so, please ZIP it before sending it to sicos2002@hotmail.com
The link is at the end of the issue description, but I also sent it to your email. Thank you :)
Is there any way to work around it? I have an email with a pound sign in the Subject, and I also get '?' when converting such a .msg file.
In my case, the file was probably encoded differently. I started playing with a fork of your repo and found that when I change the code to something like this:
case PropertyType.PT_STRING8:
return GetStreamAsString(containerName, Encoding.UTF7);
the pound sign in my Subject is displayed correctly.
Of course, it is not a solution, only a hint.
I don't know how to detect the encoding and always apply the correct one, no matter how the .msg file was saved.
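To illustrate why detection is hard, here is a small sketch, assuming .NET 5 or later (for `Encoding.Latin1`) and assuming the pound sign is stored as the single byte 0xA3, as it is in Latin-1 and Windows-1252. The same byte gives a different result under each candidate decoder, and nothing in the byte itself says which one is right.

```csharp
using System;
using System.Text;

// Code pages such as 1252 need this provider on .NET Core / .NET 5+.
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

// Assumption: '£' stored as one ANSI byte (0xA3 in Latin-1 and Windows-1252).
byte[] pound = { 0xA3 };

string asAscii = Encoding.ASCII.GetString(pound);             // byte is outside 7-bit ASCII
string asLatin1 = Encoding.Latin1.GetString(pound);           // maps byte 0xA3 to U+00A3
string as1252 = Encoding.GetEncoding(1252).GetString(pound);  // also maps to U+00A3
string asUtf8 = Encoding.UTF8.GetString(pound);               // a lone 0xA3 is not valid UTF-8

Console.WriteLine($"ASCII:   {asAscii}");
Console.WriteLine($"Latin-1: {asLatin1}");
Console.WriteLine($"1252:    {as1252}");
Console.WriteLine($"UTF-8:   U+{(int)asUtf8[0]:X4}");
```

Any fix therefore needs a hint from outside the string bytes, such as a code page stored elsewhere in the file or one supplied by the caller.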
Normally an MSG file can be encoded in 2 ways: ANSI or UNICODE. If the setting is UNICODE then every string inside the MSG file has to be read as Unicode. If for whatever reason somebody makes an MSG file with mixed encodings in it then it is very hard to figure out if the retrieved string is correct.
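As a rough illustration of the ANSI/UNICODE split, here is a sketch that classifies a string stream by its name. It assumes the naming convention described in the MS-OXMSG specification, where a stream is called `__substg1.0_` followed by 8 hex digits and the last 4 digits are the property type (0x001E for PT_STRING8, 0x001F for PT_UNICODE). The helper `DescribeStringStream` is hypothetical and is not part of MsgReader.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: classify a stream by the property type in the last four hex digits of its name.
static string DescribeStringStream(string streamName)
{
    const string prefix = "__substg1.0_";
    if (!streamName.StartsWith(prefix, StringComparison.Ordinal) ||
        streamName.Length != prefix.Length + 8)
        return "not a plain property stream";

    // Digits 0-3 after the prefix are the property ID, digits 4-7 the property type.
    if (!ushort.TryParse(streamName.AsSpan(prefix.Length + 4, 4), NumberStyles.HexNumber,
                         CultureInfo.InvariantCulture, out ushort propertyType))
        return "not a plain property stream";

    return propertyType switch
    {
        0x001E => "PT_STRING8 (ANSI, code page dependent)",
        0x001F => "PT_UNICODE (UTF-16LE)",
        _ => "not a string property",
    };
}

// 0x0037 is the Subject property ID.
string ansiSubject = DescribeStringStream("__substg1.0_0037001E");
string unicodeSubject = DescribeStringStream("__substg1.0_0037001F");
Console.WriteLine(ansiSubject);
Console.WriteLine(unicodeSubject);
```

The type only says "ANSI" or "Unicode"; for ANSI it does not say which code page the bytes were written in.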
Can you send me the MSG file that is having this issue so that I can look into it and see if there is some way of detecting the encoding for the subject? If so, please ZIP the MSG file first before sending it to sicos2002@hotmail.com
Hi. Here is a sample. Japanese ANSI and Unicode.zip
If we have Microsoft Office Outlook 2013 or similar, we can switch the msg export format between ANSI and Unicode with this option: FILE → Options → Mail → Save messages → Use Unicode format
Is this the problem?
If so then this is something nobody can fix for you because Chinese is a 2 byte char set and ANSI is 1 byte. You are never going to get this to work because it is technically not possible.
The only reason why the text is readable in the body is because HTML is a 1 byte charset and does some special encoding so that the HTML render engine knows it has to show a 2 byte char.
Why do you want to use ANSI anyway? Unicode was invented to fix exactly this kind of issue.
Is this the problem?
Although I'm not the OP, yes, that is right.
We are just developers. This kind of problem occurs when we apply MsgReader to clients' data through the products or software we build.
And I agree that the ANSI encoding cannot be detected in any reasonable way, due to the technical difficulty.
One possible way would be to expose System.Text.Encoding to developers so that they can select their own ANSI encoding at their own responsibility.
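A minimal sketch of what such an option could look like, assuming .NET 6. `DecodeString8` is a hypothetical helper, not an existing MsgReader API, and the Windows-1252 fallback is only an assumed default. The bytes are a stand-in for a Japanese subject saved in ANSI format with code page 932 (Shift-JIS).

```csharp
#nullable enable
using System;
using System.Text;

// Code pages such as 932 and 1252 need this provider on .NET Core / .NET 5+.
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

// Hypothetical helper: decode a PT_STRING8 payload with a caller-chosen encoding,
// falling back to Windows-1252 (an assumed default) when the caller supplies none.
static string DecodeString8(byte[] raw, Encoding? ansiOverride = null)
{
    Encoding encoding = ansiOverride ?? Encoding.GetEncoding(1252);
    return encoding.GetString(raw);
}

// Stand-in bytes; NOT read from the attached sample file.
Encoding shiftJis = Encoding.GetEncoding(932);
byte[] rawSubject = shiftJis.GetBytes("テスト");

string withDefault = DecodeString8(rawSubject);             // wrong code page, text is garbled
string withOverride = DecodeString8(rawSubject, shiftJis);  // caller knows the data is Japanese

Console.WriteLine($"default matches:  {withDefault == "テスト"}");
Console.WriteLine($"override matches: {withOverride == "テスト"}");
```

The library would not have to guess; the responsibility for picking the right code page stays with the developer who knows where the data comes from.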
In our case the client drags and drops a meeting/email from Outlook into the web solution. That creates a file in this format. We have no control over it.
In an MSG file there is a parameter that says in what format the streams are stored. If that parameter says ANSI then you have to read all the streams as 1 byte encoded. There is just no way to fix the encoding issue.
I'm closing this issue because there is no proper way to detect the stream encodings if the correct encoding is not set.
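For reference, a sketch of how that parameter could be checked, based on my reading of the MS-OXMSG specification; treat the details as assumptions. There the flag is the STORE_UNICODE_OK bit (0x00040000) of PidTagStoreSupportMask (tag 0x340D0003), stored in the property stream as a 16-byte entry: tag (4 bytes), flags (4 bytes), value (8 bytes), little-endian. `StringsAreUnicode` is a hypothetical helper and the entry below is synthetic, not taken from a real file.

```csharp
using System;
using System.Buffers.Binary;

const uint PidTagStoreSupportMask = 0x340D0003; // property ID 0x340D, type PtypInteger32
const uint StoreUnicodeOk = 0x00040000;         // STORE_UNICODE_OK bit

// Hypothetical helper: inspect one 16-byte fixed-length property entry.
// Returns null when the entry is too short or belongs to a different property.
bool? StringsAreUnicode(ReadOnlySpan<byte> entry)
{
    if (entry.Length < 16) return null;
    uint tag = BinaryPrimitives.ReadUInt32LittleEndian(entry);
    if (tag != PidTagStoreSupportMask) return null;
    uint mask = BinaryPrimitives.ReadUInt32LittleEndian(entry.Slice(8));
    return (mask & StoreUnicodeOk) != 0;
}

// Synthetic entry with the Unicode bit set.
byte[] unicodeEntry = new byte[16];
BinaryPrimitives.WriteUInt32LittleEndian(unicodeEntry, PidTagStoreSupportMask);
BinaryPrimitives.WriteUInt32LittleEndian(unicodeEntry.AsSpan(8), StoreUnicodeOk);

// Synthetic entry with the bit cleared, as an ANSI file would have.
byte[] ansiEntry = new byte[16];
BinaryPrimitives.WriteUInt32LittleEndian(ansiEntry, PidTagStoreSupportMask);

bool? unicodeResult = StringsAreUnicode(unicodeEntry);
bool? ansiResult = StringsAreUnicode(ansiEntry);
Console.WriteLine($"unicode entry: {unicodeResult}, ansi entry: {ansiResult}");
```

When the bit is not set, the flag only tells you the strings are 1 byte encoded, not which code page was used to write them.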
Describe the bug Problem reading an Outlook file saved in ANSI
To Reproduce Steps to reproduce the behavior: Open any Outlook email file with a meeting that contains special characters such as "öäéàèü".
How I get base64 in TypeScript
How I read on C# side
Expected behavior Reads from ANSI need to be fixed
Screenshots
Desktop (please complete the following information): Processor AMD Ryzen 7 5800X 8-Core Processor Installed RAM 32.0 GB System type 64-bit operating system, x64-based processor Windows Edition Windows 11 Pro N Version 22H2 OS build 22621.1265 Experience Windows Feature Experience Pack 1000.22638.1000.0 Google Chrome Version 110.0.5481.178 (Official Build) (64-bit) Locale: LCID 1033, en-US
Additional context .NET 6, C# 10, MsgReader 4.4.16
Outlook msg file: öäéàèü Test chars.zip