TextAnalysisTool / Releases

Repository for storing release artifacts (ex: binaries).

Not Support UTF-8 Chinese #42

Closed: Zeping-Jian closed this issue 7 years ago

Zeping-Jian commented 7 years ago

Can TextTools support UTF-8? At the moment it only seems to support GBK. Thanks a lot. notsupportutf-8

DavidAnson commented 7 years ago

Yes, TextAnalysisTool.NET supports UTF-8.

Does a file with bad rendering look good in Notepad? If so, does changing the font used by TextAnalysisTool help? (Some fonts don't include all characters and don’t fall back to good options.)

Can you please provide a file or text that is not being rendered correctly? (Either here or a URL.) Also, can you provide an image showing what you expect it to look like?

Thank you for your help!

Zeping-Jian commented 7 years ago

The text that is not rendered correctly is attached: android_log_for_TextTools.txt

We would like it to look like this (as displayed in UltraEdit): textanalysistool-help-ultraedit

Thank you very much.

Changing the font does not help: textanalysistool-help

DavidAnson commented 7 years ago

The file you link to does not have a UTF-8 Byte Order Mark: https://en.wikipedia.org/wiki/Byte_order_mark

.NET doesn't do well auto-detecting the UTF-8 encoding without that.
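For anyone who wants to check whether a file carries the UTF-8 Byte Order Mark before opening it, here is a minimal Python sketch. The helper name `has_utf8_bom` is hypothetical, and this only inspects the first three bytes; it is not a description of how .NET or TextAnalysisTool.NET detects encodings.

```python
import codecs


def has_utf8_bom(path):
    """Return True if the file at `path` starts with the UTF-8 BOM (EF BB BF).

    Hypothetical helper for illustration; it only looks at the first
    three bytes and does not validate the rest of the file.
    """
    with open(path, "rb") as f:
        return f.read(len(codecs.BOM_UTF8)) == codecs.BOM_UTF8
```

A file for which this returns False may still be valid UTF-8; it simply lacks the marker at the start.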

If I open the file you link to in Windows Notepad, then File, Save As, "WithBom.txt", Encoding="UTF-8", and open the new file in TextAnalysisTool.NET, the rendering looks the same as you show above in UltraEdit.

This is on an EN-US Windows 10 OS without any localization packs installed, so the default font of "Courier New" is able to display the relevant Chinese characters.

If you're able to change the thing that creates your files to include the 3 bytes 0xEF,0xBB,0xBF at the beginning, you should be able to open them directly in TextAnalysisTool.NET.
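If the program that produces the files cannot be changed, the three bytes can also be added afterwards. The following is a rough Python sketch of that idea, assuming the input is already UTF-8 encoded; the function name `add_utf8_bom` is hypothetical and not part of any tool mentioned in this thread.

```python
import codecs
import shutil


def add_utf8_bom(src_path, dst_path):
    """Copy src_path to dst_path, prepending the UTF-8 BOM if it is missing.

    Hypothetical helper for illustration. It assumes the source is already
    UTF-8 and that src_path and dst_path are different files (opening the
    destination for writing would truncate the source otherwise).
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        head = src.read(len(codecs.BOM_UTF8))
        if head != codecs.BOM_UTF8:
            dst.write(codecs.BOM_UTF8)  # the bytes 0xEF, 0xBB, 0xBF
        dst.write(head)
        shutil.copyfileobj(src, dst)  # copy the remainder unchanged
```

For example, `add_utf8_bom("android_log_for_TextTools.txt", "WithBom.txt")` should produce a file equivalent to the one saved from Notepad with Encoding="UTF-8". Files that already start with the BOM are copied as they are, so running it twice does not add a second marker.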

Hope this helps!

Zeping-Jian commented 7 years ago

Thank you very much for patiently solving the problem. Could TextAnalysisTool be made open source?

DavidAnson commented 7 years ago

Great news!

And sorry that I'm not able to open-source the code at this time.