balagansky / BigJsonTreeViewer

A simple Windows desktop application for viewing large (100MB+) JSON files in a tree view.
MIT License

Need an option to select encoding when opening file. #1

Open jabobian opened 1 year ago

jabobian commented 1 year ago

Some files use a different encoding. Since they are produced as-is, it would be preferable to open them in their original encoding rather than convert them first.

balagansky commented 1 year ago

Can you provide a sample file that demonstrates the unexpected behavior?

silmaril42 commented 6 months ago

I'm not sure we really need to select the encoding, since JSON is usually UTF-8.

What we definitely need is proper UTF-8 support with correct handling of multi-byte UTF-8 characters.

Here is a small UTF-8 JSON file that contains some German umlauts and other international characters: umlauts.json

This is what it's supposed to look like:

{
    "ascii_only": "abcdefgABCDEFG",
    "german_umlauts": "xäxöxü xÄxÖxÜx ß",
    "other_international": "áéíóú xxx âêîôû xxx àèìòù",
    "umlauts_in_key_äöü_x": "but not in value"
}

BigJsonTreeViewer interprets this as ANSI encoding, which leads to a result that looks like this:

{
    "ascii_only": "abcdefgABCDEFG",
    "german_umlauts": "xÃ¤xÃ¶xÃ¼ xÃ„xÃ–xÃœx ÃŸ",
    "other_international": "Ã¡Ã©Ã­Ã³Ãº xxx Ã¢ÃªÃ®Ã´Ã» xxx Ã Ã¨Ã¬Ã²Ã¹",
    "umlauts_in_key_Ã¤Ã¶Ã¼_x": "but not in value"
}
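
This kind of misreading can be reproduced in a few lines. The sketch below is illustrative only and is not taken from BigJsonTreeViewer: `misreadAsSingleByte` is a hypothetical helper that treats every byte as its own ISO-8859-1 character and re-encodes the result as UTF-8 so it can be printed. ISO-8859-1 agrees with Windows-1252 (assumed here to be the "ANSI" code page in question) for bytes 0xA0 to 0xFF.

```cpp
#include <string>

// Hypothetical helper for illustration; not part of BigJsonTreeViewer.
// Re-interprets every byte of `bytes` as a separate single-byte character
// (ISO-8859-1 mapping) and returns the result encoded as UTF-8.
std::string misreadAsSingleByte(const std::string& bytes) {
    std::string out;
    for (unsigned char b : bytes) {
        if (b < 0x80) {
            // ASCII bytes mean the same thing in both encodings.
            out.push_back(static_cast<char>(b));
        } else {
            // Code points U+0080..U+00FF take two bytes in UTF-8.
            out.push_back(static_cast<char>(0xC0 | (b >> 6)));
            out.push_back(static_cast<char>(0x80 | (b & 0x3F)));
        }
    }
    return out;
}
```

Under that assumption, the two UTF-8 bytes of `ä` (0xC3 0xA4) are read as two separate characters and come out as `Ã¤`, while ASCII-only strings are unaffected.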

balagansky commented 6 months ago

@silmaril42, thanks for the sample!

You're right that UTF-8 should be supported as the JSON RFC requires it.

I don't know off-hand how difficult this would be to do with the MFC controls used by this viewer - hopefully / quite possibly a simple solution exists.

I also can't give a time frame for when I would be able to look into it. But, since this is the only posted issue so far, it is automatically at the top of the to-do list :)

If somebody can help, pull requests are very welcome and it would be easier for me to find time to review a PR than to look into it from scratch. Specific suggestions about how to implement a fix are also welcome.
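
As one possible starting point (a sketch, not a tested patch for this project): assuming the viewer is built as a Unicode MFC application whose controls take UTF-16 text, each string read from the file could be decoded from UTF-8 to UTF-16 before it is handed to the tree control. On Windows, the Win32 function `MultiByteToWideChar` with the `CP_UTF8` code page performs this conversion. The portable function below, under the hypothetical name `utf8ToUtf16`, shows the same decoding step by hand. It replaces malformed sequences with U+FFFD and, for brevity, does not reject overlong encodings or encoded surrogates.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of BigJsonTreeViewer): decode UTF-8 bytes
// into UTF-16 code units. Malformed input is replaced with U+FFFD.
std::u16string utf8ToUtf16(const std::string& in) {
    const char16_t kReplacement = 0xFFFD;
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t len = 0;  // stays 0 for an invalid lead byte
        if (lead < 0x80)                { cp = lead;        len = 1; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; len = 2; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; len = 3; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; len = 4; }

        // Reject invalid lead bytes and sequences cut off by the end of input.
        bool ok = (len != 0) && (i + len <= in.size());
        for (std::size_t k = 1; ok && k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80) {
                ok = false;  // not a continuation byte
            } else {
                cp = (cp << 6) | (c & 0x3F);
            }
        }
        if (ok && cp > 0x10FFFF) {
            ok = false;  // beyond the Unicode range
        }
        if (!ok) {
            out.push_back(kReplacement);
            ++i;  // skip one byte and try to resynchronize
            continue;
        }
        i += len;
        if (cp >= 0x10000) {
            // Characters outside the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(static_cast<char16_t>(cp));
        }
    }
    return out;
}
```

With this step in place, the bytes 0xC3 0xA4 from the sample file decode to the single code unit U+00E4 (`ä`) instead of two separate characters.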