sonicdoe / detect-character-encoding

Detect character encoding using ICU

Add support for use in browser #4

Closed treyhunner closed 8 years ago

treyhunner commented 8 years ago

If this worked client-side, the File API could be used to read a file and detect its encoding locally, which would avoid the need to upload anything to a server.

It looks like this relies on C++ code. Any chance this could be turned into a fully JavaScript-powered library in the future?
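To make the client-side idea concrete, here is a minimal sketch of reading the first bytes of a file in the browser and checking them locally. It only sniffs a byte-order mark (BOM), which is far simpler than ICU's statistical detector and returns nothing for files without a BOM. The names `detectBom`, `detectFileBom`, and `BlobLike` are hypothetical and are not part of detect-character-encoding.

```typescript
// Sketch only: BOM sniffing, not ICU-style statistical detection.
// All names here (BomEncoding, BlobLike, detectBom, detectFileBom) are
// illustrative assumptions, not part of any existing library.

type BomEncoding = "UTF-8" | "UTF-16LE" | "UTF-16BE" | "UTF-32LE" | "UTF-32BE";

// Structural stand-in for the browser's Blob/File, so the sketch does not
// depend on DOM typings. A File from <input type="file"> satisfies it.
interface BlobLike {
  slice(start: number, end: number): BlobLike;
  arrayBuffer(): Promise<ArrayBuffer>;
}

// The 4-byte UTF-32LE signature is listed before the 2-byte UTF-16LE one,
// because the UTF-16LE BOM is a prefix of it. (A UTF-16LE file whose first
// character is U+0000 would therefore be reported as UTF-32LE.)
const BOMS: ReadonlyArray<[BomEncoding, number[]]> = [
  ["UTF-32LE", [0xff, 0xfe, 0x00, 0x00]],
  ["UTF-32BE", [0x00, 0x00, 0xfe, 0xff]],
  ["UTF-8", [0xef, 0xbb, 0xbf]],
  ["UTF-16LE", [0xff, 0xfe]],
  ["UTF-16BE", [0xfe, 0xff]],
];

// Returns the encoding implied by a leading BOM, or null if there is none.
function detectBom(bytes: Uint8Array): BomEncoding | null {
  for (const [encoding, signature] of BOMS) {
    const matches =
      bytes.length >= signature.length &&
      signature.every((byte, i) => bytes[i] === byte);
    if (matches) {
      return encoding;
    }
  }
  return null;
}

// Reads only the first four bytes of the file; nothing leaves the machine.
async function detectFileBom(file: BlobLike): Promise<BomEncoding | null> {
  const head = await file.slice(0, 4).arrayBuffer();
  return detectBom(new Uint8Array(head));
}
```

In a page, `detectFileBom` could be called with the `File` a user picks in an `<input type="file">` element. A `null` result means only that no BOM was found, so a real detector would still need to inspect the content itself.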

sonicdoe commented 8 years ago

My main goal with this library was to take advantage of ICU because it seems to be the most widely tested charset detector.

I don’t think there’s an easy way to make ICU’s detector work client-side, so I’d suggest using a simpler charset detector, such as your own. Would this work for you?

treyhunner commented 8 years ago

I understand. Mine isn't sophisticated enough. :disappointed:

FYI, I found jschardet, which claims to be a port of Python's chardet, a library that also detects character encodings. I haven't played with it yet, but it should work in the browser. Thanks!