BYVoid / uchardet

An encoding detector library ported from Mozilla

document the difference between this and libchardet #11

Closed infinity0 closed 8 years ago

infinity0 commented 9 years ago

While reviewing bomi, I noticed that it uses libchardet, which, like this library, is based on Mozilla's code. I see that the public APIs of the two projects differ as well; however, having two copies of the same code is not great for the FOSS community in general.

Could anyone give a more detailed account of the differences, and maybe merge the two libraries? For example, which version of Mozilla's code this library contains, the history of both codebases, how easy it would be to merge the two, etc.

BYVoid commented 9 years ago

Thanks. I am not familiar with libchardet, but it seems like a new project. uchardet is comparatively older and not yet up to date. I will consider collaborating with libchardet.

Jehan commented 8 years ago

Hi,

I had a look, and though the main page has a 2015 copyright, the code does not seem to have been updated for nearly 2 years (the last commit's age is "641d 09h" at the time of writing, which I assume means 641 days old). So it is not such a new project after all.

Apart from this, looking at the main page and the header (http://svn.oops.org/wsvn/OOPS.libchardet/trunk/src/chardet.h), they seem to have two ways to detect an encoding: one with a single call (though it still needs init() and free() calls, which goes against the whole point), and one that feeds data over potentially several calls (same as us), though I don't really understand why they create two internal objects for this. They also have an API that returns the version of the library.

In the end, it seems similar to ours, except that they don't use clearly namespaced naming (they don't prefix their functions with some recognizable pattern like chardet_ or whatever), which can be quite a problem on big projects because of name clashes.

I've added libchardet in our list of "Related Projects".

Jehan commented 8 years ago

Also, I've checked their list of commits and they don't seem to have any significant fix or feature that we could use in uchardet. There was one commit saying it fixes TIS-620, but checking its contents, it seems TIS-620 was just not activated in their code, that's all. I've also created a TIS-620 test file and it works well with uchardet.

This concludes this ticket, which I will now close.