BYVoid / uchardet

An encoding detector library ported from Mozilla

Future roadmap? #12

Closed cicku closed 8 years ago

cicku commented 9 years ago

Hi Carbo,

It's been a long time, so I'd like to ask about the future roadmap here. mpv and other music players would like to support special charsets via this lib, but as far as I can see from the commit history, it is not being developed.

Opinions?

cicku commented 9 years ago

Or could you add some people with commit access? I'd like to do some cleanup.

cicku commented 9 years ago

@Jehan

Jehan commented 9 years ago

It is indeed already in mpv tree and is planned to go into GtkSourceView tree (hence gedit, and probably other software using such lib) after the GNOME freeze: https://bugzilla.gnome.org/show_bug.cgi?id=669448

I understand this is not actively developed, since upstream (Mozilla) does not work much on it anymore either. But at least minimal maintenance would be nice, until the FLOSS world gets a better solution for encoding detection (the reasoning being that, after research and tests, this Mozilla algorithm/lib still seems like the best one available). If you can't maintain this anymore, it would indeed be cool to have a few people able to review and commit patches. :-)

BYVoid commented 8 years ago

Frankly speaking, this project has not been active for years. I would be very glad if more people would like to join the maintenance of this project.

Jehan commented 8 years ago

@cicku: @BYVoid added me as co-maintainer. So I think the next move is to make a real release, just to have a fixed point in time (because it looks like uchardet never had a real release done).

I don't plan on working much on features though. I just want uchardet to be maintained well enough until either we get a new person interested in improving it, or we get an alternative which works better. Unfortunately the landscape for language detection in Free Software is quite barren.

cicku commented 8 years ago

@Jehan Agreed. I think the first release should be 0.0.1, or something larger but not too far from reality: the code is unmaintained and may have been outdated for a while.

Jehan commented 8 years ago

The current dev version has been named 0.0.1 since 2013 (when a README got added), without specifying an exact point in the commits. Consequently, distributions which created a package have already used this version number everywhere, but with older code. For instance, Debian and its derivatives seem to have last pulled on 2011-07-13 (if I understand well how their system works, where they use this repo), which probably corresponds to commit 56a4c0d. Fedora will apparently use commit 84e292d, which is recent but not the last one. Mageia is referencing the old Google Code repo, in which I actually discovered there used to be a real release called 0.0.1!

So no, I don't think the next release should be 0.0.1, because that number has already been spoiled: it means something different for every project out there. :-)

I'm wondering if we should not jump directly to 0.1. Not that there is much difference between 0.0.1 and now, but having a dependency at version 0.0.2 does not inspire trust! :P

cicku commented 8 years ago

Give 0.0.2 a chance, then 0.1 if it goes OK. :joy:

Jehan commented 8 years ago

@cicku You are right. Anyway, there is so little difference that jumping to 0.1 may not make much sense. Release v0.0.2 created!

While browsing the code the other day, I also discovered that there is hinting in the original C++ code from Mozilla; the uchardet wrapper simply does not make use of it! I know some software projects consider the lack of hinting in uchardet a blocker (for instance VLC). This should be quite an easy feature to add to our wrapper, which could make for a 0.1 some time in the future. :-)