nortonandrews / kikoeru

Self-hosted web media player for DLsite voice works.
GNU General Public License v3.0

hashNameIntoInt() hash collision #19

Open umonaca opened 3 years ago

umonaca commented 3 years ago

hashNameIntoInt('かの仔') returns 12285. hashNameIntoInt('こっこ') returns 12285 as well. Thus the two Voice Actors are incorrectly identified as the same person.

NANELLON commented 3 years ago

Suggest removing https://github.com/nortonandrews/kikoeru/blob/3a190b729ed03a212cd24711ccb3adcfec53a443/src/server/hvdb.js#L20 as it is just truncating 3 digits off the end of the hash which is causing collisions. Other option: just get resh to expose the id, would be a small change on his end.

Unfortunately, whichever option is used, it will require either that users delete their SQLite database and do a full re-scan, or that we add some kind of migration that rehashes CV names when a user updates to the latest version.
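To illustrate why dropping digits from a hash invites collisions, here is a minimal sketch. It does not reproduce the project's `hashNameIntoInt()`: `fullHash`, `truncatedHash`, and the `cv0000`-style names are all hypothetical stand-ins, and the only point is that cutting digits off the end maps many distinct hashes onto the same value.

```javascript
// Hypothetical stand-in hash (31-multiplier, unsigned 32-bit).
// This is NOT the project's hashNameIntoInt().
function fullHash(name) {
  let hash = 0;
  for (const ch of name) {
    hash = (Math.imul(hash, 31) + ch.codePointAt(0)) >>> 0;
  }
  return hash;
}

// Drop `dropped` decimal digits from the end, keeping at least one digit.
function truncatedHash(name, dropped = 3) {
  const digits = String(fullHash(name));
  return Number(digits.slice(0, Math.max(1, digits.length - dropped)));
}

// Made-up names: cv0000 ... cv1999.
const names = Array.from({ length: 2000 }, (_, i) => `cv${String(i).padStart(4, '0')}`);

const fullCount = new Set(names.map(fullHash)).size;
const truncatedCount = new Set(names.map((n) => truncatedHash(n))).size;

console.log(`distinct full hashes:      ${fullCount}`);
console.log(`distinct truncated hashes: ${truncatedCount}`);
```

For these names every full hash is distinct, while the truncated values collapse into far fewer buckets. Removing the truncation reduces collisions but cannot rule them out, which is why exposing a real ID upstream is the more robust of the two options.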

umonaca commented 3 years ago

> Suggest removing
> https://github.com/nortonandrews/kikoeru/blob/3a190b729ed03a212cd24711ccb3adcfec53a443/src/server/hvdb.js#L20
> as it is just truncating 3 digits off the end of the hash which is causing collisions. Other option: just get resh to expose the id, would be a small change on his end.
>
> Unfortunately, whichever option is used, it will require either that users delete their SQLite database and do a full re-scan, or that we add some kind of migration that rehashes CV names when a user updates to the latest version.

You may want to have a look at my repo:
https://github.com/kikoeru-project/kikoeru-express
It is a rework of nortonandrews/kikoeru and it is fully backward compatible and well maintained.
Besides, the corresponding frontend is https://github.com/kikoeru-project/kikoeru-quasar, which is made with Vue.js and Quasar framework.
The frontend is packaged together with the backend into single executables for Windows, Linux, and macOS. The program is also available on DockerHub, and the Docker image supports AMD64/ARM64/ARMv7 architectures.
With kikoeru-project you can scan and refresh metadata of all works directly from the Web UI. It supports multiple libraries and supports multiple users. It also supports marking progress (want to listen, listening, listened, abandoned, etc.) and writing reviews. Someone even set up a website with kikoeru-project, although I would recommend against doing so.

The main problem with kikoeru-project is that I have not added i18n functionality to it, since most users of that project understand only Japanese and Chinese. If any English speaker is interested in an English version, please tell me, and I will make it available someday. Also, if anyone is interested, please tell me where you talk with each other about DLsite voice works, the name of or link to that community, and roughly how many people are interested in this project.

NANELLON commented 3 years ago

Yeah, I found that repo recently and see that you get around the issue by generating a uuid instead. It's unfortunate that the repo wasn't forked from here, so it's harder to merge back. Regarding English i18n, I'll take a look and open an issue if it seems like people will be interested.
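As a rough idea of what a UUID-based workaround can look like, here is a sketch. It assumes Node.js 14.17 or later for `crypto.randomUUID()`. The `vaIdsByName` map and `getOrCreateVaId` helper are hypothetical; this is not the kikoeru-express code, and the thread does not say whether that repo uses random or name-derived UUIDs.

```javascript
const { randomUUID } = require('crypto');

// Hypothetical in-memory lookup: voice actor name -> surrogate ID.
// A real implementation would persist this mapping in the database.
const vaIdsByName = new Map();

// Return the existing ID for a name, or assign a fresh random UUID.
function getOrCreateVaId(name) {
  if (!vaIdsByName.has(name)) {
    vaIdsByName.set(name, randomUUID());
  }
  return vaIdsByName.get(name);
}

const idA = getOrCreateVaId('かの仔');
const idB = getOrCreateVaId('こっこ');
console.log(idA, idB);
```

Because the ID no longer depends on a short numeric hash of the name, the two names from the original report end up with separate IDs, and asking for the same name again returns the ID already assigned.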

umonaca commented 3 years ago

> Yeah, I found that repo recently and see that you get around the issue by generating a uuid instead.

Actually, I don't like this fix. If I rewrote the code someday, I would replace the surrogate key with a natural key, i.e. remove the id column entirely and use the name itself as the key. But that would require too much refactoring, so I was feeling lazy and just monkey patched it with UUIDs.
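A natural-key design could be sketched roughly as follows. Everything here is hypothetical (the `voiceActors` set, the `workVoiceActors` rows, the `RJ000001`-style work IDs) and uses in-memory structures in place of database tables. The idea is only that the voice actor's name is itself the key, so no derived ID exists that could collide.

```javascript
// Hypothetical in-memory "tables" keyed by the voice actor's name.
const voiceActors = new Set(); // stands in for a table whose only key is the name
const workVoiceActors = [];    // join rows: { workId, vaName }

// Link a work to a voice actor, creating the voice actor if needed.
function linkVoiceActor(workId, vaName) {
  voiceActors.add(vaName);
  workVoiceActors.push({ workId, vaName });
}

// Look up all works for a voice actor directly by name.
function worksByVoiceActor(vaName) {
  return workVoiceActors
    .filter((row) => row.vaName === vaName)
    .map((row) => row.workId);
}

// Made-up work IDs, for illustration only.
linkVoiceActor('RJ000001', 'かの仔');
linkVoiceActor('RJ000002', 'こっこ');

console.log(worksByVoiceActor('かの仔'));
console.log(worksByVoiceActor('こっこ'));
```

The trade-off is that every table referencing a voice actor has to store the name instead of an ID, which is where the refactoring cost mentioned above comes from.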