beville / ComicStreamer

Apache License 2.0

Comic identifier #40

Open ghost opened 8 years ago

ghost commented 8 years ago

I am looking for a way to find duplicates regardless of the archive type.

I am thinking about a unique hash of the file sizes of all the pages, so that moved comics can be re-enabled in the library instead of being deleted. As a bonus, I could create a list of duplicates, which would probably save me a lot of gigabytes.

Does anybody have any thoughts on that?
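
To make the idea concrete, here is a minimal sketch of a size-based fingerprint. It is only an illustration under assumptions, not ComicStreamer code: the names `page_sizes`, `fingerprint_from_sizes`, and `IMAGE_EXTENSIONS` are made up, it reads only zip-based archives (CBZ) through the standard `zipfile` module, and a RAR-based archive would need a separate reader that yields the same list of uncompressed page sizes.

```python
import hashlib
import zipfile

# Assumption: these extensions are what counts as a "page".
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".webp")


def page_sizes(cbz_path):
    """Return the uncompressed sizes of the image pages, ordered by page name."""
    with zipfile.ZipFile(cbz_path) as archive:
        pages = [
            info for info in archive.infolist()
            if info.filename.lower().endswith(IMAGE_EXTENSIONS)
        ]
    pages.sort(key=lambda info: info.filename.lower())
    # file_size is the uncompressed size, so the compression level of the
    # container does not influence the result.
    return [info.file_size for info in pages]


def fingerprint_from_sizes(sizes):
    """Hash the ordered list of page sizes into a hex digest."""
    joined = ",".join(str(size) for size in sizes)
    return hashlib.sha1(joined.encode("ascii")).hexdigest()
```

Because only the image entries are considered, adding or editing a metadata file inside the archive would not change the fingerprint. Two different comics whose pages happen to have identical sizes would collide, so this is a cheap first filter rather than proof of identity.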

ghost commented 7 years ago

I actually implemented it, somewhat differently than described above, and for fun ran it on a remote 50k database: it found 8% duplicates.

I am now adding an option to tag the files (probably by adding the text "double" to the filename) and an option to scan a folder without actually adding it to the database, tagging the duplicates found in it.

That should make comic maintenance a little bit easier.
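
As a rough sketch of the duplicate listing and filename tagging described here: the helpers `find_duplicates` and `tagged_name` are hypothetical names, the input is assumed to be a plain mapping from file path to fingerprint string (however that fingerprint was computed), and the exact tag format is a guess.

```python
import os
from collections import defaultdict


def find_duplicates(fingerprints):
    """Group paths that share a fingerprint.

    `fingerprints` maps a file path to its fingerprint string.
    Returns one sorted list of paths per fingerprint seen more than once.
    """
    groups = defaultdict(list)
    for path, fingerprint in fingerprints.items():
        groups[fingerprint].append(path)
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]


def tagged_name(path, tag="double"):
    """Return the path with the tag inserted before the file extension."""
    root, ext = os.path.splitext(path)
    return f"{root} ({tag}){ext}"
```

For example, `tagged_name("comics/Issue 01.cbz")` gives `comics/Issue 01 (double).cbz`. Scanning a folder without adding it to the database would amount to building the same mapping for that folder and comparing it against the stored fingerprints.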

Library integrity checking is now possible too, but not implemented yet: scan all the files and check that they still match the stored fingerprint, to guard against bit rot and corruption. The cache, favorites, and so on now also survive moving, renaming, and retagging (the exception, of course, is if you add, resize, convert, or delete any of the images). It probably enables a bunch of other cool stuff as well.
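
A possible shape for such an integrity check, as a sketch only. The thread does not say how the fingerprint is actually computed, so this assumes a content-based one (a hash over the page bytes), since a fingerprint built from sizes alone would miss corruption that leaves the sizes unchanged. `content_fingerprint` and `is_intact` are hypothetical names, and only zip-based archives are handled.

```python
import hashlib
import zipfile
import zlib

# Assumption: these extensions are what counts as a "page".
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".webp")


def content_fingerprint(cbz_path):
    """Hash the uncompressed bytes of every image page, ordered by page name."""
    digest = hashlib.sha256()
    with zipfile.ZipFile(cbz_path) as archive:
        names = sorted(
            (n for n in archive.namelist() if n.lower().endswith(IMAGE_EXTENSIONS)),
            key=str.lower,
        )
        for name in names:
            # Feed a per-page digest so page boundaries affect the result.
            digest.update(hashlib.sha256(archive.read(name)).digest())
    return digest.hexdigest()


def is_intact(cbz_path, stored_fingerprint):
    """Return True if the archive still matches the fingerprint stored earlier."""
    try:
        return content_fingerprint(cbz_path) == stored_fingerprint
    except (zipfile.BadZipFile, zlib.error, OSError):
        # Unreadable, damaged, or missing archives count as failed checks.
        return False
```

Since only the image entries feed the hash, renaming the archive, moving it, or editing its metadata file leaves the fingerprint unchanged, while adding, deleting, or re-encoding a page changes it, which matches the exceptions listed above.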

Comic fingerprinting :-)