hydrusnetwork / hydrus

A personal booru-style media tagger that can import files and tags from your hard drive and popular websites. Content can be shared with other users via user-run servers.
http://hydrusnetwork.github.io/hydrus/

Add duplicate filter score weight for lower filesize in pixel-for-pixel duplicates #822

Open roachcord3 opened 3 years ago

roachcord3 commented 3 years ago

Right now we have weights for higher quality, higher resolution, higher filesize, more tags, and earlier import time.

But when it comes to pixel-for-pixel dupes, the first two don't matter at all, and when it comes to filesize, you pretty much always want the smaller file, not the larger one (to deal with stuff like poorly compressed PNGs).

Please add a weight for lower filesize in pixel-for-pixel duplicates, even trivially lower filesize. And please make sure the higher filesize weight does not apply.

Related: #745 since it would help highlight the size difference in the UI.

edit: after spending a lot of time with negative values in the weights since that feature was added, I've found that the smaller file isn't necessarily better. Regardless, I still think a separate set of weights for pixel-for-pixel dupes would be beneficial. Most of the time, a significantly larger pixel-for-pixel dupe is inferior, but that isn't true of dupes that are not pixel-for-pixel, especially when comparing JPEGs. On the other hand, trivially larger pixel-for-pixel dupes are (in my current, not past, opinion) usually superior, whereas with non-pixel-for-pixel dupes it's a toss-up (again, comparing JPEGs, I've found that a trivially larger one might have worse-looking compression anyway).
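
To make the request concrete, here is a minimal Python sketch of what "a separate set of weights for pixel-for-pixel dupes" could look like. It is not hydrus's actual scoring code: `FileInfo`, `Weights`, `score`, `compare`, and the weight values are all hypothetical names and numbers chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    # Hypothetical record, not hydrus's real media object.
    size_bytes: int
    num_tags: int
    import_timestamp: int


@dataclass
class Weights:
    # Hypothetical weight set; values are arbitrary illustrative numbers.
    larger_filesize: int = 0
    smaller_filesize: int = 0
    more_tags: int = 0
    earlier_import: int = 0


def _sign(x: int) -> int:
    """Return -1, 0, or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def score(a: FileInfo, b: FileInfo, w: Weights) -> int:
    """Score `a` relative to `b`; a positive result means `a` looks better."""
    size = _sign(a.size_bytes - b.size_bytes)  # +1 if `a` is the larger file
    return (
        size * w.larger_filesize
        - size * w.smaller_filesize
        + _sign(a.num_tags - b.num_tags) * w.more_tags
        + _sign(b.import_timestamp - a.import_timestamp) * w.earlier_import
    )


def compare(a: FileInfo, b: FileInfo, pixel_identical: bool,
            normal: Weights, pixel: Weights) -> int:
    """Pick the weight set based on whether the pair is pixel-for-pixel."""
    return score(a, b, pixel if pixel_identical else normal)


# Assumed example settings: the "higher filesize" weight is simply absent
# from the pixel-for-pixel set, so it cannot apply to those pairs.
NORMAL = Weights(larger_filesize=10, more_tags=8, earlier_import=4)
PIXEL = Weights(smaller_filesize=10, more_tags=8, earlier_import=4)

bloated_png = FileInfo(size_bytes=5_000_000, num_tags=3, import_timestamp=100)
compact_png = FileInfo(size_bytes=1_000_000, num_tags=3, import_timestamp=100)

print(compare(bloated_png, compact_png, True, NORMAL, PIXEL))   # -10
print(compare(bloated_png, compact_png, False, NORMAL, PIXEL))  # 10
```

With two weight sets, the same pair scores in opposite directions depending on whether it was detected as pixel-for-pixel, so each case can be tuned without affecting the other.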

Zweibach commented 3 years ago

Just being able to input negative values into the weights would go some way toward this.

roachcord3 commented 3 years ago

@Zweibach negative weights for larger filesize would be worse for files that aren't pixel-for-pixel dupes, at least for me, so it would just be trading one problem for another. That said, it seems like an easy addition, so I hope dev does that too. It just won't help close this feature request at all.
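
The trade-off described above can be shown with a small Python sketch. It assumes a single shared "larger filesize" weight applied to every pair; `filesize_score` and the numbers are hypothetical and do not reflect how hydrus computes its scores.

```python
def filesize_score(size_a: int, size_b: int, larger_filesize_weight: int) -> int:
    """Contribution of the filesize weight to file A's score against file B."""
    if size_a == size_b:
        return 0
    # The larger file receives the weight; a negative weight makes it a penalty.
    return larger_filesize_weight if size_a > size_b else -larger_filesize_weight


# One shared weight, set negative to favour smaller files (assumed value).
shared_weight = -10

# Pixel-for-pixel pair: the bloated PNG is penalised, which is the goal.
pixel_pair = filesize_score(5_000_000, 1_000_000, shared_weight)

# Non-pixel-for-pixel JPEG pair: the larger file is penalised in exactly the
# same way, because the weight cannot tell the two cases apart.
jpeg_pair = filesize_score(900_000, 300_000, shared_weight)

print(pixel_pair, jpeg_pair)  # -10 -10
```

Because the one weight cannot distinguish the two kinds of pair, fixing the pixel-for-pixel case this way changes the outcome for every other pair as well.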