Bionus / imgbrd-grabber

Very customizable imageboard/booru downloader with powerful filenaming features.
https://www.bionus.org/imgbrd-grabber/
Apache License 2.0

Duplicate files still being downloaded despite MD5 checker active #2951

Open FuzzyWuggzy opened 1 year ago

FuzzyWuggzy commented 1 year ago

I recently started using the MD5 checker during batch downloads so that I wouldn't have to use external tools, but I've discovered that with some sources it doesn't skip an identical post and downloads it anyway. There are no filename conflicts, as I use the %website% token.

A quick example: searching md5:6024ece49f9e4c8067692ee956c55a1f returns the file on three sites: konachan.com, tbib.org (a non-standard site) and yande.re. Grabber downloads the post from the first site, fails the MD5 check and downloads the post from the second, then finally acknowledges that the third is identical and skips it. This seems to be an issue with sources like tbib.org.
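To make the expected behaviour concrete, here is a rough Python sketch of how I'd expect an MD5-based skip to work across sources. This is not Grabber's actual code: `plan_downloads` and the post dictionaries are hypothetical names made up for illustration, and the idea that a source might not report an MD5 before download is only an assumption on my part.

```python
# Hypothetical sketch of MD5-based duplicate skipping (not Grabber's code).

def plan_downloads(posts):
    """Split posts into sites to download and sites to skip.

    A post is skipped only if its MD5 is known up front and has
    already been seen on an earlier post.
    """
    seen = set()
    to_download, skipped = [], []
    for post in posts:
        md5 = post.get("md5")  # assumed to be None if the source reports no MD5
        if md5 is not None and md5 in seen:
            skipped.append(post["site"])
            continue
        to_download.append(post["site"])
        if md5 is not None:
            seen.add(md5)
    return to_download, skipped


MD5 = "6024ece49f9e4c8067692ee956c55a1f"
posts = [
    {"site": "konachan.com", "md5": MD5},
    {"site": "tbib.org", "md5": MD5},
    {"site": "yande.re", "md5": MD5},
]
print(plan_downloads(posts))
```

With all three MD5s known, only konachan.com would be downloaded. If the tbib.org entry carried no MD5 (`None` in this sketch), the result would match what I'm seeing: the first two sites are downloaded and only yande.re is skipped.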

Here's a log of the download:

Grabber_log.txt

Thank you

Bionus commented 1 year ago

I see two main issues that seem to be causing your problem here:

Any reason you overrode the default tbib.org settings to use HTML over XML?