hydrusvideodeduplicator / hydrus-video-deduplicator

Video Deduplicator for the Hydrus Network
https://hydrusvideodeduplicator.github.io/hydrus-video-deduplicator/
MIT License

Keep track of comparisons / Avoid recomparing files #38

Open Stealthbird97 opened 1 year ago

Stealthbird97 commented 1 year ago

Hi, I apologise if I am simply misunderstanding how the dupe search process actually operates. This issue is based purely on what I assume is happening.

I've noticed that each time you run the program (or at the very least if you Ctrl + C it and then rerun it), the dupe search seemingly starts from the beginning, even though dupes are sent to Hydrus as and when they are found. So if you get halfway through and you cancel or it crashes, you have to go through all the files you've already checked again. I don't think I need to explain why this is not particularly time efficient.

Is there some way that you can track which files have already been compared, so that the dupe processing only compares previously uncompared pairs?

I appreciate that this might require a lot more space in the database, but when you have a lot of files you'd really like to dupe search, the existing mechanism will require me to run the program non-stop with essentially limited use of my computer for weeks.
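
To illustrate the kind of tracking meant here, below is a rough sketch and not how the program currently works. It assumes each file is identified by a hash string and that progress is kept in a small SQLite table. The names `open_progress_db`, `mark_compared`, `pending_pairs` and the `compared_pairs` table are all hypothetical and do not come from the project.

```python
import sqlite3
from itertools import combinations


def open_progress_db(path=":memory:"):
    """Open (or create) a hypothetical progress database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS compared_pairs ("
        " hash_a TEXT NOT NULL,"
        " hash_b TEXT NOT NULL,"
        " PRIMARY KEY (hash_a, hash_b))"
    )
    return conn


def _ordered(a, b):
    # Store each pair in one canonical order so (a, b) and (b, a) match.
    return (a, b) if a < b else (b, a)


def mark_compared(conn, a, b):
    """Record that files a and b have been compared."""
    conn.execute(
        "INSERT OR IGNORE INTO compared_pairs VALUES (?, ?)", _ordered(a, b)
    )
    conn.commit()


def pending_pairs(conn, file_hashes):
    """Yield only the pairs that have no record in the database yet."""
    for a, b in combinations(sorted(set(file_hashes)), 2):
        row = conn.execute(
            "SELECT 1 FROM compared_pairs WHERE hash_a = ? AND hash_b = ?",
            (a, b),
        ).fetchone()
        if row is None:
            yield a, b


if __name__ == "__main__":
    conn = open_progress_db()
    mark_compared(conn, "b", "a")  # pretend an earlier run compared a and b
    for pair in pending_pairs(conn, ["a", "b", "c"]):
        print(pair)
```

Pointing `open_progress_db` at a file path instead of `:memory:` would let the record survive a Ctrl + C or a crash. The table grows with the number of pairs compared, which is the extra database space mentioned above.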

Stealthbird97 commented 8 months ago

I was using 0.4.1, so given the changelogs, I don't think the behaviour I experienced has been corrected. If you're making updates which improve performance, that will certainly help (it would be good to see some hardware acceleration if that is on the cards!)