hydrusvideodeduplicator / hydrus-video-deduplicator

Video Deduplicator for the Hydrus Network
https://hydrusvideodeduplicator.github.io/hydrus-video-deduplicator/
MIT License
41 stars 7 forks

Support Deferred Sending of Duplicate Results #33

Open prof-m opened 1 year ago

prof-m commented 1 year ago

it's me, ya boy, back again with the potential dupes queue 👉🏼 👉🏼 🕶️

Feature

The sending of found duplicates to the Hydrus Client is deferred until the end of the duplicate search process. If any duplicates cannot be sent to the Hydrus Client at that time (e.g. if the client is offline), those duplicates are saved offline for sending in the future.

In other words, put your duplicates in a queue for safekeeping
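
To make the idea concrete, here is a minimal sketch of what such a queue could look like. It is not the project's actual implementation: the class name `DeferredDuplicateQueue`, the JSON file used for offline storage, and the `send_pair` callable (a stand-in for whatever code talks to the Hydrus Client) are all hypothetical, and the sketch assumes an unreachable client surfaces as a `ConnectionError`.

```python
import json
from pathlib import Path
from typing import Callable

Pair = tuple[str, str]  # two file hashes judged to be potential duplicates


class DeferredDuplicateQueue:
    """Hypothetical queue: collect pairs during the search, send them afterwards."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.pending: list[Pair] = []
        # Reload anything a previous run could not deliver.
        if path.exists():
            self.pending = [tuple(p) for p in json.loads(path.read_text())]

    def add(self, hash_a: str, hash_b: str) -> None:
        """Record a potential duplicate pair without contacting the client."""
        self.pending.append((hash_a, hash_b))

    def flush(self, send_pair: Callable[[str, str], None]) -> int:
        """Try to send every pending pair and keep the ones that fail.

        Returns the number of pairs sent successfully.
        """
        unsent: list[Pair] = []
        sent = 0
        for hash_a, hash_b in self.pending:
            try:
                send_pair(hash_a, hash_b)
                sent += 1
            except ConnectionError:
                # Assumption: an offline client shows up as ConnectionError.
                unsent.append((hash_a, hash_b))
        self.pending = unsent
        # Persist leftovers (possibly an empty list) so a later run can retry.
        self.path.write_text(json.dumps(self.pending))
        return sent
```

With something like this, the search loop would only call `add()`, and `flush()` would run once at the end. Pairs that fail to send stay in the file and are picked up the next time the queue is constructed.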

Rationale

Easy stuff

For example, let's take a database with 10,000 videos already phashed and dupe searched in it. The program is run, and finds 50 new videos in the client, all with similar durations to the videos already phashed. It phashes the 50 videos and adds them to the database. Then, it compares each of the 10,050 videos against the 50 new videos. Even if comparing a single video against the 50 new videos is a pretty quick process (and it is, thanks to parallelizing), comparing every single phashed video against the 50 new videos takes a lot longer than just phashing the 50 new videos.
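
The numbers in that example can be worked out with a few lines. This is only a back-of-the-envelope count that takes the description above literally (every video in the database compared against every new video); the function name is made up for illustration, and the real search may skip some of these pairs.

```python
def comparison_count(already_hashed: int, newly_hashed: int) -> int:
    # Every video in the database after hashing (old + new) is compared
    # against each of the new videos, as described above.
    total_videos = already_hashed + newly_hashed
    return total_videos * newly_hashed


hash_jobs = 50  # one phash per new video
comparisons = comparison_count(10_000, 50)
print(hash_jobs, comparisons)  # 50 502500
```

So the run does 50 phashes but on the order of half a million comparisons, which is why the search phase dominates the runtime in this scenario.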

Considerations

PR: #34