Farmadupe / vid_dup_finder

vid_dup_finder
Apache License 2.0

Inconsistent results ("Too short" messages) #6

Open ShareBugreports opened 10 months ago

ShareBugreports commented 10 months ago

I noticed some invalid "Too short" messages. I have a directory with a lot of non-movie files in it, including the source and compiled files of vid_dup_finder.

I ran the scan several times, increasing the log file number each time, like this:

rm /home/test/.cache/vid_dup_finder/vid_dup_finder_cache.bin
./target/release/vid_dup_finder --files ~/TEMP/fingerprint/ 2>&1 |tee -a log5

Then I extracted the "Too short" messages from each log (I redacted the output, using the same replacements for each file):

number=5; cat log${number} |grep short |sed 's|.*Too short : ||' |sort > log${number}_parsed.log
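The two steps above (delete the cache, scan, then extract the "Too short" lines) can be wrapped in one function so that every run is done the same way. This is only a rough sketch: the function name run_scan and the BIN, CACHE and SCAN_DIR variables are made up for illustration, their defaults are the paths from the commands above, and the "Too short : " prefix is assumed to match the log output exactly as in the sed expression.

```shell
#!/bin/sh
# Hypothetical wrapper around the manual steps above; names are illustrative.
BIN="${BIN:-./target/release/vid_dup_finder}"
CACHE="${CACHE:-$HOME/.cache/vid_dup_finder/vid_dup_finder_cache.bin}"
SCAN_DIR="${SCAN_DIR:-$HOME/TEMP/fingerprint/}"

run_scan() {
    number="$1"
    # Start every run from an empty cache so earlier results cannot leak in.
    rm -f "$CACHE"
    # Capture stdout and stderr of the scan in log<number>.
    "$BIN" --files "$SCAN_DIR" 2>&1 | tee -a "log${number}"
    # Keep only the file names from the "Too short" lines, sorted for later comparison.
    grep short "log${number}" | sed 's|.*Too short : ||' | sort > "log${number}_parsed.log"
}

# Example (not executed here): for n in 5 6 7; do run_scan "$n"; done
```

Running it in a loop produces log5, log6, ... and the matching logN_parsed.log files without retyping the commands.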

For example, see "log5_compared_to_6_and_7.png". In run 5 I got a "Too short" message for "[rarbg]/.mp4". This did not happen in run 6, where I got an "inserting" message for the same file.

The same happens in "log8_compared_to_9_and_10.png": "cache2/.mp4" failed in run 8 but was inserted in run 9.
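The differences shown in the screenshots can also be listed on the command line. A minimal sketch, assuming the logN_parsed.log files were written with sort as above (comm needs sorted input); the function name only_in_first is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print entries reported as "Too short" in the first run
# but not in the second one.
only_in_first() {
    # comm -23 hides lines unique to the second file and lines common to both,
    # so only lines unique to the first file remain.
    comm -23 "$1" "$2"
}

# Only runs when both parsed logs exist in the current directory.
if [ -f log5_parsed.log ] && [ -f log6_parsed.log ]; then
    only_in_first log5_parsed.log log6_parsed.log
fi
```

Any file name printed here was rejected as "Too short" in run 5 but not in run 6, which is the inconsistency described above.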

As the order of the scanned files is different on every run, I think some variable in the loop is the cause. Or ffmpeg gets confused by all the non-movie files.

log5_compared_to_6_and_7 log8_compared_to_9_and_10 log5_parsed.log log6_parsed.log log7_parsed.log log8_parsed.log log9_parsed.log log10_parsed.log