Closed prof-m closed 1 year ago
I figured this would happen, but I haven't taken the time to fix it or get a good test suite of GIFs or extremely short videos.
The problem is going to be in the main hashing function here.
They tried to fix it by checking if int frameMod = secondsPerHash * framesPerSec was 0, but didn't bother checking for the divide by zero here:
pdqHashes.push_back({pdqHash, fno, quality, (double)fno / framesPerSec})
I'm pretty sure that timestamp isn't even used so it could probably be removed, but I'll probably fix it by just checking if the seconds_per_hash < total frames / FPS and hashing every frame if it is.
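A rough sketch of what that guard could look like in Python. The function names `frame_step` and `frame_timestamp` are made up for illustration and are not part of vpdq or hvdvpdq. It also assumes the intended check is "the clip is shorter than one hashing interval", which may not match the final fix exactly:

```python
def frame_step(seconds_per_hash: float, frames_per_sec: float, total_frames: int) -> int:
    """Return N such that every Nth frame is hashed (always >= 1).

    Hypothetical helper, not the library's actual API.
    """
    # Unknown or zero frame rate (e.g. some GIFs): hash every frame.
    if frames_per_sec <= 0 or total_frames <= 0:
        return 1
    # Clip shorter than one hashing interval: hash every frame.
    if total_frames / frames_per_sec < seconds_per_hash:
        return 1
    # Otherwise hash one frame per interval, never letting the step hit 0.
    return max(1, int(seconds_per_hash * frames_per_sec))


def frame_timestamp(frame_number: int, frames_per_sec: float) -> float:
    """Timestamp in seconds, or 0.0 when the frame rate is unusable."""
    if frames_per_sec <= 0:
        return 0.0
    return frame_number / frames_per_sec
```

The point is only that both the step and the timestamp need their own zero check; guarding the step alone still leaves the division in the timestamp exposed.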
If you have time let me know if running FFprobe on the file gives you an actual FPS for the video > 0 by doing:
ffprobe -v quiet -show_streams -select_streams v:0 badgif.gif
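If it's easier to check a batch of files from a script, something like the following should work. It is a minimal sketch that assumes ffprobe is on the PATH and emits its default key=value output; `parse_ffprobe_fields` and `probe_first_video_stream` are names invented here:

```python
import subprocess


def parse_ffprobe_fields(text: str) -> dict:
    """Collect the key=value lines of ffprobe's default output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skips section markers like [STREAM], which have no "="
            fields[key.strip()] = value.strip()
    return fields


def probe_first_video_stream(path: str) -> dict:
    """Run the same ffprobe command as above and return the parsed fields."""
    result = subprocess.run(
        ["ffprobe", "-v", "quiet", "-show_streams", "-select_streams", "v:0", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_ffprobe_fields(result.stdout)
```

The fields of interest would then be `probe_first_video_stream("badgif.gif").get("avg_frame_rate")` and the matching `r_frame_rate` entry.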
You can see which file is tripping it up with --debug, and then look up the hash in Hydrus with system:hash.
Just thinking about it, GIFs should probably be hashed every frame either way. For now, if this problem is annoying, run with the query system:duration > 1s. If the duration is greater than 1s there shouldn't be an issue.
@appleappleapplenanner Ah, the joys of taking something built for one thing and using it for an entirely different thing 😂
Here's the output I got from one of the files that had the error.
Specific line I think you're looking for:
avg_frame_rate=0/0
Oh yeesh, yeah, I just looked at the code you linked in your first reply, and I see what you mean - they've got no coverage for frames per second being zero, despite covering for frameMod. Though I guess, to be fair, framesPerSecond should never be 0 for any kind of image, but it's still bad error handling on their part. Passing in the r_frame_rate as the framesPerSecond argument makes sense to me, if avg_frame_rate is going to give results like 0/0.
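For illustration, the fallback could be sketched like this. `parse_rate` and `pick_fps` are hypothetical helpers, and the input is assumed to be a dict of ffprobe stream fields holding rate strings such as 0/0:

```python
from typing import Optional


def parse_rate(rate: str) -> Optional[float]:
    """Turn an ffprobe rate string like "30000/1001" into a float.

    Returns None for unusable values such as "0/0" or an empty string.
    """
    num, sep, den = rate.partition("/")
    try:
        numerator = float(num)
        denominator = float(den) if sep else 1.0
    except ValueError:
        return None
    if numerator <= 0 or denominator <= 0:
        return None
    return numerator / denominator


def pick_fps(stream: dict) -> Optional[float]:
    """Prefer avg_frame_rate, fall back to r_frame_rate, else None."""
    for key in ("avg_frame_rate", "r_frame_rate"):
        fps = parse_rate(stream.get(key, ""))
        if fps is not None:
            return fps
    return None
```

A caller that gets None back would still need its own plan, such as hashing every frame, since neither field is guaranteed to hold a usable value.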
Should be fixed in hvdvpdq 0.0.12. Run pip3 install hvdvpdq -U to update and test please.
@appleappleapplenanner That seemed to work! I'll need to let it run on some more files before I'm positive, but I'm no longer seeing the ZeroDivision error pop up when running with --verbose. Thanks for the quick fix!
This has cleared the way for me to notice that I'm seeing a lot of instances of UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 1934: invalid continuation byte now, but I'll open a separate issue with a proper write-up for that tomorrow.
Issue already closed, but just confirming that I ran a search on all of my files and none of them failed, so I think we're most likely good here!
Testing it out on my local library, which has a nice mix of gifs and videos, and I noticed one error coming up consistently:
I took a look at all the files that error got thrown for, and they had the following things in common:
Based on the stack trace, my first thought was that the seconds_per_hash calculation was getting tripped up by files that have either 2 frames, or that have a higher framerate than number of frames. But, after taking a look at the code, I'm not so sure? Because as far as I can tell, seconds_per_hash never even gets passed into hash_file_compact() at all - the only calls to hash_file_compact() that I can see only pass in the filepath, and nothing else, so hash_file_compact() always gets called with its seconds_per_hash param set to 1.
At which point we get to the inner workings of the vpdq library itself, but I haven't looked into the code there yet to see what's making it try to divide by zero. Will update the issue with more info if I figure it out before someone else has a chance to take a look.