Farmadupe / vid_dup_finder

Apache License 2.0

Videos less than 30 sec not working #3

Open AlphaHasher opened 2 years ago

AlphaHasher commented 2 years ago

I am aware you mentioned that it does not work for videos under 30 seconds, but I wondered whether there have been any updates on this. Also, if there is anything I can help with, just let me know.

Farmadupe commented 2 years ago

Actually, I have draft code that now works with videos under 30 seconds (hopefully it will work with any video with at least 64 frames, regardless of duration).

I have also updated the algorithm itself, which now gives a much smaller Hamming distance between duplicated videos, so the new code is simply 'better' than the old code. (The new algorithm uses a three-dimensional DCT; the old algorithm used ten two-dimensional DCTs.)
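For readers unfamiliar with the idea, here is a rough sketch of how a hash based on a three-dimensional DCT and a Hamming-distance comparison could look. It is not the code from this project: the cube layout, the 4x4x4 low-frequency block, the median threshold, and all function names (`dct_1d`, `dct_3d`, `hash_cube`, `hamming_distance`) are assumptions made purely for illustration, and the DCT is a naive one rather than a fast transform.

```rust
use std::f64::consts::PI;

/// Naive O(n^2) unnormalised DCT-II of one line of samples.
fn dct_1d(input: &[f64]) -> Vec<f64> {
    let n = input.len() as f64;
    (0..input.len())
        .map(|k| {
            input
                .iter()
                .enumerate()
                .map(|(i, &x)| x * (PI / n * (i as f64 + 0.5) * k as f64).cos())
                .sum::<f64>()
        })
        .collect()
}

/// Apply the 1-D DCT along each axis of an n*n*n cube of grayscale samples.
/// Assumed layout: cube[(t * n + y) * n + x], with t the frame index.
fn dct_3d(cube: &[f64], n: usize) -> Vec<f64> {
    assert_eq!(cube.len(), n * n * n);
    let mut out = cube.to_vec();
    // Strides for the x, y and t axes respectively.
    for stride in [1, n, n * n] {
        for base in 0..n * n * n {
            // Only start a line where the coordinate on this axis is 0.
            if (base / stride) % n != 0 {
                continue;
            }
            let line: Vec<f64> = (0..n).map(|i| out[base + i * stride]).collect();
            for (i, v) in dct_1d(&line).into_iter().enumerate() {
                out[base + i * stride] = v;
            }
        }
    }
    out
}

/// Build a 64-bit hash from the 4x4x4 lowest-frequency coefficients:
/// a bit is set when its coefficient exceeds the median of that block.
fn hash_cube(cube: &[f64], n: usize) -> u64 {
    assert!(n >= 4);
    let coeffs = dct_3d(cube, n);
    let mut low = Vec::with_capacity(64);
    for t in 0..4 {
        for y in 0..4 {
            for x in 0..4 {
                low.push(coeffs[(t * n + y) * n + x]);
            }
        }
    }
    let mut sorted = low.clone();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let median = (sorted[31] + sorted[32]) / 2.0;
    low.iter()
        .enumerate()
        .fold(0u64, |h, (i, &c)| if c > median { h | (1u64 << i) } else { h })
}

/// Number of differing bits between two hashes.
fn hamming_distance(a: u64, b: u64) -> u32 {
    (a ^ b).count_ones()
}
```

In this sketch two videos would be treated as likely duplicates when `hamming_distance(hash_cube(&a, n), hash_cube(&b, n))` falls below some tolerance; the right tolerance is something to tune and is not taken from the project.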

In fact, the codebase is in a 'mostly-working' state, so I'll see if I can push it to a new branch on GitHub. I haven't worked on it much recently (it's only a hobby project), so I'm not sure when I'll put more time into finishing it off.

Here's my current tasklist. Feel free to contribute on any of these!

Necessary Tasks (mostly boring)

Useful tasks

Pie in the sky tasks

Ideas for extension

IronCraftMan commented 1 year ago

Any chance you could publish what you have? Would be great to have for others to work on, even if you can't yourself.

Farmadupe commented 1 year ago

I have created a branch, dct3d, in the vid_dup_finder_lib library. It is quite functional and is much better than v0.1.0. It may be of good enough quality to publish on crates.io.

Feel free to:

There are two choices of backend library: 1) the ffmpeg command-line interface, or 2) linking to the gstreamer shared library. gstreamer is faster, but it is probably difficult to bind on Windows, it sometimes crashes the entire process due to bugs, and it seems to cause a lot of memory fragmentation when decoding a lot of videos of various formats.

So I suggest keeping ffmpeg.
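As an illustration of what a command-line ffmpeg backend involves, here is a minimal sketch that asks an `ffmpeg` binary on the PATH for raw grayscale frames over a pipe. It is not how this library actually invokes ffmpeg: the function names (`ffmpeg_args`, `decode_gray_frames`, `split_frames`) and the choice of frame size and frame count are hypothetical, and only standard ffmpeg options are used.

```rust
use std::io;
use std::process::Command;

/// Build ffmpeg arguments that dump the first `frames` frames of `path`
/// as raw 8-bit grayscale, scaled to `side` x `side` pixels, to stdout.
fn ffmpeg_args(path: &str, side: u32, frames: u32) -> Vec<String> {
    vec![
        "-v".to_string(),
        "error".to_string(),
        "-i".to_string(),
        path.to_string(),
        "-vf".to_string(),
        format!("scale={}:{}", side, side),
        "-frames:v".to_string(),
        frames.to_string(),
        "-pix_fmt".to_string(),
        "gray".to_string(),
        "-f".to_string(),
        "rawvideo".to_string(),
        "pipe:1".to_string(),
    ]
}

/// Split a raw grayscale byte stream into frames of side*side bytes.
/// A trailing partial frame, if any, is dropped.
fn split_frames(raw: &[u8], side: u32) -> Vec<Vec<u8>> {
    let frame_len = (side * side) as usize;
    raw.chunks_exact(frame_len).map(|c| c.to_vec()).collect()
}

/// Run ffmpeg as a child process and collect the decoded frames.
fn decode_gray_frames(path: &str, side: u32, frames: u32) -> io::Result<Vec<Vec<u8>>> {
    let output = Command::new("ffmpeg")
        .args(ffmpeg_args(path, side, frames))
        .output()?;
    if !output.status.success() {
        let msg = String::from_utf8_lossy(&output.stderr).into_owned();
        return Err(io::Error::new(io::ErrorKind::Other, msg));
    }
    Ok(split_frames(&output.stdout, side))
}
```

Because ffmpeg runs in a separate process here, a decoder crash surfaces as an error return rather than taking down the caller, which is one practical argument for the command-line route.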

Remaining tasks on my list are:

P.S. The vid_frame_iter crate may also be suitable to release, as many people on Reddit ask for a simple interface to decode videos from gstreamer. It is memory-safe and zero-copy.

https://github.com/Farmadupe/vid_dup_finder_lib/tree/dct3d

Th3EvilGod commented 9 months ago

@Farmadupe any ETA, please?