antiboredom / videogrep

automatic video supercuts with python
https://antiboredom.github.io/videogrep

Unexpected behaviour with "fragment" option? #119

smithee77 closed this issue 1 year ago

smithee77 commented 1 year ago

Hi, first of all, many thanks for this amazing tool. I'm working with --transcribe, using VOSK. Once transcription is done, if I run, for example:

videogrep --input shell.mp4 --search-type fragment --search 'and'

I get video chunks with instances of "and", but also those with "sand", "wand", "random"... every word CONTAINING 'and'. Is this the expected behaviour? I just want the exact "AND" chunks.

Thanks again

smithee77 commented 1 year ago

Sorry, I hadn't read https://lav.io/notes/videogrep-tutorial/. To match an exact word, use the pattern '^word_to_find$'.

antiboredom commented 1 year ago

Hi - glad you like it!

Yes that actually is the expected behavior (and I should probably make this more clear in the documentation). Searching uses Python regular expressions, so you'll get back anything that contains the characters you've entered. If you want to get an exact word match, you can try something like this:

videogrep --input shell.mp4 --search-type fragment --search '^and$'

The ^ character means "beginning of string" and $ means "end of string", so wrapping a search term between ^ and $ matches only that exact word.
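For anyone who wants to see the difference outside of videogrep, here is a minimal sketch using Python's re module directly. It assumes each transcribed word is tested on its own with re.search, which is what the behaviour described in this thread suggests; the word list and the matching_words helper are made up for illustration and are not part of videogrep.

```python
import re

# Hypothetical transcript words, for illustration only.
words = ["and", "sand", "wand", "random", "land"]


def matching_words(pattern, candidates):
    """Return the candidates in which re.search finds the pattern."""
    return [w for w in candidates if re.search(pattern, w)]


# An unanchored pattern matches anywhere inside a word.
substring_hits = matching_words("and", words)

# Anchoring with ^ and $ requires the whole word to be "and".
exact_hits = matching_words("^and$", words)

print(substring_hits)
print(exact_hits)
```

With this word list, the unanchored pattern matches every word, while the anchored one matches only "and".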

Let me know if you have any more questions!

antiboredom commented 1 year ago

@smithee77 glad you figured it out! And if you have suggestions for making this more clear to future confused users please let me know :)

smithee77 commented 1 year ago

Many thanks for your reply!! I found the answer about one second before you posted! :)) Yeah, just as a suggestion, maybe the default behaviour should be an EXACT match? That is probably what most users want from the tool... Amazing work!