carykh / jumpcutter

Automatically edits vidx. Explanation here: https://www.youtube.com/watch?v=DQ8orIurGxw
MIT License
3.07k stars 544 forks

[Suggestion] Speech Recognition in place of/alongside Hand Recognition #61

Open KoolenDasheppi opened 5 years ago

KoolenDasheppi commented 5 years ago

I saw this in a YouTube comment (credits to CodeParade), so not an original idea, but maybe we could use speech recognition in place of hand gesture recognition. Or have both and have an option to use either one.

pniedzwiedzinski commented 5 years ago

I was thinking about a whistle or something like that. Maybe an airhorn 😆. It would be simpler.
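For anyone who wants to try the whistle idea, here is a minimal sketch of one way it could work. It assumes the whistle sits at a roughly fixed pitch and that the audio is already available as a list of float samples. The names `goertzel_power`, `is_whistle`, `whistle_times` and all thresholds are made up for illustration and are not part of jumpcutter. It uses the Goertzel algorithm to measure how much of each frame's energy sits in a single frequency bin.

```python
import math

def goertzel_power(samples, sample_rate, target_hz):
    """Squared magnitude of the DFT bin nearest target_hz (Goertzel algorithm)."""
    n = len(samples)
    k = round(n * target_hz / sample_rate)  # nearest DFT bin index
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2 = s_prev
        s_prev = s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2

def is_whistle(samples, sample_rate, target_hz=2000.0,
               min_ratio=0.5, min_energy=1e-6):
    """True if most of the frame's energy is concentrated near target_hz."""
    n = len(samples)
    energy = sum(x * x for x in samples)
    if n == 0 or energy / n < min_energy:
        return False  # empty or (near) silent frame
    # For a pure tone exactly on the bin this ratio is 1.0; noise gives ~0.
    ratio = 2.0 * goertzel_power(samples, sample_rate, target_hz) / (n * energy)
    return ratio >= min_ratio

def whistle_times(samples, sample_rate, frame_size=1024, **kwargs):
    """Start times (in seconds) of non-overlapping frames that look like a whistle."""
    times = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        if is_whistle(frame, sample_rate, **kwargs):
            times.append(start / sample_rate)
    return times
```

A real whistle drifts in pitch, so a practical version would probably need to check several nearby frequencies instead of one bin, and the thresholds above are guesses that would need tuning on actual recordings.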

Ethelt commented 5 years ago

I'm working on it. I'm trying out Wit.ai. Unfortunately it needs an internet connection, but the offline recognizer I tried (CMU Sphinx) was too inaccurate. Wit.ai isn't perfect, but it's good enough and, most importantly, free. Now I have to figure out "safe words" and a way to cut the video.

ozturkberkay commented 5 years ago

@Ethelt I did the same things you did, but to be honest the lack of accurate offline solutions discouraged me. Wit.ai isn't that accurate either.

Ethelt commented 5 years ago

@ozturkberkay It's better than the alternatives, so I'm trying to see what I can do with it. For now I'm working out how to detect when a given phrase was said, but I think I will have to dig deeper into Wit than simply using the SpeechRecognition module to do that. A big advantage of hand recognition is the ability to pinpoint exactly when a symbol was shown. I'm also wondering about using face recognition or body pose recognition, but I haven't looked into it yet.
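One coarse workaround for the timing problem, sketched below under assumptions: transcribe the audio in short overlapping windows and keep the windows whose transcript contains the phrase. The function names are hypothetical, and `recognize` is a placeholder for whatever turns a chunk of samples into text; it is passed in as a callable so the sketch does not depend on any particular service.

```python
def _normalize(text):
    """Lowercase and collapse whitespace so matching is word based."""
    return " ".join(text.lower().split())

def locate_phrase(samples, sample_rate, recognize, phrase,
                  window_s=2.0, hop_s=1.0):
    """Return (start_s, end_s) for each window whose transcript contains phrase.

    `recognize` is a hypothetical callable: chunk of samples -> transcript string.
    """
    window = max(1, int(window_s * sample_rate))
    hop = max(1, int(hop_s * sample_rate))
    needle = f" {_normalize(phrase)} "
    hits = []
    for start in range(0, max(1, len(samples) - window + 1), hop):
        chunk = samples[start:start + window]
        text = f" {_normalize(recognize(chunk))} "
        if needle in text:  # padded with spaces, so only whole words match
            end = min(start + window, len(samples))
            hits.append((start / sample_rate, end / sample_rate))
    return hits

def merge_ranges(ranges):
    """Merge overlapping or touching (start, end) ranges into larger ones."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

In practice `recognize` could wrap a call such as `Recognizer.recognize_wit` from the SpeechRecognition module, but that wiring is an assumption and is not shown here. The resolution is limited to the window and hop lengths, so this is still far less precise than a hand sign, and every window costs one recognition request.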