Open dessant opened 5 years ago
Hey @dessant - we saw buster go live - looks like a very cool extension! When we published the first use of this method in 2017, we tested which services were best to ensemble (see the paper at http://uncaptcha.cs.umd.edu), and Google Cloud came out on top. We haven't yet repeated that analysis for the new captcha style, but we've found both Google Cloud and Bing, at least, to be very good for unCaptcha2.
@ecthros, are you aware of any services that offer speech APIs without account signups? In the past we've played with Wit.ai and sphinx, but found these to be less accurate compared to the ones requiring accounts.
Thanks for reaching out!
Great research @Kkevsterrr, and it's awesome that the audio samples have been preserved. I've just made a couple of tests with your audio clips and it's fascinating how much general-purpose speech recognition services have evolved in the past year.
For example, take task23.mp3 from part1.zip and upload it to this demo: https://cloud.google.com/speech-to-text/. While their default model gives mediocre results, the new video model is near perfect, and the phonetic mapping you have employed in unCaptcha would take care of the rest. These samples are also considerably more difficult to understand than the ones they currently use.
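To illustrate what a phonetic mapping step might look like for digit clips, here is a minimal sketch. The homophone table `DIGIT_SOUNDS` and the function `transcript_to_digits` are hypothetical names made up for this example; they are not taken from the unCaptcha code, and the real mapping is likely larger.

```python
import re

# Hypothetical homophone table (an assumption for illustration only):
# words a general-purpose recognizer might return for a spoken digit.
DIGIT_SOUNDS = {
    "zero": "0", "oh": "0",
    "one": "1", "won": "1",
    "two": "2", "to": "2", "too": "2",
    "three": "3", "tree": "3",
    "four": "4", "for": "4", "fore": "4",
    "five": "5",
    "six": "6",
    "seven": "7",
    "eight": "8", "ate": "8",
    "nine": "9",
}

def transcript_to_digits(transcript):
    """Map each word or digit in a transcript to a digit character.

    Words that are not in the table are silently dropped.
    """
    tokens = re.findall(r"[a-z]+|\d", transcript.lower())
    return "".join(t if t.isdigit() else DIGIT_SOUNDS.get(t, "") for t in tokens)

print(transcript_to_digits("won too tree for 5"))  # prints 12345
```

Dropping unknown words keeps the sketch short; a more careful version would probably flag them so a low-quality transcript can be rejected.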
I think they've concluded that further distortion is not feasible without running into accessibility issues, and switched to easier challenges, while giving more weight to user interactions and other signals available in the browser.
Hi @dessant! Thanks for letting us know about your extension. We're publishing this on GitHub partly because of our responsible disclosure timeline, and we wanted people to know that it's possible to carry out an attack so easily.
You might have better luck trying the Sphinx project (https://cmusphinx.github.io/), although I've always found Microsoft's and Google's Speech to Text to be more effective against ReCaptcha. Something that's interesting though is you might not need a Speech to Text engine at all.
I've run unCaptcha2 thousands of times, and over time I noticed a few repeat captchas. I saved them all and found that many were repeated, some as many as 13 times. I'd estimate that the number of captchas ReCaptcha can actually choose from is only around 500-1000, which makes it very easy for someone to attack. If you're interested, I can send the captchas along to you - I was hoping to keep them off this repository until Google fixes the system. Shoot me an email at ghughey@umd.edu if you'd like it.
Thanks again!
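A pool-size guess like the one above can be sanity-checked from repeat counts alone. The sketch below uses the bias-corrected Chao1 estimator, a standard lower-bound estimate for the number of distinct items in a population; it is not necessarily how the 500-1000 figure was reached. It assumes every draw is uniform from one fixed pool and that identical clips are byte-identical, so a hash works as an identifier. `clip_id` and `chao1_pool_size` are hypothetical helper names.

```python
import hashlib
from collections import Counter

def clip_id(audio_bytes):
    """Identify a clip by the SHA-256 of its raw bytes (assumes exact repeats)."""
    return hashlib.sha256(audio_bytes).hexdigest()

def chao1_pool_size(sample_ids):
    """Bias-corrected Chao1 estimate of the number of distinct items in the pool."""
    counts = Counter(sample_ids)              # times each distinct clip was seen
    freq_of_freq = Counter(counts.values())   # number of clips seen exactly k times
    observed = len(counts)
    f1 = freq_of_freq.get(1, 0)               # clips seen exactly once
    f2 = freq_of_freq.get(2, 0)               # clips seen exactly twice
    return observed + f1 * (f1 - 1) / (2 * (f2 + 1))

# Toy data: 5 distinct clips observed, 3 seen once and 2 seen twice.
ids = [clip_id(b) for b in (b"a", b"a", b"b", b"c", b"c", b"d", b"e")]
print(chao1_pool_size(ids))  # prints 6.0
```

If the samples are rotated in sets or vary by IP address, the uniform-draw assumption breaks and the estimate would only describe the set that was active while collecting.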
@ecthros, I'll look into how viable Sphinx is, thanks for bringing it to my attention!
I have also noticed repeating audio samples, but it would be surprising if the audio challenge set is that small. It's possible that there are multiple sets that are rotated as they receive enough training data from human input. Though this is just a guess, and I have not made any methodical tests.
Thanks for the offer, I look forward to exploring the samples you have collected, though I can wait until they become public.
No worries! I also wouldn't be surprised if they're rotating samples - possibly based on factors other than time, such as IP address.
I made a browser extension last month that makes use of the same method.
https://github.com/dessant/buster
Buster is an accessibility tool that helps people solve difficult captchas. It works quite well, and you can select between several speech recognition services.
What was your experience regarding the quality of these services? I've found the video model of the Google Cloud Speech API to be the most accurate. Do you know of any speech recognition services that have a free API like Wit.ai? I'd like to have more choices besides Google and Wit.ai, without requiring users to sign up for different services themselves.