NikolaiT / se-scraper

JavaScript scraping module based on puppeteer for many different search engines...
https://scrapeulous.com/
Apache License 2.0
543 stars, 123 forks

Speech to text recognition #10

Closed: shadaxv closed this issue 5 years ago

shadaxv commented 5 years ago

For speech recognition, you can use the Google Cloud Speech-to-Text API (60 minutes of speech are free each month): https://github.com/googleapis/nodejs-speech

Alternatively, you could try implementing SpeechRecognition() inside a page evaluation: https://youtu.be/0mJC0A72Fnw
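To make the first suggestion concrete, here is a minimal sketch of transcribing an audio buffer with the nodejs-speech client. It assumes the audio is already available as 16 kHz LINEAR16 English speech (override `config` otherwise). The helper names `buildRecognizeRequest`, `joinTranscripts`, and `transcribe`, as well as the file name `audio.raw`, are made up for illustration and are not part of se-scraper or the library.

```javascript
// Build the request object passed to SpeechClient.recognize().
// The defaults (LINEAR16, 16 kHz, en-US) are assumptions; pass `config` to override them.
function buildRecognizeRequest(audioBuffer, config = {}) {
  return {
    audio: { content: audioBuffer.toString('base64') },
    config: {
      encoding: 'LINEAR16',
      sampleRateHertz: 16000,
      languageCode: 'en-US',
      ...config,
    },
  };
}

// Take the top alternative of every result and join them into one string.
function joinTranscripts(response) {
  return (response.results || [])
    .map((result) =>
      result.alternatives && result.alternatives[0]
        ? result.alternatives[0].transcript
        : ''
    )
    .filter((text) => text.length > 0)
    .join('\n');
}

// `client` is expected to be a SpeechClient instance (or anything that
// exposes a compatible recognize() method, which makes this easy to test).
async function transcribe(client, audioBuffer, config) {
  const [response] = await client.recognize(
    buildRecognizeRequest(audioBuffer, config)
  );
  return joinTranscripts(response);
}

// Possible usage (requires `npm install @google-cloud/speech` and credentials):
// const speech = require('@google-cloud/speech');
// const fs = require('fs');
// const client = new speech.SpeechClient();
// transcribe(client, fs.readFileSync('audio.raw')).then(console.log);
```

Passing the client in as a parameter keeps the helper free of hard dependencies, so it can be exercised with a stub before any cloud credentials are set up.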

NikolaiT commented 5 years ago

The biggest problem right now is obtaining the audio file from reCAPTCHA v2 in the first place.

When I merely navigate to the audio link via puppeteer, as I do now, reCAPTCHA tends to block me.

I think you need to automate the browser by clicking, dragging and navigating with the mouse, as in https://github.com/ecthros/uncaptcha2.

But it's strange. I mean, audio captchas are mainly there for visually impaired people. Maybe reCAPTCHA v2 looks for behavior linked to visual impairment and rejects the request if it does not detect it.

Any idea?