biglocalnews / court-scraper

Scrapers for U.S. county court sites.
ISC License

Add Captcha service support #6

Closed: zstumgoren closed this issue 3 years ago

zstumgoren commented 4 years ago

Court sites often use Captchas. We'll need to build in support for Captcha services that can be dropped in where needed.

ryanelittle commented 4 years ago

Support for captchas was introduced in OK dev. This version supports reCAPTCHA v2, with a base library that other captcha types can build on.

ryanelittle commented 4 years ago

I have refactored the captcha-solving classes to handle multiple types of captchas. This design required its own folder. The library now supports the two most common types of captchas we have encountered: reCAPTCHA v2 and its invisible counterpart.

Each requires a different strategy to inject the solution into the page and get past the captcha. The visible (non-invisible) kind varies more in how you move past it once it is solved. For that step the developer can supply an XPath or a JavaScript snippet, or subclass the solver in a region- or platform-specific class and override the submit function entirely with a tailored, multi-line script.
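
For anyone reading along, here is a rough sketch of the submit options described above. It is not the actual court-scraper code: the class names, the `get_token` callable, the `submit_xpath`/`submit_js` parameters, and the `#agree` selector are all hypothetical. It assumes a Selenium-style driver (anything with `execute_script` and `find_element`) and that the page uses the standard `g-recaptcha-response` textarea that reCAPTCHA v2 adds to the form.

```python
class RecaptchaV2Solver:
    """Hypothetical sketch of a visible reCAPTCHA v2 solver (not the real API)."""

    # reCAPTCHA v2 reads the solution from a textarea with this id.
    INJECT_JS = (
        "document.getElementById('g-recaptcha-response').value = arguments[0];"
    )

    def __init__(self, driver, get_token, submit_xpath=None, submit_js=None):
        self.driver = driver              # Selenium-style driver (assumed)
        self.get_token = get_token        # callable returning the solved token
        self.submit_xpath = submit_xpath  # option 1: element to click
        self.submit_js = submit_js        # option 2: script to run

    def solve(self):
        # Inject the solution into the page, then move past the captcha.
        token = self.get_token()
        self.driver.execute_script(self.INJECT_JS, token)
        self.submit()

    def submit(self):
        # Prefer a supplied script; otherwise click the element at the XPath.
        if self.submit_js is not None:
            self.driver.execute_script(self.submit_js)
        elif self.submit_xpath is not None:
            self.driver.find_element("xpath", self.submit_xpath).click()
        else:
            raise ValueError("Provide submit_xpath or submit_js, or override submit()")


class ExampleCountySolver(RecaptchaV2Solver):
    """Option 3: a site-specific subclass that replaces submit() entirely."""

    def submit(self):
        # Tailored multi-line script; the selector is made up for illustration.
        script = """
        document.querySelector('#agree').checked = true;
        document.forms[0].submit();
        """
        self.driver.execute_script(script)
```

Usage would look something like `RecaptchaV2Solver(driver, get_token, submit_xpath="//button[@type='submit']").solve()`, where `get_token` wraps whichever captcha service is in use. The invisible variant is left out here because it typically needs a site-specific callback rather than a simple click.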