Closed icesmartjuan closed 1 year ago
Hi @icesmartjuan,
Let me use two examples.
Let's assume that the site you want to scrape is protected by Distil Networks. With a finite amount of reverse-engineering work on the fingerprinting script, you can determine the detection surface quite precisely, because the script is written (mostly) in JavaScript. An example of such a solution can be found here.
Many websites use reCAPTCHA/hCaptcha to limit bot traffic. For example, Alibaba asks for a solution when you log into an account. Until recently, once past the captcha gate, the logged-in user session was not rate-limited on their website. It was therefore sufficient to "manually" generate a pool of session tokens (create an account and sign in with it), and then (without changing the IP) use them to scrape at a reasonable rate limit.
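The session-pool approach above can be sketched roughly as follows. This is a minimal illustration, not the actual tooling discussed here: the `fetch_page` stub, the token values, and the request interval are all hypothetical; in practice the stub would be a real HTTP call that attaches the session cookie.

```python
import time
from itertools import cycle

# Hypothetical pool of session tokens, obtained beforehand by "manually"
# solving the captcha and signing in with each account.
SESSION_TOKENS = ["token-a", "token-b", "token-c"]

# Conservative delay between requests -- the "reasonable rate limit".
REQUEST_INTERVAL = 2.0  # seconds

def fetch_page(url, token):
    """Stub for an HTTP GET that attaches the session cookie.

    In a real scraper this would be something like
    requests.get(url, cookies={"session": token}).
    """
    return f"fetched {url} as {token}"

def scrape(urls):
    """Rotate through the token pool, one request per interval, same IP."""
    pages = []
    tokens = cycle(SESSION_TOKENS)
    for url in urls:
        pages.append(fetch_page(url, next(tokens)))
        time.sleep(REQUEST_INTERVAL)
    return pages
```

The point of `cycle` is simply to spread requests evenly across the pre-authenticated sessions so no single account exceeds the site's per-session tolerance.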
Generally, such a strategy makes sense when you are targeting a limited number of sites and have the resources to reverse engineer their detection scripts. However, you have to reckon with the fact that every time the detection script changes, your solution becomes prone to detection again.
I see, much appreciated @niespodd
Hi @niespodd,
Great summary on browser-fp! Would you please share more details on the anti-anti-bot solution, "Specialized bot software that targets the unique detection surface of the target website"? Looking forward to your comments, thank you!