EthanYidong closed 3 years ago
Hey, thank you for the suggestion!
I have not decided yet whether to include content about crawling SPAs, because tools that execute JavaScript server-side have (as far as I know) a large attack surface.
That being said, I will certainly evaluate it.
Actually, WebDriver is a technology that allows you to programmatically interact with browsers (Firefox, Chromium, and Safari all support it). Therefore, you get the same sandboxing as a normal browser.
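To make the "programmatically interact with browsers" point concrete, here is a minimal sketch of the HTTP requests a WebDriver client sends to a driver such as geckodriver or chromedriver. The endpoints follow the W3C WebDriver specification, but the `WebDriverRequest` struct and the helper function names are hypothetical, invented for illustration. They are not the API of thirtyfour or any other crate, and the sketch only builds requests without sending them.

```rust
/// One HTTP request of the W3C WebDriver protocol.
/// (Hypothetical type for illustration, not taken from any crate.)
#[derive(Debug, PartialEq)]
struct WebDriverRequest {
    method: &'static str,
    path: String,
    body: Option<String>,
}

/// Minimal escaping so a string can be embedded in a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// New Session: asks the driver to start a browser with default capabilities.
fn new_session() -> WebDriverRequest {
    WebDriverRequest {
        method: "POST",
        path: "/session".to_string(),
        body: Some(r#"{"capabilities":{"alwaysMatch":{}}}"#.to_string()),
    }
}

/// Navigate To: loads `url` in the browser; the page's JavaScript runs
/// inside the browser's own sandbox, not in the crawler process.
fn navigate_to(session_id: &str, url: &str) -> WebDriverRequest {
    WebDriverRequest {
        method: "POST",
        path: format!("/session/{}/url", session_id),
        body: Some(format!(r#"{{"url":"{}"}}"#, json_escape(url))),
    }
}

/// Get Page Source: returns the DOM serialized after scripts have run.
fn page_source(session_id: &str) -> WebDriverRequest {
    WebDriverRequest {
        method: "GET",
        path: format!("/session/{}/source", session_id),
        body: None,
    }
}

/// Delete Session: closes the browser and frees the session.
fn delete_session(session_id: &str) -> WebDriverRequest {
    WebDriverRequest {
        method: "DELETE",
        path: format!("/session/{}", session_id),
        body: None,
    }
}
```

A crawler would issue these in order: `new_session()`, then `navigate_to(id, url)` and `page_source(id)` for each page, then `delete_session(id)`, where `id` is the session ID returned by the driver. In practice a client library handles this exchange for you.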
Hi Sylvain! What is the general idea of web crawling in your book? What type of data is expected to be found or harvested?
Hi @svirmi, the scope is not 100% fixed yet, so what follows is speculation. Web crawling belongs to the 'reconnaissance' part, so we can expect to build crawlers for leaked files, private information, and other OSINT material.
Sylvain, Google does this very well, I mean finding files and data. What do you think of port scanning and/or device detection tools, like Shodan?
I've opened a new issue to discuss this specific topic :) https://github.com/skerkour/black-hat-rust/issues/6
I've finally opted for https://github.com/jonhoo/fantoccini as the headless driver because I found its documentation to be a little bit better for this use case :)
Chapter 5 is about web crawling, right? Well, one library I've found to be really helpful for that is thirtyfour, a Selenium/WebDriver library for Rust. WebDriver is a great technique for scraping websites that are SPAs or other apps that load content with JavaScript. Just thought I'd share.