skerkour / black-hat-rust

Applied offensive security with Rust - https://kerkour.com/black-hat-rust
MIT License

Suggestions for possible library to include in chapter 5 #3

Closed EthanYidong closed 3 years ago

EthanYidong commented 3 years ago

Chapter 5 is about web crawling, right? Well, one library I've found to be really helpful for that is thirtyfour, a Selenium/WebDriver library for Rust. WebDriver is a great technique for scraping websites that are SPAs or other apps that load content with JavaScript. Just thought I'd share.

sylvain101010 commented 3 years ago

Hey, Thank you for the suggestion!

I have not decided yet whether I should include content about crawling SPAs, as tools that execute JavaScript server-side have (as far as I know) a large attack surface.

That being said, I will certainly evaluate it.

EthanYidong commented 3 years ago

Actually, WebDriver is a technology that allows you to programmatically interact with browsers (Firefox, Chromium, and Safari all support it). Therefore, you get the same sandboxing as a normal browser.
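
To make that a bit more concrete: under the hood, WebDriver is an HTTP/JSON protocol (standardized by the W3C) that a client speaks to a driver process such as geckodriver or chromedriver, which in turn controls a real browser. Below is a rough, std-only sketch of the requests a crawler would send to load a page and read the rendered HTML. The `WebDriverRequest` struct and the helper functions are hypothetical names made up for illustration; they are not the API of thirtyfour or any other crate, and actually sending the requests (and parsing the session id from the response) is left out.

```rust
// Hypothetical helpers for illustration only; not part of any WebDriver crate.

/// One HTTP request to a WebDriver server (e.g. geckodriver or chromedriver).
#[derive(Debug, PartialEq)]
pub struct WebDriverRequest {
    pub method: &'static str,
    pub path: String,
    pub body: Option<String>,
}

/// Escape a string so it can be embedded inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            // Control characters must be written as \uXXXX escapes in JSON.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// "New Session": ask the driver to start a browser.
pub fn new_session(browser_name: &str) -> WebDriverRequest {
    WebDriverRequest {
        method: "POST",
        path: "/session".to_string(),
        body: Some(format!(
            r#"{{"capabilities":{{"alwaysMatch":{{"browserName":"{}"}}}}}}"#,
            json_escape(browser_name)
        )),
    }
}

/// "Navigate To": load a URL in the browser; JavaScript runs as usual.
pub fn navigate_to(session_id: &str, url: &str) -> WebDriverRequest {
    WebDriverRequest {
        method: "POST",
        path: format!("/session/{}/url", session_id),
        body: Some(format!(r#"{{"url":"{}"}}"#, json_escape(url))),
    }
}

/// "Get Page Source": fetch the serialized DOM of the current page.
pub fn page_source(session_id: &str) -> WebDriverRequest {
    WebDriverRequest {
        method: "GET",
        path: format!("/session/{}/source", session_id),
        body: None,
    }
}

/// "Delete Session": close the browser when the crawl is done.
pub fn delete_session(session_id: &str) -> WebDriverRequest {
    WebDriverRequest {
        method: "DELETE",
        path: format!("/session/{}", session_id),
        body: None,
    }
}
```

A crawler would send `new_session("firefox")` first, read the session id from the driver's response, then call `navigate_to`, `page_source`, and finally `delete_session` with that id. Libraries like thirtyfour wrap this exchange behind an async client, so in practice you would not build these requests by hand.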

svirmi commented 3 years ago

Hi Sylvain! What is the general idea of web crawling in your book? What type of data is expected to be found or harvested?

sylvain101010 commented 3 years ago

Hi @svirmi, the scope is not 100% fixed yet, so what follows is speculation. The web crawling part is in the 'reconnaissance' part, so we can expect to build crawlers for leaked files, private information, and other OSINT materials.

svirmi commented 3 years ago

Sylvain, Google does that very well, I mean finding files and data. What do you think of a port scanning and/or device detection tool, like Shodan?

sylvain101010 commented 3 years ago

I've opened a new issue to discuss this specific topic :) https://github.com/skerkour/black-hat-rust/issues/6

sylvain101010 commented 3 years ago

I've finally opted for https://github.com/jonhoo/fantoccini as the headless driver because I found the documentation to be a little bit better for this use case :)