epi052 / feroxbuster

A fast, simple, recursive content discovery tool written in Rust.
https://epi052.github.io/feroxbuster/
MIT License

[FEATURE REQUEST] Enable gzip to speed up #890

Closed riramar closed 1 year ago

riramar commented 1 year ago

Is your feature request related to a problem? Please describe.
There is no problem. The idea is to make feroxbuster faster by enabling compression.

Describe the solution you'd like
I tried to implement this myself, but it didn't work; I'm not familiar with Rust. The solution would be to enable this: https://docs.rs/reqwest/0.11.16/reqwest/struct.ClientBuilder.html#method.gzip

Describe alternatives you've considered
AFAIK there are no alternatives.

Additional context
I was checking the requests made by ffuf and feroxbuster and noticed that ffuf enables gzip by default.

aancw commented 1 year ago

@riramar For testing purposes, do you have a website that supports this feature? It would help a lot in making sure the implementation works as intended.

Go enables gzip by default:

https://github.com/ffuf/ffuf/issues/526#issuecomment-1058943633

riramar commented 1 year ago

@aancw you can use https://example.com.

$ curl -sI -H "Accept-Encoding: gzip" https://example.com | grep -i content-encoding
content-encoding: gzip
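
For anyone who would rather run the same check from Rust, here is a minimal sketch of what the `grep -i content-encoding` step does. The function name `advertises_gzip` and the sample header block are hypothetical, made up for illustration; nothing here is feroxbuster code.

```rust
/// Returns true if a raw header block lists gzip in its Content-Encoding header.
/// Header names are compared case-insensitively, like `grep -i` in the curl check.
fn advertises_gzip(raw_headers: &str) -> bool {
    raw_headers.lines().any(|line| match line.split_once(':') {
        Some((name, value)) => {
            name.trim().eq_ignore_ascii_case("content-encoding")
                // Content-Encoding may hold a comma-separated list of codings
                && value
                    .split(',')
                    .any(|enc| enc.trim().eq_ignore_ascii_case("gzip"))
        }
        // status line or blank line: no header name to inspect
        None => false,
    })
}

fn main() {
    // Hypothetical header block, shaped like the output of `curl -sI`
    let sample = "HTTP/2 200\r\ncontent-type: text/html\r\ncontent-encoding: gzip\r\n";
    println!("gzip advertised: {}", advertises_gzip(sample));
}
```
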
aancw commented 1 year ago

Okay. We need to make sure that implementing this compression doesn't break the tool's behavior.

epi052 commented 1 year ago

if the need is to simply add a header to your requests, you could drop a ferox-config.toml in one of the various places in which ferox will search for it.

ferox-config.toml

[headers]
Accept-Encoding = "gzip"

ref: https://epi052.github.io/feroxbuster-docs/docs/configuration/ferox-config-toml/
ref: https://twitter.com/epi052/status/1553383554247262208
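
One thing worth verifying with the header-only approach: as far as I know, reqwest only decompresses response bodies when it is built with its `gzip` feature, so a hand-set `Accept-Encoding` header may leave the client holding raw gzip bytes. A minimal sketch of a sanity check, assuming you have the body as a byte slice (`looks_gzipped` is a hypothetical helper, not part of feroxbuster or reqwest):

```rust
/// gzip streams begin with the two magic bytes 0x1f 0x8b (RFC 1952).
/// A body that still starts with them was not decompressed by the client.
fn looks_gzipped(body: &[u8]) -> bool {
    body.starts_with(&[0x1f, 0x8b])
}

fn main() {
    // Hypothetical bodies: the first four bytes of a gzip stream, and plain HTML
    let compressed: [u8; 4] = [0x1f, 0x8b, 0x08, 0x00];
    let plain = b"<html>404</html>";
    println!("compressed sample: {}", looks_gzipped(&compressed));
    println!("plain sample: {}", looks_gzipped(plain));
}
```

If bodies do arrive compressed, anything computed from the body (sizes, word counts, line counts) would describe the compressed bytes rather than the page content.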

epi052 commented 1 year ago

I'm not necessarily opposed to making the change internally, but this can be used until that happens. Also, I can turn it on for myself for a while and test things out.

aancw commented 1 year ago

Reqwest has a gzip feature that can be enabled in Cargo.toml. The implementation is very simple: add .gzip(true) to every ClientBuilder call instead of setting the HTTP header by hand. With the feature enabled and .gzip(true) set, reqwest sends the Accept-Encoding header and decompresses gzip responses automatically.

epi052 commented 1 year ago

that feature adds 4M to the final binary 🤯

aancw commented 1 year ago

Is it still worth implementing? 🫣

riramar commented 1 year ago

If you compare runs with and without compression, you'll see the time difference. It's quite difficult to determine how many servers support compression, but I think most of them do by default, which is why browsers always send the Accept-Encoding request header.

epi052 commented 1 year ago

as far as size goes, i don't think many will care when they use the x86 variants. I could see it being a potential issue for some with the arm builds.

i'm testing with/without gzip enabled using the command below (bitdiscovery has an open bugbounty program)

./feroxbuster -u https://bitdiscovery.com/ --depth 1 -w /wordlists/seclists/Discovery/Web-Content/raft-large-directories.txt

their 404 returns a gzip'd response, which is pretty ideal since over the course of 62K requests, only ~65 aren't 404.

what i'm seeing is that the gzip version isn't any faster than the 2.9.5 release binary.

My intuition is that the additional processing (decompression) on the client side is negating any speed gained by the lower amount of bytes downloaded.
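
That intuition can be put into a back-of-the-envelope model: per-request time is roughly round trip plus transfer plus client-side decompression. The sketch below is only that, a sketch; `request_time` is a hypothetical helper and every number in `main` is invented for illustration, not measured against bitdiscovery or any other host.

```rust
use std::time::Duration;

/// Rough per-request cost: network round trip + time on the wire + decompression.
fn request_time(
    rtt: Duration,
    wire_bytes: u64,
    bytes_per_sec: u64,
    decompress: Duration,
) -> Duration {
    let transfer = Duration::from_secs_f64(wire_bytes as f64 / bytes_per_sec as f64);
    rtt + transfer + decompress
}

fn main() {
    // All inputs below are assumptions chosen for illustration only.
    let rtt = Duration::from_millis(50);
    let bandwidth = 10_000_000; // bytes per second

    // A small 404 page sent as-is, with no decompression step.
    let plain = request_time(rtt, 4_000, bandwidth, Duration::ZERO);
    // The same page gzip'd: fewer bytes on the wire, plus some CPU time to inflate.
    let gzipped = request_time(rtt, 1_200, bandwidth, Duration::from_micros(100));

    println!("plain: {:?}, gzipped: {:?}", plain, gzipped);
}
```

With small bodies like these, the round trip is by far the largest term, so shaving bytes off the transfer has little room to show up in the total. Larger responses or slower links would shift the balance.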

It's probably worth looking into when/where we call the method that downloads the body and see if that work can be offloaded out of the main loop (if it's not already)
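
To make the offloading idea concrete, here is a minimal sketch using std threads and channels. It is not feroxbuster's code and says nothing about how ferox currently handles bodies; `process_body` and `process_off_loop` are hypothetical names, and a plain thread stands in for whatever async task or blocking pool the real client would use.

```rust
use std::sync::mpsc;
use std::thread;

/// Stand-in for the expensive per-body work (here: counting whitespace-separated words).
fn process_body(body: &[u8]) -> usize {
    body.split(|b| b.is_ascii_whitespace())
        .filter(|word| !word.is_empty())
        .count()
}

/// Hands each body to a worker thread over a channel, so the dispatching loop
/// only pays for a channel send instead of the processing itself.
fn process_off_loop(bodies: Vec<Vec<u8>>) -> Vec<usize> {
    let (body_tx, body_rx) = mpsc::channel::<Vec<u8>>();
    let (result_tx, result_rx) = mpsc::channel::<usize>();

    let worker = thread::spawn(move || {
        for body in body_rx {
            // A send error only means the receiver is gone; nothing left to report to.
            let _ = result_tx.send(process_body(&body));
        }
    });

    // The "main loop": sends on an unbounded channel return immediately.
    for body in bodies {
        body_tx.send(body).expect("worker thread should still be running");
    }
    drop(body_tx); // closing the channel lets the worker's loop finish

    worker.join().expect("worker thread panicked");
    // The worker has exited and dropped its sender, so this iterator terminates.
    result_rx.iter().collect()
}

fn main() {
    let bodies = vec![b"404 not found".to_vec(), b"hello".to_vec()];
    println!("{:?}", process_off_loop(bodies));
}
```

A single worker keeps results in submission order; a pool of workers would process bodies in parallel but would need to tag each result with its request.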

epi052 commented 1 year ago

here's the diff i used for the test above. just make sure to build with --release if y'all test anything on your end

diff --git a/Cargo.toml b/Cargo.toml
index f2da1f9..c3787cc 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -35,7 +35,7 @@ tokio = { version = "1.26", features = ["full"] }
 tokio-util = { version = "0.7", features = ["codec"] }
 log = "0.4"
 env_logger = "0.10"
-reqwest = { version = "0.11", features = ["socks"] }
+reqwest = { version = "0.11", features = ["socks", "gzip"] }
 # uses feature unification to add 'serde' to reqwest::Url
 url = { version = "2.2", features = ["serde"] }
 serde_regex = "1.1"
diff --git a/src/client.rs b/src/client.rs
index ee0ecff..70d4e3b 100644
--- a/src/client.rs
+++ b/src/client.rs
@@ -26,6 +26,7 @@ pub fn initialize(
         .timeout(Duration::new(timeout, 0))
         .user_agent(user_agent)
         .danger_accept_invalid_certs(insecure)
+        .gzip(true)
         .default_headers(header_map)
         .redirect(policy)
         .http1_title_case_headers();
riramar commented 1 year ago

Thanks a lot for the feedback! Please feel free to close this issue.

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.