epi052 / feroxbuster

A fast, simple, recursive content discovery tool written in Rust.
https://epi052.github.io/feroxbuster/
MIT License
5.54k stars 467 forks

[BUG] Parallel Flag no Longer Seems to Do Multiprocessing #1159

Closed NotoriousRebel closed 2 weeks ago

NotoriousRebel commented 4 weeks ago

It's hard to pinpoint when this change happened; for example, with a debug build from back in February the issue does not exist.

Using Ubuntu 22.04.3 LTS and version 2.10.3 via the official releases, I run the tool like so: `cat urls.txt | feroxbuster --stdin --parallel 4 --time-limit 8h --threads 8 -k --json --depth 2 --timeout 12 -L 6 -w wordlist.txt -o results_dir`

urls.txt can be anything as long as it contains >= 4 URLs; let's say urls.txt is the following:

https://www.tesla.com
https://www.uber.com
https://www.walmart.com
https://www.example.org
https://www.lyft.com

Doing `ps aux | grep feroxbuster`, you'll notice there are only two feroxbuster processes, with one of them processing a URL. If you use an older version of the tool there would be 5 processes after you grepped: 1 for the original command line plus the spawned processes, in this case 4 because I passed --parallel 4.

Under the hood, are 4 URLs still being processed in parallel and it's just no longer reflected when grepping the processes? Or did something change that causes --parallel to use only 1 process even though parallel was set to 4, meaning there should be 4 processes?

epi052 commented 3 weeks ago

thanks for reporting! i'll see what i can dig up

NotoriousRebel commented 3 weeks ago

The issue seems to have been introduced in the v2.10.2 release, as it does not exist in v2.10.1. Tested using the same command and just doing `ps aux | grep ferox`.

epi052 commented 3 weeks ago

can you try 2.10.3? looks fine on my end on that version

```
Every 2.0s: ps aux | grep ferox

epi       9240  1.5  0.0 777540 25344 pts/2    Sl+  20:56   0:00 target/debug/feroxbuster --stdin --parallel 4 --time-limit 8h --threads 8 -k --json --depth 2 --timeout 12 -L 6 -o stuff
epi       9254  375  0.1 777956 37760 pts/2    Sl+  20:56   0:41 target/debug/feroxbuster --time-limit 8h --threads 8 -k --json --depth 2 --timeout 12 -L 6 -o stufffffff-1717894616.logs
epi       9256  217  0.2 777960 86300 pts/2    Sl+  20:56   0:23 target/debug/feroxbuster --time-limit 8h --threads 8 -k --json --depth 2 --timeout 12 -L 6 -o stufffffff-1717894616.logs
epi       9257 26.5  0.1 777960 36864 pts/2    Sl+  20:56   0:02 target/debug/feroxbuster --time-limit 8h --threads 8 -k --json --depth 2 --timeout 12 -L 6 -o stufffffff-1717894616.logs
epi       9309 85.0  0.1 778420 43248 pts/2    Sl+  20:56   0:08 target/debug/feroxbuster --time-limit 8h --threads 8 -k --json --depth 2 --timeout 12 -L 6 -o stufffffff-1717894616.logs
```

NotoriousRebel commented 3 weeks ago

I tried 2.10.3, both the debug build and the release build, with the commands above and still have the same issue. The debug build that I can no longer download worked with the time-limit and parallel flags; is it possible to download that again in the meantime? I didn't know GitHub Actions artifacts expire. This is on an EC2 instance:

```
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.3 LTS
Release:        22.04
Codename:       jammy
```

epi052 commented 3 weeks ago

maybe an EC2'ism? how many cores do you have assigned to the machine?

NotoriousRebel commented 3 weeks ago

Ah, you are correct. I guess somewhere between that debug build and now, something changed in how the parallel flag works, and it's now based on the total number of CPU cores. I checked the EC2 instance and it has only 1 core; I ran feroxbuster on a VM with 2 cores and there were 2 processes when using --parallel. Was it not bound to the number of cores before, and was it changed to being bound to make it more stable, since results could be missed otherwise?

epi052 commented 3 weeks ago

yea, i reworked parallel recently(ish). It slightly changed how the background processes are spawned, so that makes sense.

both the current and prior solution rely on a semaphore to limit the number of processes from progressing beyond a certain point in the code. the main change is that the new code uses async in order to stream i/o to a buffer for multi-process output to stdout. The change from blocking to async in the tokio task appears to express itself in the behavior you're seeing (and is honestly more appropriate behavior, i think).

here's the diff if you're interested: https://github.com/epi052/feroxbuster/commit/9ff0253deb80a95cfdb2419950025da59b4c84df

NotoriousRebel commented 3 weeks ago

It does seem to take a lot longer to run as a result of the change, though. Is there a trade-off between accuracy and speed? Does using async instead of blocking lead to more accurate results? I looked into how tokio handles it under the hood a bit more, and this explains it in depth. I would need to compare the older version vs the newer version to do some benchmarking, but that's something for down the road.

epi052 commented 3 weeks ago

Did you use a debug build? That'll be a lot slower due to the lack of compiler optimizations

NotoriousRebel commented 3 weeks ago

Turns out it's on me, haha. I was using a release build, but the wordlist I was using had become bloated to 398,676 words instead of the smaller wordlist I usually use. I am still curious about benchmarking accuracy vs speed with the new changes; it makes sense how Tokio handles it by default, although you can still manually configure the number of worker threads. It would be interesting to compare the older version vs the newer version on a low-core-count machine and compare performance.

epi052 commented 3 weeks ago

glad you figured it out!

i typically do some very un-scientific experiments when i change things like this. performance would have been either roughly the same or better for me to have made the swap (i don't recall exactly). i'd be happy to see any results you come up with though if you decide to play with it