maurosoria / dirsearch

Web path scanner

Get all files by extension no matter what filename it has #1089

Open redsigma opened 2 years ago

redsigma commented 2 years ago

What is the feature?

I want to make a wordlist that lets me search for all files with a specific extension. I don't know the names of these files.

I don't know if I am doing something wrong with my wordlist, but I checked the man page and the issues on this repository and could not find any meaningful help.

My wordlist looks like this:

.%EXT%
.*.%EXT%
[a-z].%EXT%

For some reason it successfully finds a file called .html, but if the file is called test.html it does not find it. From what I have tried, I assume regex is not supported.

I am using the Docker image on Windows with the following batch script:

docker run -it --rm --name dirsearch ^
  --mount type=bind,source="%path_wordlist%",target=/root/db/dicc.txt ^
  --mount type=bind,source="%cd%/reports/",target="/root/reports/" ^
  --mount type=bind,source="%cd%/logs/",target="/root/logs/" ^
  "dirsearch:v0.4.2" ^
    --threads %thread_number% ^
    --max-rate %total_requests_in_1_sec% ^
    --delay %delay_between_request_in_sec% ^
    --output %path_report% --format simple ^
    --random-agent ^
    -e pdf,html ^
    -u "%URL%"

What is the use case?

I am under the impression that this program only searches for specific names, so a feature that retrieves files just by their extension would be useful for quick testing.

shelld3v commented 2 years ago

Hi, I don't understand what you are trying to say here. Can you explain it more clearly, including your current situation and what you expect to get?

redsigma commented 2 years ago

I have a directory with 2 files.

.html
test.html

I want to make a wordlist which shows me those 2 files. I only know the extensions of the files and not their names.

shelld3v commented 2 years ago

So for example you have a wordlist like this:

.html
a.html
b.jsp
c.php

And you want to get only the paths that have the .html extension (in this case .html and a.html)? Why don't you use regex? Something like [.]html$
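
To make the suggestion concrete: the regex here is meant to filter the wordlist itself before it is handed to dirsearch, not to be placed inside the wordlist. A minimal sketch in Python, where the helper name filter_wordlist is hypothetical and the sample entries are the ones from the example above:

```python
import re


def filter_wordlist(entries, pattern):
    """Keep only the wordlist entries that match the given regex.

    Hypothetical helper for illustration; it is not part of dirsearch.
    Matching is case-insensitive so that e.g. 'Index.HTML' is kept too.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    return [entry for entry in entries if regex.search(entry)]


wordlist = [".html", "a.html", "b.jsp", "c.php"]
html_only = filter_wordlist(wordlist, r"[.]html$")
print(html_only)  # ['.html', 'a.html']
```

The filtered list would then be saved as the dictionary file. This only narrows an existing list of names; it cannot discover names that are not already in the list.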

redsigma commented 2 years ago

Yes, I want to get the .html and a.html files. I have a page with multiple .html and .pdf files, and also subfolders containing such files. However, not all subfolders have these files, so instead of manually checking each folder and subfolder I am using this tool.

I emptied my dictionary file and added only the following:

[.].pdf$
[.].html$
[.].%EXT%$

However, there is no output. The log file is empty, and there is no report.html file.

shelld3v commented 2 years ago

I think it should be:

[.]pdf$
[.]html$
[.]%EXT%$

redsigma commented 2 years ago

Ah, my bad. I made sure to copy and paste it this time, but the output is still the same as before.

I am not sure if regex works, or maybe it's a problem with case sensitivity. The filenames mix uppercase and lowercase characters and sometimes contain characters such as - or _.

EDIT: I don't think I pointed this out, but the filenames have more than 1 character.

shelld3v commented 2 years ago

What bash command (or any other method you tried) did you use to filter?

redsigma commented 2 years ago

As stated in the description, I ran the following Docker image (which I built locally using the provided Dockerfile):

  "dirsearch:v0.4.2"
    --threads %thread_number% 
    --max-rate %total_requests_in_1_sec% 
    --delay %delay_between_request_in_sec% 
    --output %path_report% --format simple 
    --random-agent 
    -e pdf,html
    -u "%URL%"

I am running this from a Windows machine, but I don't think this matters. The dictionary file is mounted from my host machine, and I checked that it works by adding a hardcoded filename to it.

The dictionary file I used to try to get all .html and .pdf files is the following (but it doesn't work):

[.]pdf$
[.]html$
[.]%EXT%$

shelld3v commented 2 years ago

That should not be what is in the dictionary; dirsearch doesn't support wordlists with regex inside them (maybe something similar in the future?).

Sorry for the late reply anyway!
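
To illustrate why the entries tried in this thread behave the way they do, here is a simplified model of wordlist expansion. It is a sketch under the assumption that each entry is requested as a literal path, with only the %EXT% keyword replaced by each extension passed via -e. It is not dirsearch's actual code, and both function names (expand_entries, build_templates) are hypothetical.

```python
def expand_entries(entries, extensions):
    """Replace the %EXT% keyword with each extension.

    Every other character, including regex metacharacters such as
    '*', '[' or '$', is left untouched and ends up in the path.
    """
    paths = []
    for entry in entries:
        if "%EXT%" in entry:
            paths.extend(entry.replace("%EXT%", ext) for ext in extensions)
        else:
            paths.append(entry)
    return paths


def build_templates(base_names):
    """Turn guessed base names into templates such as 'test.%EXT%'."""
    return [f"{name}.%EXT%" for name in base_names]


# Entries from this thread: only '.%EXT%' yields paths that can exist as-is.
literal = expand_entries([".%EXT%", ".*.%EXT%", "[.]%EXT%$"], ["pdf", "html"])
print(literal)
# ['.pdf', '.html', '.*.pdf', '.*.html', '[.]pdf$', '[.]html$']

# Guessed names (made up for this example) combined with the extensions.
candidates = expand_entries(build_templates(["test", "index"]), ["pdf", "html"])
print(candidates)
# ['test.pdf', 'test.html', 'index.pdf', 'index.html']
```

Under this model, .%EXT% matches the file named .html, while test.html is never requested because no entry expands to that exact name. Finding files by extension alone therefore still requires a list of candidate base names.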