Closed ocervell closed 1 year ago
Hi @ocervell! Thanks for your interest in the project.
Of course I can add JSON output to cariddi, it's not rocket science and I think it can be useful too.
Unfortunately, I'm really busy right now with my thesis, work and other projects, so I don't know when I will be able to push this new feature. Anyway, thanks for pointing this out.
Anyone reading this: if you want to develop this feature, just create a PR.
Have a nice day.
Is this issue still open? If yes, I'd like to work on it.
yes @Tezas-6174 it's open :)
I suggest continuing the work started in https://github.com/edoardottt/cariddi/pull/106. Maybe it would be better to add the flag -oj instead of -json, just to be consistent with the previous output methods:
-oh string
Write the output into an HTML file.
-ot string
Write the output into a TXT file.
The flag should take a string as input and write results the way -oh and -ot do (separate files for results, secrets, etc.).
Ask me if you have any doubts.
Alright, I'll start working on it and let you know if I get stuck somewhere.
I had a doubt:
In the New() function, the function visitHTMLLink() was not called correctly: 11 arguments are passed, although it takes only four (the link, event, HTML element, and the collector). I changed it to 4 arguments, but I need to figure out what the event is.
-json to -oj.
Have you forked the main or the dev branch? I think dev is some commits ahead; you should use that branch for new features.
No, actually, the devel branch is three commits behind main. I cloned this repo and pulled the changes in the unmerged pull request #106.
OK, got it. The problem is that that development branch has conflicts that must be resolved (the PR was frozen for a long time, and the devel branch got updates in the meantime). It's better if you clone the devel branch and apply the changes on your own.
Sorry for the delay on this PR, didn't have much time on my hands and the linter wouldn't work locally... I can take over it if you want, the conflicts with main shouldn't be very hard to fix.
Sorry Olivier, but time passed and Tezas wanted to work on this. You can try to fix the conflicts and realign the changes with the requests in the previous messages. However, in my opinion it's better to avoid "competition" on the same issue and to distribute the workload across more devs :)
There is a lot of work to be done in this repo, and I suggest choosing another issue or creating new ones (even more than one! a lot of changes can be added). I will be very happy to help you contribute here.
Sure, no worries.
@Tezas-6174 please note that although -oJ might be a good option for real JSON file output, imo it's a separate feature from this PR, as it should be distinguished from JSON lines. For example, cariddi -oJ output.json would write a formatted JSON file, but cariddi -json could output JSON lines (in real time) so that the output is consumable in real time by e.g. jq, similar to how tools such as httpx or subfinder behave.
The two options are not mutually exclusive either, and we should be able to write cariddi -oJ output.json -json, which would emit JSON lines on the console and save the output to a file (or files) at the end of the run.
You know? You're right. However, there are some constraints/problems with this implementation.
-oj and -json options.
Regarding point 1, maybe it's possible, but there could be duplicates in the results.
cc @Tezas-6174
I respectfully disagree ;)
For 1., correct me if I'm wrong, but we actually hunt errors, secrets and infos in the c.OnResponse function, so they are accessible on a per-response basis and thus outputable (is this even a valid word?) in real time! This branch has the implementation (and is up to date with devel, in case @Tezas-6174 wants to pull from it).
For 2., imo JSON and JSON lines are two different output formats. That said, you could implement both with the same flag, such as when not passing anything to the -oJ flag it would output JSON lines, and when passing a file path it would save results in actual files instead.
I find JSON lines to be a great feature, since being able to parse results in real time to take further actions (pass them to other tools like gf or jq, or save results to a NoSQL db) is a time-saver for sure. If you look at similar tools (ffuf, gobuster, katana), they all have a way to output in real time.
I forgot to mention the possible duplicates we could get: I didn't see any in the tests I ran, but if there were any, maybe we could keep a cache of visited URLs and prevent re-crawling them (that's actually probably another feature).
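A visited-URL cache of that kind could be sketched in Go as a small mutex-protected set. The `Visited` type and `FirstVisit` method are hypothetical names for illustration; this is not how cariddi or its crawling library tracks visits, and the sketch compares URLs as exact strings with no normalization.

```go
package main

import (
	"fmt"
	"sync"
)

// Visited is a concurrency-safe set of URLs that have already been seen.
type Visited struct {
	mu   sync.Mutex
	urls map[string]struct{}
}

// NewVisited returns an empty set ready for use.
func NewVisited() *Visited {
	return &Visited{urls: make(map[string]struct{})}
}

// FirstVisit records url and reports whether it had not been seen before.
func (v *Visited) FirstVisit(url string) bool {
	v.mu.Lock()
	defer v.mu.Unlock()
	if _, ok := v.urls[url]; ok {
		return false
	}
	v.urls[url] = struct{}{}
	return true
}

func main() {
	v := NewVisited()
	queue := []string{
		"https://example.com/",
		"https://example.com/a",
		"https://example.com/", // duplicate, skipped below
	}
	for _, u := range queue {
		if v.FirstVisit(u) {
			fmt.Println("crawl:", u)
		}
	}
}
```

The same check could guard result emission instead of crawling, so that a match found twice is only printed once.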
My fault, you are correct on both points ahahaha.
-json it's correct to print it whenever I see that.
-oj and -json surely we need tests. @ocervell you can start working on that; if you have any issues, ping me and I'll be very happy to discuss changes or anything else.
No problem! I have updated my PR to match devel and pass the linting tests.
@Tezas-6174, since this PR is pretty much done, maybe open a new PR for the -oj flag to save output to JSON files?
Or work on improving it; there might be improvements that could be made that I've not yet seen.
Would be great if the tool could take a -json option to output JSON like similar tools (katana, gospider, gau). It could output JSON Lines that way and have a 'type' key for secrets, regex matches, etc., instead of putting everything in a folder.
Example output:
Thoughts ?
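As a rough illustration of the 'type' key idea, the Go sketch below encodes records of different kinds as JSON lines. The field names (`type`, `url`, `match`) and the type values are assumptions made for this example, not cariddi's actual output schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Line is a hypothetical JSON Lines record; the schema is illustrative only.
type Line struct {
	Type  string `json:"type"`            // e.g. "url" or "secret" (assumed values)
	URL   string `json:"url"`             // page where the item was found
	Match string `json:"match,omitempty"` // matched text, omitted for plain URLs
}

// encodeLine renders one record as a single-line JSON string.
func encodeLine(l Line) (string, error) {
	b, err := json.Marshal(l)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	records := []Line{
		{Type: "url", URL: "https://example.com/"},
		{Type: "secret", URL: "https://example.com/app.js", Match: "placeholder-token"},
	}
	for _, r := range records {
		s, err := encodeLine(r)
		if err != nil {
			fmt.Println("encode error:", err)
			continue
		}
		fmt.Println(s)
	}
}
```

With a single stream like this, a consumer can filter on the `type` field rather than reading separate files per category.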