d3mondev / puredns

Puredns is a fast domain resolver and subdomain bruteforcing tool that can accurately filter out wildcard subdomains and DNS poisoned entries.
GNU General Public License v3.0
1.65k stars 155 forks

[Feature request] Another output formats #9

Closed blacksper closed 3 years ago

blacksper commented 3 years ago

It would be really cool if you added a JSON output format like the one in massdns. I think it wouldn't be hard to create a new option and let people set it themselves.

d3mondev commented 3 years ago

What additional information would be useful to you in the json and what would be the use case?

Edit: Thanks for the suggestion!

blacksper commented 3 years ago

I need exactly the same format that massdns uses for its JSON output, because of my software pipeline :) Basically, for my case these fields would be useful: DOMAIN, CNAME, IP.

In massdns it works like this. Let's say we have the domain a.qwerty.com with CNAME b.qwerty.com and IP 1.2.3.4. Massdns produces output like this (if I remember correctly):

{"domain":"a.qwerty.com","cname":"b.qwerty.com"}
{"domain":"b.qwerty.com","ip":"1.2.3.4"}
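For readers building a similar pipeline, here is a minimal Python sketch of how the two-record shape described in this comment could be consumed. It assumes one JSON object per line with the keys `domain`, `cname`, and `ip` exactly as recalled above; the real massdns JSON output may be structured differently, and the function names `parse_records` and `resolve` are hypothetical.

```python
import json


def parse_records(lines):
    """Build CNAME and IP lookup tables from newline-delimited JSON records.

    Assumes the record shape recalled in the comment above:
    {"domain": ..., "cname": ...} or {"domain": ..., "ip": ...}.
    """
    cnames = {}
    ips = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        rec = json.loads(line)
        domain = rec["domain"]
        if "cname" in rec:
            cnames[domain] = rec["cname"]
        if "ip" in rec:
            ips.setdefault(domain, []).append(rec["ip"])
    return cnames, ips


def resolve(domain, cnames, ips, max_hops=10):
    """Follow CNAME links from `domain`; return (chain, ip_list)."""
    chain = [domain]
    seen = {domain}
    current = domain
    for _ in range(max_hops):
        if current not in cnames:
            break
        current = cnames[current]
        if current in seen:
            break  # CNAME loop, stop following
        seen.add(current)
        chain.append(current)
    return chain, ips.get(current, [])


sample = [
    '{"domain":"a.qwerty.com","cname":"b.qwerty.com"}',
    '{"domain":"b.qwerty.com","ip":"1.2.3.4"}',
]
cnames, ips = parse_records(sample)
chain, addrs = resolve("a.qwerty.com", cnames, ips)
print(chain, addrs)
```

Keeping the CNAME and IP tables separate means a chain of any length can be followed without assuming the records arrive in order.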

FatihEgbatan commented 3 years ago

@d3mondev amazing work!

I was investigating this today, and JSON output from massdns can be handy when parsing massdns results.

d3mondev commented 3 years ago

Unfortunately it's not that simple. I did implement parsing of the JSON that massdns outputs while developing puredns 2.0, but it came with multiple downsides that made it too slow and too memory intensive to be practical.

One of the goals that I have with puredns is to support lower-end VPS (1 CPU / 1GB RAM is the target) so that it can be used in pipelines that are heavily distributed with lower-cost instances.

The JSON output is about 5 times bigger on disk than what is produced by the simple text output. That may not seem like a lot, but it makes a big difference when the VPS has limited disk space and you're working with files that contain hundreds of millions of domains. To solve this, I skipped saving the file to disk and instead parsed the content directly from the massdns stdout. Unfortunately I hit the next roadblock: parsing the JSON is much more expensive on the CPU, and it was affecting the throughput of puredns too much. It was a clear regression from puredns v1.0, despite spending a lot of time implementing a custom JSON parser and optimizing it.
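To illustrate the "parse from stdout instead of saving to disk" approach, here is a rough Python sketch of a line-by-line reader for simple text output. This is not puredns code. It assumes each line has the form `name. TYPE data` separated by whitespace, and the function name `iter_simple_records` is hypothetical. Because it is a generator, it holds only one line in memory at a time.

```python
import io


def iter_simple_records(stream):
    """Yield (name, record_type, data) tuples from a text stream.

    Assumes whitespace-separated lines such as:
        a.qwerty.com. CNAME b.qwerty.com.
        b.qwerty.com. A 1.2.3.4
    Lines that do not have three fields are skipped.
    """
    for line in stream:
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue
        name, rtype, data = parts
        # Drop the trailing dot of fully qualified names.
        yield name.rstrip("."), rtype, data.strip().rstrip(".")


# Stand-in for a resolver's stdout; in a pipeline this could be sys.stdin
# or the stdout of a subprocess opened in text mode.
fake_stdout = io.StringIO(
    "a.qwerty.com. CNAME b.qwerty.com.\n"
    "b.qwerty.com. A 1.2.3.4\n"
    "\n"
)
records = list(iter_simple_records(fake_stdout))
print(records)
```

Splitting on whitespace avoids the per-record allocation and tokenizing work of a JSON parser, which matches the trade-off described above, though this sketch makes no claim about the actual size or speed difference.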

In the end, my goal is to remove the dependency on massdns altogether and perform DNS resolving internally. As such, I stopped looking into ways to optimize the interactions between puredns and massdns, as parsing the simple text output is good enough. Because of this, it is very unlikely that I will add new output formats until massdns is removed from puredns (maybe in v3.0, if it ever gets there), at which point I could take more liberties in saving extra data in JSON or other formats.

I'm closing this suggestion for now as it's not on the roadmap at the moment. Thank you for your suggestion!