projectdiscovery / nuclei-templates

Community curated list of templates for the nuclei engine to find security vulnerabilities.
https://github.com/projectdiscovery/nuclei
MIT License
8.42k stars, 2.4k forks

Add `recommended.yaml` to run curated templates #8674

Closed: princechaddha closed this issue 6 months ago

princechaddha commented 7 months ago

We are planning to introduce a recommended.yaml file, which will contain a curated list of templates specifically chosen for their efficiency and relevance. The primary goal is to offer a streamlined and more focused scanning process by default, thereby avoiding the often exhaustive and less relevant results of a full template scan.

We are also contemplating making the recommended.yaml the standard default for all Nuclei scans. This would mark a shift from running all available templates to a more selective approach by default. As we have been adding many useful information templates, we recognize that they might not be useful for everyone and could result in an overload of scan results.

This could be a major change, and we would greatly value the community's input on it. Do you feel that making recommended.yaml the default behavior aligns with your scanning needs? We are eager to hear your thoughts, concerns, and any suggestions you might have regarding this proposed change.

Example recommended.yaml:

exclude-severity:
  - info

exclude-tags:
  - tech
  - dos
  - fuzz
  - creds-stuffing
  - token-spray

exclude-protocols:
  - ssl
  - dns
  - file
  - code
  - whois
  - headless
  - workflow

exclude-id:
  - generic-tokens
  - credentials-disclosure
  - CVE-2021-28164
  - ....
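
To illustrate how exclusion lists like these could narrow a template set, here is a minimal Python sketch. It is not Nuclei's actual filtering code: the `Template` class, the sample template entries, and the `is_recommended` helper are all hypothetical, and the elided `....` entry is left out because its contents are unknown.

```python
from dataclasses import dataclass, field


# Hypothetical, simplified view of a template's metadata. Real Nuclei
# templates carry more fields and are filtered inside the engine itself.
@dataclass
class Template:
    id: str
    severity: str
    protocol: str
    tags: set = field(default_factory=set)


# Exclusion lists copied from the example config (the elided "...." entry
# is omitted because the source does not say what it contains).
EXCLUDE = {
    "severity": {"info"},
    "tags": {"tech", "dos", "fuzz", "creds-stuffing", "token-spray"},
    "protocols": {"ssl", "dns", "file", "code", "whois", "headless", "workflow"},
    "ids": {"generic-tokens", "credentials-disclosure", "CVE-2021-28164"},
}


def is_recommended(t: Template) -> bool:
    """Return True when no exclusion rule matches the template."""
    return not (
        t.severity in EXCLUDE["severity"]
        or t.protocol in EXCLUDE["protocols"]
        or t.id in EXCLUDE["ids"]
        or t.tags & EXCLUDE["tags"]
    )


# Made-up sample templates, used only to exercise the filter.
templates = [
    Template("example-rce", "critical", "http", {"rce"}),
    Template("example-tech-detect", "info", "http", {"tech"}),
    Template("example-ssl-check", "low", "ssl", {"ssl"}),
    Template("generic-tokens", "unknown", "http", {"exposure"}),
]
kept = [t.id for t in templates if is_recommended(t)]
print(kept)
```

In this sketch a template is dropped as soon as any single rule matches, which is one plausible reading of how the four lists combine.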

olearycrew commented 7 months ago

I wonder if the way to implement this shouldn't be an additional YAML file, which could "break" existing setups, but instead a change to the default config.yaml that comes with Nuclei.

That way, when existing users update, there would be no change, but all new users would get a set of "sane defaults".

alex700 commented 7 months ago

It might fit the ignore config better: it could ignore info severity, some tags, etc. Otherwise, multiple configs will conflict with each other and confuse users because of inconsistent priority.

mastercho commented 7 months ago

This would affect my workflow, in which we scan multiple clients' sites. We already exclude unwanted templates in config.yaml anyway, so this is a big NO from me.

ehsandeep commented 7 months ago

We can start this as an optional feature and later think about making it the default, at the point when running all templates by default becomes unrealistic.

Also, we can't do this using the default config file, because a default config file is not required to run nuclei. Making this the default would make the file required, and not all systems allow writing a config file upon installation. So it has to use an additional config file, for example recommended.yaml, that can be loaded when the optional feature is used.

princechaddha commented 7 months ago

Just to add, whenever we plan to make it the default, if we do so in the near future, there will be an option to not use the default scanning and instead run a full scan without impacting the existing workflows of any user, @mastercho

mastercho commented 7 months ago

> Just to add, whenever we plan to make it the default, if we do so in the near future, there will be an option to not use the default scanning and instead run a full scan without impacting the existing workflows of any user, @mastercho

yeah that would work

mastercho commented 7 months ago

There are more important things to do before that anyway: about 95% of RCE templates give false negatives in most cases because they run cat /etc/passwd, and 90% of servers don't allow that.

princechaddha commented 7 months ago

@mastercho Thanks for pointing out the issues with RCE templates. Could you elaborate on your observation about the 95% false negatives due to cat /etc/passwd being blocked, and that 90% of servers don't allow it? Is this mainly due to WAFs or something else?

Let's discuss this in more detail over Discord DM to understand and address the issue effectively.

6mile commented 7 months ago

Is the primary intent here solely to speed up the scanning, or to focus on making the scans produce better results? Because for at least the latter, it depends on the use case of the person running Nuclei, and what the target is. I don't think we can make assumptions about what is the best or most efficient way to run Nuclei, because that really depends on why you're using Nuclei.
For example, I use Nuclei multiple times a day, for very different reasons: tech stack recon, generalized vuln scanning, targeted CVE scanning, etc. That's probably the case for many people who use Nuclei. So, I'm not sure that there is one use case that you could "recommend" in a default config.
So, let's circle back to what we are trying to deliver with this optimization: if we are trying to optimize for both speed and better output for most people, maybe a better way to deliver that is to add some conditional logic at the beginning of a scan by default. Conditional if/then/else logic would use tech detection templates or workflows at the beginning of the scan to identify things like operating system, cloud provider, and target type, and then deliver a recommended scan for that use case and target. A good example is simply whether the target is a single host or a list of hosts. If it's a single host, then Nuclei could quickly determine what the web service and OS are using tech detection templates, and then run additional templates based on that first set of detections. If it's a list of hosts, then it would run a more streamlined set of templates.
Workflows can do some of this now, but my experience is that most people don't use workflows until they are much more familiar with Nuclei. The same goes for the new code execution within templates: it's a great feature, but most people won't take advantage of it for a while.
Right now Nuclei is a great collection of YAML-based detections with no logic at the start to point the process in the right direction. Maybe the right move here is to add that logic, rather than assuming what a "default" Nuclei scan is.
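
The single-host versus host-list branching described above could be sketched roughly as follows. This is a hypothetical illustration in Python, not an existing Nuclei feature: `plan_scan`, the `FOLLOW_UP_TAGS` mapping, the `STREAMLINED_TAGS` list, and the tag names in them are all assumptions, and the tech detection results are passed in rather than collected by a real scan.

```python
# Hypothetical mapping from a detected technology to follow-up template tags.
FOLLOW_UP_TAGS = {
    "wordpress": ["wordpress", "wp-plugin"],
    "nginx": ["nginx"],
    "apache": ["apache"],
}

# Assumed "streamlined" tag set used when scanning many hosts at once.
STREAMLINED_TAGS = ["cve", "misconfig"]


def plan_scan(targets, detected_tech):
    """Choose template tags per target.

    targets: list of host URLs.
    detected_tech: dict mapping a host to the technologies that an earlier
    tech detection pass reported for it (assumed to exist already).
    """
    plan = {}
    if len(targets) == 1:
        # Single host: follow up on whatever tech detection found.
        host = targets[0]
        tags = []
        for tech in detected_tech.get(host, []):
            tags.extend(FOLLOW_UP_TAGS.get(tech, []))
        # Fall back to the streamlined set when nothing was recognized.
        plan[host] = tags or list(STREAMLINED_TAGS)
    else:
        # List of hosts: run the same streamlined set everywhere.
        for host in targets:
            plan[host] = list(STREAMLINED_TAGS)
    return plan


single = plan_scan(
    ["https://example.org"],
    {"https://example.org": ["wordpress", "nginx"]},
)
print(single)
```

The sketch only decides which tags to run; how the first detection pass is executed and how its results are parsed is left out.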

mastercho commented 7 months ago

> @mastercho Thanks for pointing out the issues with RCE templates. Could you elaborate on your observation about the 95% false negatives due to cat /etc/passwd being blocked, and that 90% of servers don't allow it? Is this mainly due to WAFs or something else?
>
> Let's discuss this in more detail over Discord DM to understand and address the issue effectively.

Only if you were answering, my friend :)

princechaddha commented 7 months ago

@6mile Thank you for your response. You made an excellent point. It's not only about optimizing the speed of the scan but also about the quality and relevance of the results generated. Unlike some other scanners, we aim to avoid bombarding users with an excess of informational results that might only be useful to a very specific subset of users.

For instance, there's an issue created here regarding the addition of OSINT templates. These templates would involve extracting information like Facebook, Gmail, phone numbers, documents, or crypto addresses related to the hosts. While such information can be incredibly valuable for OSINT purposes, it might not be relevant or useful for everyone else. In fact, for many users, this could result in an overwhelming amount of data for every host. I'm curious to hear your thoughts on this.

6mile commented 6 months ago

I've got several observations. First, as I mentioned above, the relevance of a Nuclei scan depends on context: why I'm running the scan.
For me, the context changes throughout the day: Am I doing high-level tech stack research? Am I targeting a specific website for pentesting? Am I targeting a domain in scope for a bug bounty program? Each of these comes with different characteristics: single host vs. list of hosts; tech identification; fully invasive pentesting templates that brute-force passwords, etc. Those are all very different ways to run Nuclei, so the recommendation would be different for each context.
Maybe the best thing to do is to add a label for each scan type that would map to each of those different contexts. Maybe something like this:

nuclei -label <labelname> -u https://example.org

The labels could be something like: osint, techstack, bugbounty, pentest, and fullscan. Each of those labels would include workflows, tags and templates as part of a recommendation for that particular use case. You can think about this in a similar way to how semgrep handles different types of scans. semgrep --config "p/owasp-top-ten" will do a general vulnerability scan while semgrep --config "p/r2c-ci" will put semgrep into CI mode and semgrep --config auto tells semgrep to automatically detect what rules to run. I think Nuclei needs something like that.
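
One way to picture the label idea is a thin wrapper that maps each label to a profile config file and passes it to Nuclei's `-config` option. This is only a sketch under assumptions: the `profiles/` directory, the per-label file names, the `build_command` helper, and the choice to let `fullscan` run without any profile are all hypothetical, and no `-label` flag is assumed to exist.

```python
from pathlib import PurePosixPath

# Hypothetical directory holding one profile config per label.
PROFILE_DIR = PurePosixPath("profiles")

# Labels proposed in the comment above.
LABELS = ("osint", "techstack", "bugbounty", "pentest", "fullscan")


def build_command(label, target):
    """Build a nuclei argument list for the given label and target URL."""
    if label not in LABELS:
        raise ValueError(f"unknown label: {label!r}")
    argv = ["nuclei", "-u", target]
    # Assumption: "fullscan" means no profile, so every template runs.
    if label != "fullscan":
        argv += ["-config", str(PROFILE_DIR / f"{label}.yaml")]
    return argv


print(build_command("pentest", "https://example.org"))
```

The resulting list could be handed to `subprocess.run`; keeping the mapping outside the scanner means profiles can change without touching the wrapper.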

ehsandeep commented 6 months ago

> I've got several observations: First, as I mention above the relevancy of a Nuclei scan depends on context: Why I'm running the scan. Personally, the context for me personally changes throughout the day: Am I doing high level tech stack research? Am I targeting a specific website for pentesting? Am I targeting a domain in scope for a bug bounty program? Each of these comes with different characteristics: Single host vs list of hosts; Tech identification; Fully invasive pentesting templates that brute force passwords, etc. Those are all very different ways to run Nuclei so the recommendation would be different for each context. Maybe the best thing to do is to add a label to a scan type that would map to each of those different contexts. Maybe something like this:
>
> nuclei -label <labelname> -u https://example.org
>
> The labels could be something like: osint, techstack, bugbounty, pentest, and fullscan. Each of those labels would include workflows, tags and templates as part of a recommendation for that particular use case. You can think about this in a similar way to how semgrep handles different types of scans. semgrep --config "p/owasp-top-ten" will do a general vulnerability scan while semgrep --config "p/r2c-ci" will put semgrep into CI mode and semgrep --config auto tells semgrep to automatically detect what rules to run. I think Nuclei needs something like that.

This is indeed a great point. In fact, this is already possible and supported in nuclei with the config option; we just need to create different scan profile configs, starting with recommended.yml - https://github.com/projectdiscovery/nuclei-templates/pull/8829