nccgroup / sobelow

Security-focused static analysis for the Phoenix Framework
Apache License 2.0
1.66k stars 92 forks source link

Add findings guide #136

Closed realcorvus closed 1 year ago

realcorvus commented 1 year ago

This PR adds documentation to individual Sobelow findings. No code changes at all, just the documentation. I wrote all this for integration with the Paraxial.io Application Secure product, and hope it's useful to the community as well.

houllette commented 1 year ago

I have some other mid/high-level thoughts about this, working on a response now - standby! 🙈

houllette commented 1 year ago

Thank you for the contribution, @realcorvus! I have some thoughts around the content that I will outline below.

TL;DR

Ultimately I think what may be best with this PR is instead of merging it; take chunks of it, distilling it down and merging it more tactically into the existing moduledocs for the respective findings - that way more information for the findings themselves are broken out of this singular place and instead become more tightly integrated with the code / finding that they belong with.

I already have a PR in progress to add all the missing moduledocs to the findings (and more) - so you are welcome to add to that or wait for the boilerplate to get merged in at some point and work off your own fork.

Taking this approach also has the added benefit of being a friendly reminder to keep triage instructions up-to-date if the underlying detection mechanism changes due to improvements in Sobelow.

High-Level Quandaries

Below are some more thoughts / explanations to how I arrived at the TL;DR above.

Prescriptiveness

The very nature of this Findings Guide is purposefully very prescriptive / opinionated and I'm unsure if it's a stance the Sobelow tool should take.

I think that while this tool is capable of being a vessel for education, its primary use case is for detecting and preventing vulnerabilities. By taking on as much specificity for triaging findings as this Finding Guide attempts to do, I worry that we are introducing too much nuance in terms of what the tool is responsible for.

What happens when someone following the guide is still vulnerable? Is it our responsibility to make the guide thorough enough to cover all edge cases?

With this mindset, I turn to what other similar tools do when it comes to prescribing triage. In the case of Semgrep, those findings only go so far as to link to relevant community-driven resources and sometimes provide an autofix suggestion. e.g. python rule

Out-of-date Content

The ultimate goal with future development of Sobelow is to increase its ability to confidently detect vulnerabilities in Elixir projects (more on that later), as such it will change a lot under the hood and I find this guide almost so thorough that it would become out of date very quickly as the underlying mechanisms change or new features are added.

I think we can combat this by being less prescriptive (see above) and more concise in the moduledocs with fixes to True Positives for findings.

Conflict of Use-Case

Something to keep in mind is developer usage of Sobelow, either it's run locally (mix task / standalone) or run in CI (mix task / GitHub Action).

When a user gets a detection, do they then have to know to go to the HexDocs / this findings guide? Seems like a better approach would be to integrate this with what is available already via moduledocs and capitalize on the -d CLI argument to get more info on a specific finding right there in the terminal or in the logged output of the CI job.

Future of Sobelow

I've been very slow to write out the planned roadmap for Sobelow which will paint this picture better (I will come back to this response later and edit in a link to it once it's up).

In its current form, Sobelow is pretty rudimentary in how it detects insecurity. So when I think about some of the core functionality I could see being introduced into Sobelow or an accompanying tool to address this weakness, how would it change the underlying usage of Sobelow and/or the user's consumption of resources like this Findings Guide.

For example here are just three of the initiatives I'd like to see Sobelow support soon. Think about how if they were present today, how would it change the content this Findings Guide?:

  1. Taint Analysis - being able to track if user supplied (bad) data is finding its way into known bad patterns
  2. Autofixing Code - we know what the bad patterns are and generally what the fix is / where to apply it, why not allow for a CLI flag to make the changes if possible?
  3. Improved UI/UX - new ways to run and interface with findings for Sobelow; LiveDashboard, a dedicated local liveview app, etc.

So know the question may become, "Alright, that's great for the future - but wouldn't it be better to just have something today?" which is fair - I'm just saying that much of the messaging in this Finding Guide would suddenly conflict and/or be inaccurate if we were to take those features into account and if the future of the project includes the aforementioned features: do we really want to introduce this type of explicit guide today just to later gut it?