ok
Alright, first we'll do a few quick things to get you up and running.
At a high level, here's what we're going to do:
Join the r2c Community Slack - There's a channel for this workshop you can ask questions in, and we'll use it to set up notifications when Semgrep finds issues.
Create a free Semgrep App account - This lets us easily manage Semgrep in CI, set up notifications, configure scanning policy, view results over time, and more.
Open your Slack apps page (https://your_slacks_name.slack.com/apps), search "Incoming WebHooks", and in "Post to Channel" choose your name. This way, all notifications are going to be sent to you via direct message.
Copy the webhook URL (https://hooks.slack.com/services/...) and go to the Semgrep Integrations page (you may need to click on "Integrations" in the left hand side navbar), create a new integration, select "Slack", provide a name, paste in the webhook URL, then save it.
Feel free to join the r2c community Slack and ask questions in #general or #workshop-2021-owasp-devslop if anything is unclear.
ok
Great! Now we're going to set up Semgrep to scan every PR via GitHub Actions by creating a semgrep.yml.
Though we're going to be using GitHub Actions in this workshop, Semgrep is portable and runs as a standalone binary or in Docker, so it's easy to set up in pretty much any CI platform under the sun.
See these docs for info about setting up Semgrep in GitLab, Buildkite, CircleCI, or other providers, and see here for more info about Semgrep in CI.
[…] semgrep.yml to repos you want to onboard, etc.).
[…] intro-to-semgrep repo. If you want to add more repos, you can select "All repositories" or hand-select a few more. You can always update this later via your GitHub profile's Installed Applications settings.
[…] intro-to-semgrep repo row.
[…] intro-to-semgrep repo.
[…] semgrep.yml GitHub Action) to this repo.
ok
Great! I've merged in the Semgrep config you set up into this branch so we can iterate on rules and see the results right in this PR.
After the check suite finishes, you should see a PR comment warning about the use of eval() in the code this PR is adding in eval_test.ts.
Also check your notifications in the r2c community Slack; you should see a message from the webhook flagging this issue as well.
One of the key differentiators about Semgrep is how easy it is to write custom rules.
This fundamentally changes how you can leverage static analysis to scale your AppSec program.
Rather than being a black box, one-size-fits-all, "I sure hope the vendor built all the use cases I could ever need," single purpose tool, Semgrep is a Swiss army knife and your imagination is the limit.
Yes, there are over 1,000 out-of-the-box security checks you get for free.
But you can also use Semgrep for:
[…] internal_auth library for all auth purposes."
[…] something dangerous like crypto or parsing XML, here's how we do it in our company: link to internal docs."
[…] foo() should always be called before bar(), else it's a bug."
As Semgrep rules look just like the code you're targeting (with some helpful abstractions), many developers and engineering orgs can write custom rules as well as (or better than!) security teams.
Why have separate tools when developers and the security team can solve their respective problems with the same tool?
Alright, let's get into it.
For these exercises we're going to be using the Semgrep playground: https://semgrep.dev/editor, as it's a convenient way to iterate on rules right from your browser, without installing anything.
If you'd prefer, you can also write Semgrep rules offline in your IDE of choice. After all, they're just YAML!
This is the rule we're going to start on, open it in a separate browser tab: https://semgrep.dev/s/clintgibler:juice-shop-eval-try.
In the top left, you can select a "Language" for the rule you're currently writing. In this case, we're using "TypeScript," because Juice Shop is mostly in TypeScript.
The "code is" section is where you write your Semgrep rule.
The // ruleid:juice-shop-eval comments you see in the Test Code are a special syntax. They tell Semgrep, "Hey, I expect Semgrep to find a match here."
If you click on the "Advanced" tab (next to "Simple" under the "Semgrep Rule" header on the left hand side), you'll see the raw YAML for the Semgrep rule you're writing. The "Simple" view is just a simplified interface so you don't have to write raw YAML and mess with indentation, etc.
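As a rough illustration of those test annotations, here is a minimal sketch of what annotated Test Code might look like. It assumes a rule with the id juice-shop-eval that flags every call to my_eval(); the my_eval and my_log stubs and the input value are hypothetical, defined only so the snippet runs on its own. The // ok: annotation is the counterpart of // ruleid: and marks a line the rule should not match. Each annotation applies to the line directly below it.

```typescript
// Hypothetical stand-in for a dangerous evaluator; it only echoes its input
// so that this file can run without evaluating anything.
function my_eval(code: string): string {
  return `would evaluate: ${code}`;
}

// Hypothetical harmless function that the rule should leave alone.
function my_log(message: string): string {
  return `log: ${message}`;
}

const userInput = "2 + 2";

// ruleid:juice-shop-eval
const flagged = my_eval(userInput);

// ok:juice-shop-eval
const ignored = my_log(userInput);

console.log(flagged, "|", ignored);
```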
At a high level, Semgrep rules are just the code you're targeting + a few abstractions.
Sometimes you want to abstract away some details from the code you're matching, to make it more generic.
The ellipsis operator (...) lets you match zero or more arguments, statements, and more.
Here are a few examples:
// insecure_function(...) would match
insecure_function("MALICIOUS_STRING", arg1, arg2)
// var x = ...; would match each of these
var x = "semgrep";
var x = foo && bar || baz;
var x = foo(something);
You can think of the ellipsis operator like .* in regular expressions.
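To make the regular-expression comparison concrete, here is a small TypeScript sketch that treats each call as a line of text. It is only an analogy: Semgrep matches on the parsed code rather than on raw text, so the regex below is an illustrative stand-in for the pattern insecure_function(...), and secure_function is a made-up name.

```typescript
// Text-level analog of the Semgrep pattern insecure_function(...):
// ".*" accepts any run of characters between the parentheses, including none.
const anyArguments = /^insecure_function\(.*\)$/;

const calls: string[] = [
  'insecure_function("MALICIOUS_STRING", arg1, arg2)',
  "insecure_function()",
  "secure_function(arg1)", // hypothetical function that should not match
];

// Keep only the calls the regex accepts.
const matchedCalls = calls.filter((call) => anyArguments.test(call));

console.log(matchedCalls);
```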
Sometimes you want to match something, but you don't know what it is ahead of time.
For example, the name of a function, the value of an argument, and so forth.
Metavariables let you do that by using an identifier that starts with a $ and contains only uppercase letters, _, or digits: $X or $FOO, for example.
Here are a few examples:
// foo($X)
foo(1); // matches, $X = 1
foo(a); // matches, $X = a
// foo($X) doesn't match, foo() called with >1 arg
foo(a, b, c);
// Ellipsis operator and metavariables can be combined!
foo(a, b, c); // foo($X, ...) matches, $X = a
foo(a, b, c); // foo(..., $Y) matches, $Y = c
foo(a); // foo(..., $Y) matches, $Y = a
Note that within one pattern, every occurrence of the same metavariable must match the same value.
So:
// bar($X, $X)
bar(a, a) // matches
bar(10, 10) // matches
bar(a, b) // does not match, a != b
You can think of metavariables kind of like capture groups in regular expressions.
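The capture-group comparison can be sketched in TypeScript as well. This is a text-level analogy only, not how Semgrep works internally: the capture group stands in for $X, and the backreference \1 stands in for the second $X in bar($X, $X). The regexes assume single-word arguments separated by a comma and a space.

```typescript
// Analog of foo($X): the capture group plays the role of $X.
const oneArgument = /^foo\((\w+)\)$/;
const captured = oneArgument.exec("foo(a)")?.[1];

// foo(a, b, c) has more than one argument, so the one-argument analog rejects it.
const tooManyArguments = oneArgument.test("foo(a, b, c)");

// Analog of bar($X, $X): the backreference \1 must repeat whatever the group captured.
const sameArgumentTwice = /^bar\((\w+), \1\)$/;

const results = ["bar(a, a)", "bar(10, 10)", "bar(a, b)"].map((call) =>
  sameArgumentTwice.test(call),
);

console.log(captured, tooManyArguments, results);
```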
Sometimes you want to combine Semgrep patterns, like:
a() or b()
foo(), but not if the first parameter is a string literal
bar(), but only if it occurs inside the MyClass class
You can add additional pattern clauses in the simple editor by clicking the + button on the right hand side of the pattern.
Currently only a few Semgrep operators are available in the simple editor. See the rule syntax docs for all of the tools in your Semgrep rule writing toolbelt.
We'll cover a number of Semgrep's capabilities in this lab, but there are many we won't!
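One way to picture combined clauses is as "match this, and is not that". The TypeScript sketch below models the "foo(), but not if the first parameter is a string literal" case with two text-level regexes standing in for Semgrep patterns. It is an analogy under simplifying assumptions (one call per string, double-quoted literals only), not a Semgrep implementation, and the sample calls are made up.

```typescript
// Stand-in for the positive pattern foo(...): any call to foo.
const anyFooCall = /^foo\(.*\)$/;

// Stand-in for the "and is not" pattern foo("...", ...):
// a call whose first argument is a double-quoted string literal.
const firstArgIsString = /^foo\("[^"]*"(,.*)?\)$/;

const fooCalls: string[] = [
  "foo(userInput)",
  'foo("literal")',
  'foo("literal", userInput)',
  'foo(userInput, "literal")',
];

// A call is reported only if it matches the positive pattern
// AND does NOT match the negative one.
const reported = fooCalls.filter(
  (call) => anyFooCall.test(call) && !firstArgIsString.test(call),
);

console.log(reported);
```

Only the calls whose first argument is not a string literal survive the filter, which mirrors how a negative clause removes results from an otherwise broader match.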
Navigate to https://semgrep.dev/s/clintgibler:juice-shop-eval-try.
Update the rule (replacing TODO) to match all calls to my_eval(), regardless of the passed in arguments.
Match calls to my_eval() with only 1 argument.
Match calls to my_eval() when the first argument is not a string literal.
Hints
All calls to my_eval(): use the ellipsis operator (...).
my_eval() with 1 argument: use a metavariable such as $ARG.
my_eval() where the first argument is not a string literal: "..." will match any string, regardless of its value (docs). pattern-not filters out matches. Click the + button to add a new pattern and select "and is not", which, if you switch to the Advanced view, you can see is represented by pattern-not under the hood.
[…] my_eval() rule.
ok
Holy moly, you:
...all in a few minutes 🤯
Are there comments you (or developers at your company) often write on PRs?
Wouldn't it be nice if you could automate that work and spend your time on higher-leverage things? I think you know where I'm going with this.
Oh another thing - did you notice how easy it was to add new rules you write to your scanning policy, with one click from the Playground?
Well, imagine you're scanning 100s or 1,000s of repos with Semgrep, and there's something new you'd like to enforce: a secure guardrail, say, or a new anti-pattern you'd like to block based on a recent penetration test report or bug bounty submission.
So you quickly write the rule in the Playground, add it to one of your scanning policies, and then boom, that rule is immediately going to run on every new PR for repos using that policy.
No need to file PRs on hundreds of repos, no need to wait on developers or DevOps teams acting on your request, just quick security coverage, everywhere.
(Note: of course you want to roll out new rules carefully, to ensure they're high signal, don't bother our developer friends, etc.)
Time for the next rule writing challenge!
I've opened up a new PR with more code to match: click here to continue.
Welcome!
I'm excited you're here!
Together we're going to see how we can quickly and easily set up continuous code scanning using Semgrep, an open source, lightweight static analysis tool.
We'll see how Semgrep's out-of-the-box rules can find and block a broad variety of vulnerabilities and enforce secure guardrails (also called "paved road" or "secure defaults").
We'll use the awesome OWASP Juice Shop project as the repo we'll scan, and we'll use GitHub Actions to scan every Pull Request (PR).
How This Lab Works
Basically, at each stage you'll be provided with some information, either as a GitHub issue, PR, or a comment on one of those.
Then, there'll be an ⌨️ Activity section at the bottom that has you complete some concrete steps, either in this repo (like editing files, opening or closing PRs or Issues) or on Semgrep-related sites (e.g., writing new rules, setting up and configuring your dashboard, etc.).
After you complete the steps in the Activity section, the bot will either autodetect what you've done and move you to the next step, or perhaps respond to a comment we ask you to write.
💡 Important Notes
If at any point throughout this lab you're not seeing a bot response or scan update that you'd expect to, try refreshing the page; sometimes things get into a wonky state.
⌨️ Activity: See Docs Links
I'll respond in this pull request when I detect a comment posted to it.