Netflix-Skunkworks / Scumblr

Web framework that allows performing periodic syncs of data sources and performing analysis on the identified results
Apache License 2.0
2.64k stars 317 forks source link

Tasks don't de-duplicate after re-running #118

Closed KevinHock closed 7 years ago

KevinHock commented 7 years ago

Hi All,

So if I do say, a "Github Code Search" and hit Run Now it will add N to "Vulnerabilities (N Open Findings)" each time I run it, without taking into account the existing Vulnerabilities (i.e. running it 5 times give you N*5 vulns.)

I tried to fix a different issue with the vulnerability_helper, it would mark things as duplicates that weren't (this was from using the python_analyzer [bandit wrapper].)

else
     self.metadata["vulnerabilities"].select{|v| v["url"] == vulnerability.url && v["type"] == vulnerability.type}

here the type is the issue_text from Bandit https://github.com/KevinHock/Scumblr/blob/master/lib/scumblr_tasks/security/python_analyzer.rb#L154 and url is nil I believe. So it doesn't report more than 1 vuln of the same vulnerability type.

I tried to fix this via

    elsif vulnerability.source == "Bandit"
       self.metadata["vulnerabilities"].select{|v| v["source_code_file"] == vulnerability.source_code_file && v["source_code_line"] == vulnerability.source_code_line && v["type"] == vulnerability.type}

https://github.com/Netflix/Scumblr/pull/117/commits/fc6f1bf864a36d4dad12d5f09b9f845143d7281f#diff-bb43fb35a2e987152be9d386b169d54bR338

but then I realized that now it won't take existing vulnerabilities into account. (Unless I manually run it in the rails console and not through the Run Now button, for some reason?)

Before I made this change the Run Now for python_analyzer never added N to the "Vulnerabilities (N Open Findings)" when they were already there, but that doesn't make sense to me. It seems before my change that self.metadata["vulnerabilities"] held all the vulns in the DB and now it doesn't, unless I'm in rails console.

sbehrens commented 7 years ago

Hi @KevinHock,

@sk3tch and I will take a look. If you are seeing the correct behavior from rails console but not when clicking run now, you may need to restart sidekiq.

KevinHock commented 7 years ago

That's interesting, I may or may not have done that, but I think I did. I'll confirm and get back to you, I'm not a Rails guy so it's probable my error.

There's also another issue but one that I guess I'll fork bandit for: What if the file is edited and the line number changes? Well we can try to (1) if the number of type X vulnerabilities in file A changes, then alert, this seems :/. Or (2) we can hash the line of the vuln + line above the vuln + line below the vuln and see if it changes, which is better than (1). Option 3 is we don't play this game, I write a tool to solely analyze GitHub Pull Requests, which in reality is my waiting for my ex-coworker to open-source his code for doing this and adding on top of it http://sandiego.toorcon.net/conference/#26.

I'm going to ask other friends how they solve the simple line number changes problem, it seems like a simple (boring) problem every security engineer ever has run into.

Thanks for making Scumblr! Cheers, Kevin

sk3tch commented 7 years ago

@KevinHock I've tried several times to reproduce this issue and haven't been able to. Are you seeing this on. Is there a public repo you could point me to that you're having issues with so I can try it from my side?

Thanks

sbehrens commented 7 years ago

Hi @KevinHock,

We weren't able to reproduce this issue. Please re-open the issue if you are still experience the problem.

Thanks,

Scott