Closed NRGLine4Sec closed 5 years ago
The logic is in the sync_dedupe method of utils.py: https://github.com/DefectDojo/django-DefectDojo/blob/master/dojo/utils.py. That might help (if you can read Python :))
Sorry, I'm not a developer, so I only understand a little Python, and I haven't seen any line of code that checks whether the same vulnerability (for example, the same CVE) was discovered by both Nexpose and OpenVAS. Above all, it doesn't tell me how to do this in DefectDojo. I would at least like to know whether it is possible to import many reports from several scanners for the same target and get a list of vulnerabilities with no duplicates across those scanners.
Hey, I was thinking of rewriting the deduplication logic using signals, as suggested by @aaronweaver. The current deduplication is based on a number of elements, but they are all specific to the scanner: an XSS reported by Burp will have all of these elements different from an XSS reported by Nessus. This is an option that needs to be explored, though.
So it is actually impossible to import OpenVAS and Nexpose (or Nessus) reports and get a list of vulnerabilities without duplicates? That is, to get the correlated results of several scanners without duplicates?
There is a pull request (#973) coming that adds a CVE field; I think that might also help with better deduplication.
I think that with a CWE and endpoint match we could have a decent enough dedupe process. The only issue is that most tools currently don't provide a CWE field in the report, or not one that is easy to parse (Burp Suite, for example, only sometimes gives a CWE ID, and sometimes gives more than one), which implies we'd need to populate the CWE ID field automatically. I find the CVE field a bit finicky for dedupes. @aaronweaver @devGregA Any ideas?
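As a rough illustration of the CWE plus endpoint idea, here is a minimal sketch in plain Python. It is not DefectDojo code: the dictionary keys (scanner, title, cwe, endpoint) and the function name group_by_cwe_and_endpoint are assumptions made up for the example. Findings without a CWE are set aside rather than guessed at, which is exactly the gap described above.

```python
from collections import defaultdict


def group_by_cwe_and_endpoint(findings):
    """Group findings that share the same (CWE, endpoint) pair.

    Hypothetical helper: each finding is a plain dict, and `cwe` is
    assumed to be an integer or None.
    """
    groups = defaultdict(list)
    unmatched = []
    for finding in findings:
        cwe = finding.get("cwe")
        endpoint = finding.get("endpoint")
        if not cwe or not endpoint:
            # No usable key: this finding cannot be matched automatically.
            unmatched.append(finding)
            continue
        # Normalize the endpoint so trivial differences do not block a match.
        groups[(cwe, endpoint.strip().lower())].append(finding)
    return dict(groups), unmatched


# Made-up sample data for two scanners reporting the same issue.
findings = [
    {"scanner": "Burp", "title": "Cross-site scripting (reflected)",
     "cwe": 79, "endpoint": "https://example.com/search"},
    {"scanner": "Nessus", "title": "XSS in search parameter",
     "cwe": 79, "endpoint": "https://example.com/search"},
    {"scanner": "Burp", "title": "Verbose error message",
     "cwe": None, "endpoint": "https://example.com/login"},
]

groups, unmatched = group_by_cwe_and_endpoint(findings)
for key, members in groups.items():
    print(key, [m["scanner"] for m in members])
```

Any group with more than one member is a duplicate candidate; the unmatched list shows how many findings would need a CWE filled in before this approach could cover them.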
Maybe some of these open source projects can help you https://github.com/archerysec/archerysec https://github.com/infobyte/faraday https://github.com/dradis/ https://github.com/cornerpirate/ReportCompilerSource https://github.com/denimgroup/threadfix
I think in general if two findings have the exact same CVE it's almost always a duplicate?
What I think might help is a manual "mark as duplicate" function, similar to the merge function. Deduplication across different scanners will probably never be 100% accurate. A manual "mark as duplicate" function would allow an analyst to mark one issue as a duplicate of another, and after that DD can do its usual magic and perform deduplication on reruns, reports, etc.
I've actually been thinking of that myself. I'll be pushing an update to fix deduplication tonight or tomorrow morning, and then maybe look into adding that option as well.
But with a manual "mark as duplicate" function, it is cumbersome to deduplicate the results of importing two reports that each contain about two hundred discovered vulnerabilities. Apparently this can be done with ThreadFix, so it must be possible to correlate the reports of several scanners without duplicates. Maybe with a parser that checks the vulnerability ID fields (CVE, CWE, ...) of all reports generated by the scanners DefectDojo supports, and correlates the results on a single vulnerability ID?
@NRGLine4Sec the manual dupe is only for edge cases, and I think we need to think that one through. For dedupe we go with what the scanner exports, and for some scanners it's not as complete as for others. Ideally it would be CVE + endpoint, and possibly CWEs, for the dupe match. We go through several iterations: if the hash matches, the finding is marked as a dupe; if not, it iterates through the endpoints and does comparisons. We've spent a fair amount of time on web scanners and not as much on network scanners. Some scanners will merge better than others, depending on what data the scanner exports.
Have you worked on deduplication across the results of different scanners?
I think it could be a good idea to add a field to the result when a vulnerability is discovered by different tools. For example, if a vulnerability is discovered by both Nexpose and OpenVAS, the Found By field could read "Nexpose, OpenVAS" to show that the vulnerability was discovered by two different scanners. The Mitigation and References fields could also be improved by combining the information from the Nexpose report and the OpenVAS report, without duplicates.
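A minimal sketch of what such a merge could look like, assuming findings are plain dicts with found_by, mitigation, and references keys. These names and the merge_duplicate helper are hypothetical and do not reflect DefectDojo's data model.

```python
def merge_duplicate(primary, duplicate):
    """Return a copy of `primary` enriched with data from `duplicate`."""
    merged = dict(primary)

    # Found By: union of the scanners that reported the finding.
    merged["found_by"] = sorted(
        set(primary["found_by"]) | set(duplicate["found_by"])
    )

    # References: keep the original order, append only new entries.
    references = list(primary["references"])
    for ref in duplicate["references"]:
        if ref not in references:
            references.append(ref)
    merged["references"] = references

    # Mitigation: keep both texts only when they actually differ.
    mitigations = [primary["mitigation"]]
    if duplicate["mitigation"] and duplicate["mitigation"] not in mitigations:
        mitigations.append(duplicate["mitigation"])
    merged["mitigation"] = "\n\n".join(m for m in mitigations if m)

    return merged


# Made-up sample findings from two scanners.
nexpose = {"title": "TLS/SSL Server Supports SSLv3",
           "found_by": ["Nexpose"],
           "mitigation": "Disable SSLv3.",
           "references": ["CVE-2014-3566"]}
openvas = {"title": "SSLv3 Protocol CBC Cipher Suites Information Disclosure",
           "found_by": ["OpenVAS"],
           "mitigation": "Disable SSLv3.",
           "references": ["CVE-2014-3566", "https://example.com/advisory"]}

merged = merge_duplicate(nexpose, openvas)
print(", ".join(merged["found_by"]))
print(merged["references"])
```

The merge only makes sense once two findings have already been matched as duplicates; it does not decide the match itself.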
@aaronweaver ?
@NRGLine4Sec Can you share a sample of two identical findings from Nexpose and OpenVAS? My guess is that the endpoint and CWE aren't matching, since dedupe looks at those to figure it out.
How can I deduplicate the results of OpenVAS and Nexpose scans of the same target? When I import the OpenVAS report and the Nexpose report, the vulnerabilities found are treated as different vulnerabilities, although most of them are the same.