DefectDojo / django-DefectDojo

DevSecOps, ASPM, Vulnerability Management. All on one platform.
https://defectdojo.com
BSD 3-Clause "New" or "Revised" License

Update deduplication fields for Trufflehog (and other scanners...) #10271

Open brieucR opened 4 months ago

brieucR commented 4 months ago

⚠️ Is your feature request related to a problem? Please describe

I'm always frustrated by deduplication with the Trufflehog parser. Any change made by the Trufflehog scanner or by Security Operators can alter the description field, which is used as a key for deduplication.

✔️ Describe the solution you'd like

As a Security Operator, I want to update the deduplication mechanism so that changes to the description field do not affect duplicate detection.

💡 Describe alternatives you've considered

There are two main solutions, both of which involve updating the hashcode configuration.

Solution A: We can rely on the payload field, which would be filled with the Raw or RawV2 field from Trufflehog. Adding file_path ensures that we record separate findings when the same secret is found in several files or repositories:

'Trufflehog Scan': ['payload', 'file_path']
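
To illustrate why the choice of fields matters, here is a minimal sketch of a field-based hash key. It is not DefectDojo's actual implementation: the `compute_hash` helper, the separator, and the sample finding values are assumptions made for illustration only.

```python
import hashlib
from typing import Mapping, Sequence


def compute_hash(finding: Mapping[str, str], fields: Sequence[str]) -> str:
    """Join the configured field values and hash them with SHA-256."""
    joined = "|".join(finding.get(name, "") for name in fields)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


DESCRIPTION_KEY = ["description"]          # stand-in for a description-based key
SOLUTION_A_KEY = ["payload", "file_path"]  # fields proposed in Solution A

# Hypothetical sample findings (values invented for the example).
original = {
    "description": "Secret reported by detector, wording v1",
    "payload": "example-secret-value",
    "file_path": "config/settings.py",
}
edited = {**original, "description": "Secret reported by detector, wording v2"}
other_file = {**original, "file_path": "deploy/.env"}

# The description changed, so a description-based key no longer matches.
print(compute_hash(original, DESCRIPTION_KEY) == compute_hash(edited, DESCRIPTION_KEY))

# The Solution A key ignores the description, so the finding is still a duplicate.
print(compute_hash(original, SOLUTION_A_KEY) == compute_hash(edited, SOLUTION_A_KEY))

# The same secret in another file yields a distinct key under Solution A.
print(compute_hash(original, SOLUTION_A_KEY) == compute_hash(other_file, SOLUTION_A_KEY))
```

With a key built from payload and file_path, editing the description leaves the hash unchanged, while the same secret in a different file still produces a separate finding.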

Solution B ⭐: We can rely on the url field, which would be filled with the link field from Trufflehog. The link value is a unique identifier of the form: https://github.com/[organization]/[project]/blob/[commit_hash]/[file_path]#[line_number]

'Trufflehog Scan': ['url']
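
The link format described above can be broken into its parts to see what the url key actually encodes. The sketch below is only an illustration: `split_link` is a hypothetical helper, and the sample link is invented to follow the stated pattern.

```python
from urllib.parse import urlsplit


def split_link(link: str) -> dict:
    """Split a link of the form
    https://github.com/<organization>/<project>/blob/<commit_hash>/<file_path>#<line_number>
    into its components."""
    parts = urlsplit(link)
    segments = parts.path.lstrip("/").split("/", 4)
    if len(segments) != 5 or segments[2] != "blob":
        raise ValueError(f"unexpected link format: {link}")
    organization, project, _, commit_hash, file_path = segments
    return {
        "organization": organization,
        "project": project,
        "commit_hash": commit_hash,
        "file_path": file_path,
        "line_number": parts.fragment,
    }


# Hypothetical link following the pattern above.
sample = "https://github.com/example-org/example-repo/blob/0123abc/src/app/config.py#42"
print(split_link(sample))
```

Since the commit hash and line number are part of the link, a url-based key separates findings by repository, commit, file, and line, and does not depend on the description.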

Additional context

Migration step: payload and url are not sent to DefectDojo by default. A migration step would be required to calculate these fields for existing findings. Otherwise, deduplication would be based on empty fields, causing many unwanted duplicates.
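
The risk described above can be shown with a small sketch. It assumes, purely for illustration, that the original link of each legacy finding can still be recovered (stored here under a hypothetical `raw_link` key); `url_hash` and `backfill_url` are likewise hypothetical helpers, not DefectDojo functions.

```python
import hashlib


def url_hash(finding: dict) -> str:
    """Hash only the url field, as a url-based key would."""
    return hashlib.sha256(finding.get("url", "").encode("utf-8")).hexdigest()


def backfill_url(finding: dict) -> dict:
    """Return a copy whose url is filled from the hypothetical raw_link key."""
    if finding.get("url"):
        return dict(finding)
    return {**finding, "url": finding.get("raw_link", "")}


# Two unrelated legacy findings imported before url was populated.
legacy_findings = [
    {"id": 1, "url": "",
     "raw_link": "https://github.com/example-org/example-repo/blob/0123abc/a.py#3"},
    {"id": 2, "url": "",
     "raw_link": "https://github.com/example-org/example-repo/blob/0123abc/b.py#9"},
]

# Without migration, both findings hash the empty string and collide.
before = {url_hash(f) for f in legacy_findings}
# After backfilling, each finding gets its own key.
after = {url_hash(backfill_url(f)) for f in legacy_findings}

print(len(before), len(after))
```

Without the backfill, every legacy finding shares the hash of an empty string, so unrelated findings would be marked as duplicates of each other.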

Pros and Cons: Each solution comes with both pros and cons.

brieucR commented 4 months ago

See https://github.com/DefectDojo/django-DefectDojo/pull/10118 for the Solution B proposal.