SARIF output not compliant to specification

StefanFl commented 3 years ago

First of all thank you very much for your great tool.

I have tried to import the results of njsscan into DefectDojo (https://github.com/DefectDojo/django-DefectDojo) with the SARIF format. This seemed to have worked first, but then I realized a problem. Only one out of several SQL injections was stored in DefectDojo. This is due to the fact that njsscan's SARIF file has one result for the SQL injection with several physical locations. Due to the SARIF specification (https://docs.oasis-open.org/sarif/sarif/v2.1.0/os/sarif-v2.1.0-os.html#_Toc34317650) this is not the correct usage:

The locations array SHALL NOT be used to specify distinct occurrences of the same result which can be corrected independently.

EXAMPLE 3: Consider an analysis tool which locates misspelled words in documentation, and suppose this tool scans a document in which the same word is misspelled in two distinct locations. Then the resulting log file must contain two distinct result objects each of which contains a locations array containing a single location object specifying the location of one instance of the misspelled word.

EXAMPLE 4: In contrast, consider a tool which locates misspelled words in variable names. If the tool detects a misspelled variable name, it might produce a single result object whose locations array contains the location of every reference to the variable, since fixing some but not all of the references would cause a compilation error.

So you should have one result for each rule violation with one physical location, even if it is the same rule, to be compliant with the SARIF specification.

ajinabraham commented 3 years ago

I will take a look when I get some time.

ajinabraham commented 3 years ago

The njsscan SARIF output was validated to be used with Github Code Scanning using https://sarifweb.azurewebsites.net/Validation. Not sure if the proposed change will affect the same.

StefanFl commented 3 years ago

The SARIF output is syntactically correct, the SARIF file is allowed to have multiple physicalLocation objects per result. That's why the tools says the file is valid. But the specification is very clear, when you shall use multiple physicalLocations and when not, expressed in EXAMPLE 3 and EXAMPLE 4 (see above).

In our case (see njsscan.sarif.zip) we have the same rule (NodeSqliInjection) which finds vulnerabilities in 4 files that are not related to each other. That is similar to EXAMPLE 3 and so needs distinct result objects per case.

Because of this the import of the SARIF file in DefectDojo will only consider the first vulnerability and ignore the other 3, what makes the data unusable.

ajinabraham / njsscan

SARIF output not compliant to specification #76