microsoft / sarif-tools

A set of Python command line tools for working with SARIF files produced by code analysis tools
MIT License

Code block in sarif files incorrectly rendered in Summary #39

Closed Jiri-Stary closed 1 month ago

Jiri-Stary commented 12 months ago

1) Have a SARIF file with complex messages (attached: summary (4).zip)
2) Run the summary command (sarif summary)
3) Pipe the output to a GitHub step output or GitHub command
4) The resulting text is rendered in an ugly way

- name: Install ms sarif tools
  if: ${{ always() }}
  shell: bash
  run: |
         mkdir sarif-tools
         git clone https://github.com/microsoft/sarif-tools.git
         cd ./sarif-tools
         pip install .
         cd ..

- name: View all issues from sarif files
  if: ${{ always() }}
  shell: bash
  run: |
         sarif summary ./${{inputs.report_location}} -o ./hdf/issues.txt
         cat ./hdf/issues.txt
         echo "Summary of issues found:" >> "${GITHUB_STEP_SUMMARY}"
         cat ./hdf/issues.txt >> "${GITHUB_STEP_SUMMARY}"

The result is broken code blocks. I assume there could be a way to format the message as a block of text or as a code block, or to customize it, so that it does not end up looking like this:

image

balgillo commented 11 months ago

Hi, thanks for raising this. The SARIF file has multiple lines of information in the "message text" entries, e.g.:

{"ruleId":"jssecurity:S5147","level":"error",
"message":{"text":"Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/Javascript/Identification and Authentication Failures/SensitiveDataExposurePassword.js:53:53 StartLine: 53, EndLine: 53<br>Code:<pre>const {username, password} = req.body;\n\n  User.findOne({ username: username }, (err, user) => {\n    if (err || !user || user.password !== password) {\n      res.status(401).send('Unauthorized');\n    } else {</pre>."},
"locations":[{"physicalLocation":{"artifactLocation":{"uri":"file:///","uriBaseId":"ROOTPATH","index":0},"region":{"startLine":1,"startColumn":1,"endLine":1,"endColumn":1}}}],"rank":1.0}

It's not a very well-formed SARIF file: as you can see, the region information hasn't been populated, and the actual location is instead embedded in the message text as plaintext. Nevertheless, we want to improve the tooling to handle this kind of file better.

At the moment, the tool combines the message text and the issue code (ruleId) to determine the issue type for the summary. If the message text is different, it considers it a different issue (e.g. maybe two tools use the same code for different issues). That logic doesn't work for this input where info about the specific occurrence of the issue is packed into the message text.
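
To make the grouping problem concrete, here is a minimal sketch of counting results per (ruleId, message text) pair. It is not the sarif-tools implementation; the function name and the shortened sample messages are hypothetical, and only the `ruleId` and `message.text` fields come from the SARIF result shown above.

```python
from collections import Counter


def summarize_by_code_and_message(results):
    """Count results per (ruleId, message text) pair (illustrative only)."""
    counts = Counter()
    for result in results:
        rule_id = result.get("ruleId", "")
        text = result.get("message", {}).get("text", "")
        counts[(rule_id, text)] += 1
    return counts


# Hypothetical, shortened results: same rule, but the location is packed
# into the message text, so each distinct location becomes its own "type".
sample_results = [
    {"ruleId": "jssecurity:S5147", "message": {"text": "Path:a.js:53:53 ..."}},
    {"ruleId": "jssecurity:S5147", "message": {"text": "Path:b.js:10:10 ..."}},
    {"ruleId": "jssecurity:S5147", "message": {"text": "Path:a.js:53:53 ..."}},
]

grouped = summarize_by_code_and_message(sample_results)
for (rule_id, text), count in grouped.items():
    print(f" - {rule_id} {text}: {count}")
```

With occurrence-specific text in the message, one rule fans out into as many summary lines as there are distinct messages, which is the behaviour described above.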

I have identified two possible changes to the tool behaviour: we can truncate the message text, which produces the following output for your input file:

error: 40
 - jssecurity:S5147 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/Javascript/Identification...: 2
 - jssecurity:S5147 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/Javascript/Security...: 2
 - java:S4830 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/Java/TLS/src/main/java/com/...: 2
 - javasecurity:S3649 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/Java/Sql...: 2
 - python:S4830 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/Python/TLS/okta_authenticat...: 2
 - phpsecurity:S5131 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/PHP/XSS/routing_layout.tpl....: 2
 - java:S6437 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/Java/sensitive data...: 2
 - javasecurity:S2083 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/Java/path...: 2
 - jssecurity:S5147 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/Javascript/Broken Access...: 1
 - jssecurity:S5147 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/Javascript/Injection/PathTr...: 1
 - jssecurity:S5147 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/Javascript/Software and...: 1
 - jssecurity:S3649 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/Javascript/Injection/SQLInj...: 1
 - terraform:S4423 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:Secrets/Access Tokens/Azure/main.tf:102:102...: 1
 - secrets:S6290 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:InfrastructureCode/Terraform/ec2.tf:15:15...: 1
 - secrets:S6290 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:InfrastructureCode/Terraform/ec2.tf:16:16...: 1
 - secrets:S6290 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:Secrets/Access Tokens/AWS/database.py:8:8...: 1
 - java:S5527 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/Java/TLS/src/main/java/com/...: 1
 - secrets:S6689 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/Java/sensitive data...: 1
 - python:S4423 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/Python/TLS/tls-...: 1
 - pythonsecurity:S2083 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/Python/Path...: 1
 - phpsecurity:S5135 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/PHP/Insecure...: 1
 - python:S4830 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/Python/Sensitive Data...: 1
 - python:S4830 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/Python/TLS/post_rest_api.py...: 1
 - pythonsecurity:S5131 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/Python/SSRF/ssrf.py:103:103...: 1
 - phpsecurity:S5146 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/PHP/Open Redirect/redirect-...: 1
 - phpsecurity:S2083 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/PHP/Path Traversal/path-...: 1
 - phpsecurity:S2083 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/PHP/SSRF/generated-...: 1
 - php:S6437 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/PHP/SQL Injection/sqli-...: 1
 - phpsecurity:S3649 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/PHP/SQL Injection/sqli-...: 1
 - phpsecurity:S5131 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/PHP/XSS/xss-...: 1
 - pythonsecurity:S5146 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/Python/Open...: 1
 - java:S4423 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/Java/TLS/src/main/java/com/...: 1

warning: 7
 - javasecurity:S5145 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/Java/Sql...: 2
 - jssecurity:S5144 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/Javascript/Security...: 1
 - jssecurity:S5144 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/Javascript/Server-Side...: 1
 - terraform:S6321 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:InfrastructureCode/Terraform/ec2.tf:94:95...: 1
 - javasecurity:S5144 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/Java/xss/src/main/java/com/...: 1
 - pythonsecurity:S5144 Path:15596-McK-Internal_SF-SE-vulnerable_code_repository:ApplicationCode/Python/SSRF/ssrf.py:141:141...: 1

note: 0
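
The truncation option could be sketched roughly as follows. This is an assumption about the general approach, not the code used by sarif-tools: the function name, the 100-character default and the choice to keep only the first line are all hypothetical.

```python
def truncate_message(text, max_length=100):
    """Return the first line of text, cut to max_length characters.

    An ellipsis is appended only when characters were actually cut from
    the first line (hypothetical helper, not part of sarif-tools).
    """
    first_line = text.split("\n", 1)[0]
    if len(first_line) <= max_length:
        return first_line
    return first_line[:max_length].rstrip() + "..."


print(truncate_message("hello world foo", max_length=6))
```

Truncated text would then be used both for display and as part of the grouping key, so messages that differ only after the cut-off point are counted together.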

Or we can ignore the message text for grouping, which produces the following output:

error: 40
 - jssecurity:S5147: 7
 - python:S4830: 4
 - secrets:S6290: 3
 - phpsecurity:S5131: 3
 - java:S4830: 2
 - javasecurity:S3649: 2
 - phpsecurity:S2083: 2
 - java:S6437: 2
 - javasecurity:S2083: 2
 - jssecurity:S3649: 1
 - terraform:S4423: 1
 - java:S5527: 1
 - secrets:S6689: 1
 - python:S4423: 1
 - pythonsecurity:S2083: 1
 - phpsecurity:S5135: 1
 - pythonsecurity:S5131: 1
 - phpsecurity:S5146: 1
 - php:S6437: 1
 - phpsecurity:S3649: 1
 - pythonsecurity:S5146: 1
 - java:S4423: 1

warning: 7
 - jssecurity:S5144: 2
 - javasecurity:S5145: 2
 - terraform:S6321: 1
 - javasecurity:S5144: 1
 - pythonsecurity:S5144: 1

note: 0
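
For comparison, grouping by issue code alone can be sketched like this. Again, this is only an illustration under the assumption that each result is a dict with a `ruleId` key; the function name and sample data are hypothetical.

```python
from collections import Counter


def summarize_by_rule_id(results):
    """Count results per ruleId, most frequent first (illustrative only)."""
    counts = Counter(result.get("ruleId", "") for result in results)
    return counts.most_common()


# Hypothetical sample: message text is ignored entirely for grouping.
sample = [
    {"ruleId": "jssecurity:S5147", "message": {"text": "Path:a.js ..."}},
    {"ruleId": "python:S4830", "message": {"text": "Path:c.py ..."}},
    {"ruleId": "jssecurity:S5147", "message": {"text": "Path:b.js ..."}},
]

for rule_id, count in summarize_by_rule_id(sample):
    print(f" - {rule_id}: {count}")
```

The trade-off is that two tools reusing the same code for different issues would be merged into one line.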

Which of these would you prefer in your case?

Jiri-Stary commented 11 months ago

I see,

There might be an issue with the SARIF file, since it is created through a double conversion. SonarQube does not support SARIF output natively, so I am using MITRE SAF to get an HDF file and then using an HDF-to-SARIF converter.

Honestly, I don't like either of those options, since they are not particularly helpful for fixing the issues. Let me have a look to see if I can identify an issue in the conversion to SARIF.

balgillo commented 11 months ago

OK, makes sense. The summary command isn't really designed to help with fixing the issues; it's just there to give a quick overview of how many issues of each type exist, for monitoring and insight purposes. I'd recommend the csv, html or word commands, which provide location info for each issue occurrence.

Jiri-Stary commented 11 months ago

I will take a look. Does the summary command process <br> or <pre> tags in the description, or does it treat everything as plaintext?

balgillo commented 11 months ago

It treats everything as plaintext
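
Since the tool does not interpret the tags, one workaround is to clean the message text in the SARIF file before running the summary. The sketch below is a hypothetical preprocessing helper, not a sarif-tools feature. It assumes the only markup present is `<br>` and `<pre>`, as in the sample result earlier in this thread.

```python
import html
import re


def message_to_plain_text(text):
    """Turn <br> and <pre> markup into newlines and drop empty lines.

    Hypothetical helper for preprocessing SARIF message text; it handles
    only the two tags seen in the sample file.
    """
    text = re.sub(r"<br\s*/?>", "\n", text, flags=re.IGNORECASE)
    text = re.sub(r"</?pre>", "\n", text, flags=re.IGNORECASE)
    text = html.unescape(text)  # e.g. &lt; becomes <
    lines = [line.rstrip() for line in text.split("\n")]
    return "\n".join(line for line in lines if line)


print(message_to_plain_text("StartLine: 53<br>Code:<pre>const x = 1;\n\n  y();</pre>."))
```

Applying something like this to each result's `message.text` would keep raw HTML out of the generated text, though it does not change how results are grouped.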

balgillo commented 1 month ago

This has been improved in v3.0.1, which is now released on PyPI. Long summaries are truncated for better display in summaries.