awslabs / automated-security-helper

https://awslabs.github.io/automated-security-helper/
Apache License 2.0
372 stars 44 forks source link

Make Semgrep output more readable #27

Closed Misairuzame closed 7 months ago

Misairuzame commented 9 months ago

Description of changes: Make Semgrep output more readable by setting helper_dockerfiles/Dockerfile-grype's COLUMNS to 120.

Reasoning: By default, when run with ASH, Semgrep's output has a weird wrapping/indentation that makes it very hard to read. See this sample output:

┌─────────────────┐
│ 2 Code Findings │
└─────────────────┘

    api_call.py 
       python.lang.security.audit.insecure-transport.requests.request-with-http.
  request-with-http                                           
          Detected a request using 'http://'. This request will be unencrypted, 
  and attackers could                                                 
          listen into traffic on the network and be able to obtain sensitive    
  information. Use                                                              
          'https://' instead.                                                   
          Details: https://sg.run/W8J4                                          

           ▶▶┆ Autofix ▶ s/[Hh][Tt][Tt][Pp]:///https:///1
           43┆ response = requests.get(url, headers=headers)

    templates/search_form.html 
       go.lang.security.audit.xss.no-interpolation-in-tag.no-interpolation-in-ta
  g                                                                             
          Detected template variable interpolation in an HTML tag. This is      
  potentially vulnerable to                                                     
          cross-site scripting (XSS) attacks because a malicious actor has      
  control over HTML but                                                         
          without the need to use escaped characters. Use explicit tags instead.
          Details: https://sg.run/LwJJ                                          

          144┆ if (! searchYears.some((x) => x.value < {{maxyear}} && x.value > 
  2004-1)) {                                                                    

<<<<<< End Semgrep output <<<<<<

This happens because Semgrep, when the default output option is selected, wraps lines after 100 characters. Since it looks like a headless Docker container emulates a terminal that's 80 characters wide, Semgrep's output is wrapped at 80 characters (by the Docker headless terminal), and then again after the remaining 20 characters (by Semgrep), producing the output seen above.

By setting the COLUMNS environment variable in the relevant Dockerfile, i.e. helper_dockerfiles/Dockerfile-grype, to a number greater than 100 (such as 120), the output is no longer wrapped twice for each line, producing a much more readable output, as you can see from this sample:

┌─────────────────┐
│ 2 Code Findings │
└─────────────────┘

    api_call.py 
       python.lang.security.audit.insecure-transport.requests.request-with-http.request-with-http  
          Detected a request using 'http://'. This request will be unencrypted, and attackers could
          listen into traffic on the network and be able to obtain sensitive information. Use      
          'https://' instead.                                                                      
          Details: https://sg.run/W8J4                                                             

           ▶▶┆ Autofix ▶ s/[Hh][Tt][Tt][Pp]:///https:///1
           43┆ response = requests.get(url, headers=headers)

    templates/search_form.html 
       go.lang.security.audit.xss.no-interpolation-in-tag.no-interpolation-in-tag                   
          Detected template variable interpolation in an HTML tag. This is potentially vulnerable to
          cross-site scripting (XSS) attacks because a malicious actor has control over HTML but    
          without the need to use escaped characters. Use explicit tags instead.                    
          Details: https://sg.run/LwJJ                                                              

          144┆ if (! searchYears.some((x) => x.value < {{maxyear}} && x.value > 2004-1)) {

<<<<<< End Semgrep output <<<<<<

This makes Semgrep's output much more readable, expecially when there is a large number of findings, since it respects the intended indentation and wrapping.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Misairuzame commented 7 months ago

@climbertjh2, thank you for looking into this! I tried a few different tests on my setup (Linux, Kubuntu 22.04, Alacritty Terminal 0.13.0-dev, docker 25.0.3) and I can confirm the issue appears to be gone. I'm guessing something was fixed upstream, maybe in semgrep itself. When I submitted this PR semgrep was at version 1.50.0, while now pip installs version 1.60.1. I tried looking through semgrep changelogs and issues, and found this "recent" (Sept. 2023) issue mentioning a problem almost identical to mine: https://github.com/semgrep/semgrep/issues/8608 . In that issue they also mention that setting the COLUMNS env variable fixes the issue, at least in part. Regardless, by using the latest ASH commit the issue seems to be gone and this PR can most likely be closed. Maybe something to keep in mind for the future, if the issue were to be reintroduced. Thank you again for your time!

P.S.: when I did my previous tests, ASH was at commit 2bcd257b9a4dac1590e74b341b9482b53c638441 (and this repo's owner was called aws-samples instead of awslabs!). I tried again with that same commit and the issue does not appear. This makes me think even more that the issue was fixed (or something was changed) upstream.

climbertjh2 commented 7 months ago

@Misairuzame - thanks for checking into this again.

Since neither of us can re-create, I am closing this PR without merging it.

Thank you very much for your investigation and contribution!