Add extra details in the results (workflow, job, step)

sadreck commented 4 months ago

Is your feature request related to a problem? Please describe.

At the moment, when results are reported, the report only includes the offending sink workflow/action file. For instance, if you run Raven against microsoft/graphrag, one of the results will be:

Name: Unpinnable Action
Severity: low
Description: Unpinnable actions can lead to software supply chain attacks.
Tags: ['supply-chain', 'best-practice']
Workflow URLS:
- https://github.com/pypa/gh-action-pypi-publish/tree/unstable/v1/action.yml

That workflow URL does not belong to microsoft/graphrag, which makes it difficult to answer "what do I need to fix, and where do I find what's calling this?"

Describe the solution you'd like

It would be nice if the following information were displayed as well:

Caller Repo Workflow Url
Job Name
Step Name

For instance, the unpinnable-action query from:

MATCH (ca:CompositeAction)
  WHERE (
      ca.using = "docker" AND (
          NOT ca.image CONTAINS "@sha256:"
      )
  )
  RETURN DISTINCT ca.url AS url;

would become

MATCH (w:Workflow)-[*]->(j:Job)-[*]->(s:Step)-[*]->(ca:CompositeAction)
  WHERE (
      ca.using = "docker" AND (
          NOT ca.image CONTAINS "@sha256:"
      )
  )
  RETURN DISTINCT ca.url AS vulnerable_url, w.path AS workflow_url, j.name AS job, s.name AS step

Additional context

I'm happy to submit a PR for this, but thought I'd raise this issue first, as it would involve a significant refactor of the existing code to accommodate it.

elad-pticha commented 4 months ago

Hey, that's a good idea. We need to think of a nice way to show all those details in Raven's output.

Any suggestions?

sadreck commented 4 months ago

In its current form, Raven groups its findings by vulnerability type. The goal is to have a list of findings that the dev team can work through, and that reflects straight away when an issue has been remediated.

The caller workflow, job, step, and sink will be visible in the report. I was thinking of something in the url@job:step format.
Issues will have to be reported individually per combination. For instance, if one workflow has 2 vulnerabilities, it should appear once per vulnerability in the report. Likewise, if 2 workflows have the same type of vulnerability, each should appear as its own entry. Effectively, the "primary key" will be repo-workflow:job-name:step-name:vulnerability-name.
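
The per-combination reporting described above could be sketched roughly as follows. This is only an illustration, not Raven's code: the `Finding` class, its field names, and `unique_findings` are all hypothetical, and the URLs are the placeholder ones used in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One reported issue. Every field name here is hypothetical."""
    vulnerability: str  # e.g. "Unpinnable Action"
    workflow_url: str   # caller workflow in the scanned repo
    job: str
    step: str
    sink_url: str       # vulnerable workflow/action being called

    def location(self) -> str:
        # The url@job:step format suggested above.
        return f"{self.workflow_url}@{self.job}:{self.step}"

    def key(self) -> tuple:
        # The "primary key": workflow, job, step and vulnerability name.
        return (self.workflow_url, self.job, self.step, self.vulnerability)


def unique_findings(findings):
    """Keep the first finding seen for each primary key, preserving order."""
    seen = {}
    for f in findings:
        seen.setdefault(f.key(), f)
    return list(seen.values())


SINK = "https://github.com/example/vuln-repo/tree/main/action.yml"
WF = "https://github.com/microsoft/example/tree/main/.github/my-workflow.yaml"

findings = unique_findings([
    Finding("Unpinnable Action", WF, "job-name", "step-name", SINK),
    Finding("Unpinnable Action", WF, "job-name", "step-name", SINK),  # duplicate row
    Finding("Unpinnable Action", WF, "another-job-name", "another-step-name", SINK),
])

for f in findings:
    print(f.location())
```

Keying on the full combination means a fixed step disappears from the next report on its own, while other steps that call the same sink remain listed.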

The raw output would look something like this:

Name: Unpinnable Action
Severity: low
Description: Unpinnable actions can lead to software supply chain attacks.
Tags: ['supply-chain', 'best-practice']
Repo Workflow:
- https://github.com/microsoft/example/tree/main/.github/my-workflow.yaml@job-name:step-name
Vulnerable Workflow/Action:
- https://github.com/example/vuln-repo/tree/main/action.yml

Name: Unpinnable Action
Severity: low
Description: Unpinnable actions can lead to software supply chain attacks.
Tags: ['supply-chain', 'best-practice']
Repo Workflow:
- https://github.com/microsoft/example/tree/main/.github/my-workflow.yaml@another-job-name:another-step-name
Vulnerable Workflow/Action:
- https://github.com/example/vuln-repo/tree/main/action.yml

Name: Unpinnable Action
Severity: low
Description: Unpinnable actions can lead to software supply chain attacks.
Tags: ['supply-chain', 'best-practice']
Repo Workflow:
- https://github.com/microsoft/example/tree/main/.github/another-workflow.yaml@job-name:step-name
Vulnerable Workflow/Action:
- https://github.com/example/vuln-repo/tree/main/action.yml

I believe this format will make it easier to identify what needs to be fixed, and will also pave the way for uploading SARIF results to GHAS (GitHub Advanced Security) and tracking vulnerabilities from there.
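
As a rough idea of what the SARIF side might look like, here is a minimal sketch that turns per-combination findings into a SARIF 2.1.0 log. The input dictionary keys, the `to_sarif` helper, the rule-id scheme, and the severity-to-level mapping are all assumptions made for illustration; none of them come from Raven itself.

```python
import json

# Assumed mapping from report severities to SARIF result levels.
LEVELS = {"low": "note", "medium": "warning", "high": "error"}


def to_sarif(findings):
    """Build a SARIF 2.1.0 log from a list of finding dicts (hypothetical shape)."""
    rules = {}
    results = []
    for f in findings:
        # Hypothetical rule id derived from the finding name.
        rule_id = f["name"].lower().replace(" ", "-")
        rules.setdefault(rule_id, {
            "id": rule_id,
            "shortDescription": {"text": f["name"]},
            "fullDescription": {"text": f["description"]},
        })
        results.append({
            "ruleId": rule_id,
            "level": LEVELS.get(f["severity"], "warning"),
            "message": {
                "text": f"{f['name']} in job '{f['job']}', step '{f['step']}' "
                        f"(calls {f['sink_url']})"
            },
            "locations": [{
                "physicalLocation": {
                    # Path of the caller workflow, relative to the repo root.
                    "artifactLocation": {"uri": f["workflow_path"]},
                }
            }],
        })
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {"name": "Raven", "rules": list(rules.values())}},
            "results": results,
        }],
    }


report = to_sarif([{
    "name": "Unpinnable Action",
    "severity": "low",
    "description": "Unpinnable actions can lead to software supply chain attacks.",
    "workflow_path": ".github/my-workflow.yaml",
    "job": "job-name",
    "step": "step-name",
    "sink_url": "https://github.com/example/vuln-repo/tree/main/action.yml",
}])
print(json.dumps(report, indent=2))
```

Each result points at the caller workflow rather than the sink, since that is the file the repo owner can actually change. A real upload would probably also want a line region for the step, which this sketch leaves out.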

CycodeLabs / raven

Add extra details in the results (workflow, job, step) #188