Checkmarx / ast-azure-plugin

The CxAST Azure DevOps plugin enables you to trigger SAST, SCA, and KICS scans directly from an Azure DevOps pipeline.
https://marketplace.visualstudio.com/items?itemName=checkmarx.checkmarx-ast-azure-plugin
Apache License 2.0

[BUG] Cancelled jobs do not clean-up temp files #500

Open scotty6435 opened 5 months ago

scotty6435 commented 5 months ago

Describe the bug

When a job is cancelled (in our case because a new commit was added to the PR, triggering another build), the temp files generated by the job are not being cleaned up, causing our agent to fail.

Checkmarx's behaviour of zipping up the .git folders means that the zip generated by one of our codebases is very large (~3GB) so our /tmp/ space has to be enormous to handle this. As we're running inside containers, having persistent volumes or large temporary volumes is very inefficient for the cluster. Our widespread usage of Checkmarx within the company also means that this problem scales out very quickly.

Expected behavior

When a job terminates for any reason (success, failure, cancellation, etc.), the temp files generated by the task should always be cleaned up.
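A minimal sketch of the "clean up on any exit path" pattern described above. This is not the plugin's actual code (the plugin is not written in Python); `scan_workspace`, `JobCancelled`, and the `cx-scan-` prefix are hypothetical names used only for illustration. It assumes the agent cancels the task with a catchable signal such as SIGINT or SIGTERM; a SIGKILL cannot be intercepted, so this would not help in that case.

```python
import contextlib
import shutil
import signal
import tempfile
import threading


class JobCancelled(Exception):
    """Hypothetical exception raised when a termination signal arrives."""


def _raise_cancelled(signum, frame):
    raise JobCancelled(f"received signal {signum}")


@contextlib.contextmanager
def scan_workspace(prefix="cx-scan-"):
    """Yield a temp directory that is removed on success, failure, or cancellation."""
    # Signal handlers can only be installed from the main thread.
    in_main = threading.current_thread() is threading.main_thread()
    previous = None
    if in_main:
        # Turn SIGTERM into an exception so the finally block still runs.
        # SIGINT already raises KeyboardInterrupt by default.
        previous = signal.signal(signal.SIGTERM, _raise_cancelled)
    workdir = tempfile.mkdtemp(prefix=prefix)
    try:
        yield workdir
    finally:
        # Runs whether the body returned, raised, or was interrupted.
        shutil.rmtree(workdir, ignore_errors=True)
        if in_main:
            signal.signal(signal.SIGTERM, previous)
```

Anything written inside the `with scan_workspace() as workdir:` block, such as the source zip, would live under `workdir` and be deleted when the block exits, including when the exit is caused by a cancellation signal.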

Actual behavior

The files remain in place, causing our self-hosted ADO agents to run out of disk space and error.
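Until the task cleans up after itself, one possible agent-side stopgap is a periodic sweep of stale files in the temp folder. The sketch below is an assumption-laden illustration, not something the plugin provides: `sweep_stale_files` is a hypothetical helper, and the file-name prefix must be replaced with whatever the leftover files are actually called on your agent (the names are only visible in the screenshot, so none is assumed here).

```python
import os
import time


def sweep_stale_files(tmp_dir, prefix, max_age_seconds, now=None):
    """Delete regular files in tmp_dir whose name starts with prefix and whose
    last modification is at least max_age_seconds old. Returns bytes freed."""
    now = time.time() if now is None else now
    freed = 0
    # Snapshot the listing first so deletions do not affect iteration.
    with os.scandir(tmp_dir) as it:
        entries = list(it)
    for entry in entries:
        if not entry.is_file(follow_symlinks=False):
            continue
        if not entry.name.startswith(prefix):
            continue
        try:
            info = entry.stat(follow_symlinks=False)
        except OSError:
            continue  # file disappeared between listing and stat
        if now - info.st_mtime < max_age_seconds:
            continue  # too recent; may belong to a running job
        try:
            os.remove(entry.path)
        except OSError:
            continue  # file vanished or is still in use
        freed += info.st_size
    return freed
```

The age threshold is there so that a zip belonging to a scan that is still running is not removed; it should be set comfortably above the longest expected scan duration.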

Steps to reproduce

  1. Create/select a very large SCM repo
  2. Run a Checkmarx scan on an agent with a small /tmp reserve
  3. After the job creates the zip file, cancel the job
  4. Repeat as necessary until the agent fails.

Environment


Logs

Screenshot of the files left in the temp folder. The top is from a cancelled job, the second from a job that succeeded: [image]

2024/04/17 11:14:34 Scan status:  Running
##[error]The Operation will be canceled. The next steps may not contain expected logs.
##[error]The operation was canceled.
Finishing: CheckmarxAST
##[error]Failed to create CoreCLR, HRESULT: 0x80004005
##[error]We stopped hearing from agent ci-ado-agent-dev-43cee. Verify the agent machine is running and has a healthy network connection. Anything that terminates an agent process, starves it for CPU, or blocks its network access can cause this error. For more information, see: https://go.microsoft.com/fwlink/?linkid=846610

github-actions[bot] commented 5 months ago

Internal Jira issue: AST-40077