Dependency Check Azure DevOps Extension

Question regarding easy caching approach #142

Open cyberblast opened 7 months ago

cyberblast commented 7 months ago

Hi,

Obviously, downloading the whole NVD database on every pipeline run is a bad idea, so I thought about how to improve that without requiring too much effort or extra hosting costs.

Then I came across the --data CLI argument. Using it, we could easily use the Azure DevOps Cache task to cache and restore the data.

But I'm wondering whether that is a valid approach, since the description of the argument says: "This option should generally not be set." Also, that approach isn't suggested anywhere.

Anyway, I started to implement it as shown below, but unfortunately I'm currently unable to test it due to issues on the NVD API side (HTTP 503).

Any idea whether this should work, or whether there is a reason it shouldn't be done like this?

steps:
- task: Cache@2
  displayName: ODC NVD Database Cache
  inputs:
    key: 'ODCNVD | "$(Agent.OS)"'
    path: $(Pipeline.Workspace)/odc/data

- task: dependency-check-build-task@6
  displayName: 'OWASP Dependency Check'
  continueOnError: ${{ parameters.warningOnly }}
  inputs:
    projectName: ${{ parameters.projectName }}
    scanPath: ${{ parameters.scanPath }}
    format: ${{ parameters.format }}
    enableVerbose: ${{ parameters.verbose }}
    failOnCVSS: ${{ parameters.cvssThreshold }}
    warnOnCVSSViolation: ${{ parameters.warningOnly }}
    additionalArguments: --nvdApiKey <secret> --data $(Pipeline.Workspace)/odc/data ${{ parameters.additionalArguments }}

jeremylong commented 7 months ago

Yes, several people use this option. They build a database on one node, save the database, and then copy the DB to any other node that is running ODC and simply use the --noupdate option. I suppose the docs should be updated for this use case.
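Sketched as plain CLI calls (a rough sketch using only flags that appear elsewhere in this thread; the paths, project name, and environment variable are placeholders), the pattern looks like this:

# 1) On the node that builds the database: refresh the local NVD copy only, no scan
dependency-check.sh --updateonly --nvdApiKey "$NVD_API_KEY" --data /opt/odc-data

# 2) Copy /opt/odc-data to every node that runs ODC, then scan without updating
dependency-check.sh --noupdate --data /opt/odc-data --project "MyProject" --scan ./src --format HTML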

cyberblast commented 7 months ago

great 😄 thank you for the confirmation

pippolino commented 7 months ago

In @cyberblast's example, however, --noupdate is not used, so it always downloads the data; and if --noupdate were used, I wouldn't be sure whether the updated DB actually ends up in the cache. There should be a daily-triggered pipeline just for updating the cache.

cyberblast commented 7 months ago

I added --nvdValidForHours with a high value instead. I agree, a dedicated update pipeline may be more reliable for concurrency reasons, but in general I guess it should already work.
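Concretely, that just means extending additionalArguments in the Cache-based example above, along these lines (the 168 hours is only an illustrative "high value"; the other task inputs stay as before):

- task: dependency-check-build-task@6
  displayName: 'OWASP Dependency Check'
  inputs:
    # ...projectName, scanPath, format etc. as in the first example...
    additionalArguments: >-
      --nvdApiKey <secret>
      --data $(Pipeline.Workspace)/odc/data
      --nvdValidForHours 168
      ${{ parameters.additionalArguments }}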

cyberblast commented 7 months ago

It seems I forgot about the Cache task's hard-wired and strict scoping, which makes sharing a cache between pipelines impossible 😞 I'll look into setting up a dedicated NVD DB update pipeline instead...

cyberblast commented 7 months ago

Hi, I'm not 100% sure how much sense this makes right now, as ODC is down due to the NVD issues, but I'd like to share an implementation approach that runs a dedicated pipeline to update the NVD database and makes it available to other pipelines for ODC execution.

Anyway, I believe it may be a good starting point for anybody implementing something similar.

NVD Update Pipe

parameters:
- name: purge
  displayName: Purge Database
  type: boolean
  default: false
- name: verbose
  displayName: Verbose
  type: boolean
  default: false
- name: nvdValidForHours
  displayName: NVD valid for hours
  type: number
  default: 23
- name: additionalArguments
  displayName: Additional arguments
  type: string
  default: ' '
- name: nvdApiKey
  displayName: NVD API key
  type: string

trigger: none
schedules:
- cron: '0 0 * * *'
  displayName: Daily midnight run
  branches:
    include:
    - master

stages:
- stage: update_odc_nvd
  displayName: nvd.nist.gov
  pool:
    name: Azure Pipelines
    vmImage: ubuntu-latest
  jobs:
  - job: build
    workspace:
      clean: outputs
    displayName: Update NIST NVD
    steps:
    - checkout: none
    - task: Cache@2
      displayName: ODC Cache
      inputs:
        key: 'ODC | "$(Agent.OS)"'
        path: $(Pipeline.Workspace)/odc/app
    - task: Cache@2
      displayName: NVD Cache
      inputs:
        key: 'NVD | "$(Agent.OS)"'
        path: $(Pipeline.Workspace)/odc/data
    - bash: |
        set -x # echo on
        VERSION=$(curl -s https://jeremylong.github.io/DependencyCheck/current.txt)

        if [ ! -d "$(Pipeline.Workspace)/odc/app/$VERSION" ]; then
          rm -rf $(Pipeline.Workspace)/odc/app/*
          mkdir -p $(Pipeline.Workspace)/odc/app/$VERSION
          curl -Ls "https://github.com/jeremylong/DependencyCheck/releases/download/v$VERSION/dependency-check-$VERSION-release.zip" --output dependency-check.zip
          unzip -uq ./dependency-check.zip -d $(Pipeline.Workspace)/odc/app/$VERSION
        fi

        $(Pipeline.Workspace)/odc/app/$VERSION/dependency-check/bin/dependency-check.sh --updateonly --nvdApiKey ${{ parameters.nvdApiKey }} --data $(Pipeline.Workspace)/odc/data --nvdValidForHours ${{ parameters.nvdValidForHours }} $PURGE ${{ parameters.additionalArguments }}
      displayName: Update NVD
      env:
        ${{ if eq( parameters.purge, true ) }}:
          PURGE: '--purge'
        ${{ else }}:
          PURGE: ''
    - task: ArchiveFiles@2
      displayName: Compress NVD Artifact
      inputs:
        rootFolderOrFile: '$(Pipeline.Workspace)/odc/data'
        includeRootFolder: false
        archiveFile: '$(Build.ArtifactStagingDirectory)/NVD.zip'
    - task: PublishPipelineArtifact@1
      displayName: Publish NVD Artifact
      inputs:
        targetPath: '$(Build.ArtifactStagingDirectory)/NVD.zip'
        artifact: 'NVD'
        publishLocation: 'pipeline'

Please be aware that I'm still not 100% sure whether this code works well, as the NVD DB is currently unavailable. Also, your API key gets exposed in the logs (it is passed in as a template parameter, and the bash step runs with set -x, so the full command line is echoed).
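One way to keep the key out of the job log (an untested sketch; it assumes the key lives in a secret pipeline variable named nvdApiKey instead of being passed as a template parameter) would be to map it into the step via env: and reference the environment variable, so the key never appears in the templated script text and the secret value should additionally be masked in the log:

- bash: |
    set -x # echo on; the secret value should show up masked (***) in the log
    VERSION=$(curl -s https://jeremylong.github.io/DependencyCheck/current.txt)
    $(Pipeline.Workspace)/odc/app/$VERSION/dependency-check/bin/dependency-check.sh --updateonly --nvdApiKey "$NVD_API_KEY" --data $(Pipeline.Workspace)/odc/data --nvdValidForHours ${{ parameters.nvdValidForHours }} $PURGE ${{ parameters.additionalArguments }}
  displayName: Update NVD
  env:
    # nvdApiKey is an assumed secret pipeline variable holding the NVD API key
    NVD_API_KEY: $(nvdApiKey)
    ${{ if eq( parameters.purge, true ) }}:
      PURGE: '--purge'
    ${{ else }}:
      PURGE: ''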

Also, this code doesn't use the Azure DevOps Marketplace task dependency-check-build-task@6. I started with it, but extra arguments have to be added anyway, and other mandatory inputs of the task aren't needed at all for this use case, so I eventually decided to drop it.

pippolino commented 7 months ago

Ciao @cyberblast, the pipeline looks correct to me, but then you need to download the artifact in every pipeline. Why don't you use a database directly for storage, as mentioned here? I'm trying to do that right now: I use a dedicated pipeline that updates the NVD data with the Maven plugin, and then everything is ready in all client pipelines.

cyberblast commented 7 months ago

Hi @pippolino, thank you for the suggestion. Yes, that sounds like a reasonable idea. However, it also means additional infrastructure setup and maintenance. Having an up-to-date pipeline artifact managed completely within Azure DevOps pipelines is much easier in our specific setup, at least for now.

cyberblast commented 7 months ago

Hi, I just wanted to give short feedback that the above code now works very well, since the issue with querying the NVD API has been resolved. Maybe it helps someone set this up.

For completeness, I'm also pasting the consumer code that executes ODC. It's a task template.

parameters:
- name: verbose
  type: boolean
  default: false
- name: projectName
  type: string
  default: 'OWASP'
- name: scanPath
  type: string
  default: './'
- name: warningOnly
  type: boolean
  default: false
- name: additionalArguments
  type: string
  default: ''
- name: cvssThreshold
  type: number
  default: '4'
- name: format
  type: string
  default: 'HTML, JUNIT, JSON'
- name: publishTestResults
  type: boolean
  default: true
- name: NistNvdTeamProject
  type: string
  default: '<Name of DevOps Project>'
- name: NistNvdPipeId
  type: string
  default: '<Name of NVD DB Pipe>'
- name: NistNvdPipeBranch
  type: string
  default: 'refs/heads/master'
- name: NistNvdArtifactName
  type: string
  default: 'NVD'
- name: NistNvdFileName
  type: string
  default: 'NVD.zip'

steps:
- task: DownloadPipelineArtifact@2
  displayName: Download NVD Artifact
  continueOnError: ${{ parameters.warningOnly }}
  inputs:
    source: specific
    project: ${{ parameters.NistNvdTeamProject }}
    pipeline: ${{ parameters.NistNvdPipeId }}
    runVersion: latestFromBranch
    runBranch: ${{ parameters.NistNvdPipeBranch }}
    artifact: ${{ parameters.NistNvdArtifactName }}
    path: '$(Pipeline.Workspace)/odc'
- task: ExtractFiles@1
  displayName: Unpack NVD
  continueOnError: ${{ parameters.warningOnly }}
  inputs:
    archiveFilePatterns: '$(Pipeline.Workspace)/odc/${{ parameters.NistNvdFileName }}'
    destinationFolder: '$(Pipeline.Workspace)/odc/data'
    overwriteExistingFiles: true 
- task: dependency-check-build-task@6
  displayName: 'OWASP Dependency Check'
  continueOnError: ${{ parameters.warningOnly }}
  condition: succeeded()
  inputs:
    projectName: ${{ parameters.projectName }}
    scanPath: ${{ parameters.scanPath }}
    format: ${{ parameters.format }}
    enableVerbose: ${{ parameters.verbose }}
    failOnCVSS: ${{ parameters.cvssThreshold }}
    warnOnCVSSViolation: ${{ parameters.warningOnly }}
    additionalArguments: --noupdate --data $(Pipeline.Workspace)/odc/data ${{ parameters.additionalArguments }}
- ${{ if eq(parameters.publishTestResults, true) }}:
  - task: PublishTestResults@2
    displayName: 'Publish ODC results'
    continueOnError: ${{ parameters.warningOnly }}
    condition: succeededOrFailed()
    inputs:
      testResultsFormat: 'JUnit'
      searchFolder: $(Common.TestResultsDirectory)
      testResultsFiles: 'dependency-check/*junit.xml'
      failTaskOnFailedTests: ${{ not(parameters.warningOnly) }}

To use the template in a pipeline, it can be done like this (here with the template in the same repo, for a C# project):

- template: ../task/test-owasp-dependencies.yml
  parameters:
    scanPath: '**/*.csproj'
    warningOnly: true

and for npm (e.g. react):

- template: ../task/test-owasp-dependencies.yml
  parameters:
    scanPath: '**/yarn.lock'
    additionalArguments: '--scan "$(Build.SourcesDirectory)/**/package.json" --scan "$(Build.SourcesDirectory)/**/node_modules" --disableYarnAudit --nodeAuditSkipDevDependencies --nodePackageSkipDevDependencies'
    warningOnly: true

Please note that we disable Yarn Audit (--disableYarnAudit) here only because we are using Yarn Berry (v4), which currently doesn't seem to work well with ODC. Most likely you can/should remove that flag...

thisjustin816 commented 7 months ago

I also ended up using a similar approach after all of the issues. One suggestion for your pipeline: you don't need separate tasks to archive and unarchive the files. Azure Pipeline Artifacts already does all of that and has upload/download optimizations that skip redundant files.
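For example, the ArchiveFiles/PublishPipelineArtifact pair in the update pipeline and the DownloadPipelineArtifact/ExtractFiles pair in the consumer could roughly be replaced like this (a sketch; "NvdUpdate" is a hypothetical pipeline resource alias):

# Update pipeline: publish the data directory without zipping it first
- publish: $(Pipeline.Workspace)/odc/data
  artifact: NVD

# Consuming pipeline (with the update pipeline declared as a resource,
# similar to the snippet further down): download it straight back
- download: NvdUpdate
  artifact: NVD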

Here's my pipeline that caches the data files. It runs every 4 hours to always have the latest NVD data while following their recommended best practice for frequency. The nvd and oss variables are stored as secret pipeline variables.

appendCommitMessageToRunName: false

trigger:
  batch: true
  branches:
    include:
    - '*'
  paths:
    include:
    - OwaspResourceDownload.yml

schedules:
- cron: '0 0,4,8,12,16,20 * * *'
  displayName: 'Q.4H Update'
  branches:
    include:
    - main
  always: true

variables:
  dependencyCheckVersion: latest

pool:
  vmImage: 'windows-latest'

stages:
- stage: update
  displayName: Update OWASP Dependency Check Data
  jobs:
  - job: update
    displayName: Update OWASP Dependency Check Data
    steps:
    - checkout: none

    - task: PowerShell@2
      displayName: Update Build Name
      inputs:
        targetType: 'inline'
        script: |
          # OWASP Dependency Check Version
          $latestOnlineVersion = Invoke-RestMethod -Uri 'https://jeremylong.github.io/DependencyCheck/current.txt'
          $odcVersion = if ($env:dependencyCheckVersion -eq 'latest' -and $latestOnlineVersion) {
              $latestOnlineVersion
          }
          else {
              $env:dependencyCheckVersion
          }
          Write-Host -Object "Dependency Check Version: $odcVersion"

          # NVD Last Change
          $headers = @{
              'Accept' = 'application/json'
              'apiKey' = $env:nvdApiKey
          }

          $startDate = ( Get-Date ).ToUniversalTime().AddHours(-4).ToString('o')
          $endDate = ( Get-Date ).ToUniversalTime().ToString('o') 

          $uri = "https://services.nvd.nist.gov/rest/json/cvehistory/2.0/?changeStartDate=$startDate&changeEndDate=$endDate"
          try {
              $lastChange = Invoke-RestMethod -Uri $uri -Headers $headers -ErrorAction Stop |
                  Select-Object -ExpandProperty cveChanges |
                  Select-Object -Last 1
              $nvcLastChangeTime = $lastChange.change.created | Get-Date -Format 'yyyyMMdd.HHmm'
          }
          catch {
              Write-Warning -Message "##[warning] Failed to get NVD Last Change: $($_.Exception.Message)"
              $nvcLastChangeTime = $endDate | Get-Date -Format 'yyyyMMdd.HHmm'
          }
          Write-Host -Object "NVD Last Change: $nvcLastChangeTime"

          Write-Host -Object "##vso[task.setvariable variable=nvcLastChangeTime;]$nvcLastChangeTime"
          Write-Host -Object "##vso[Build.UpdateBuildNumber]ODC-$($odcVersion)_NVD-$($nvcLastChangeTime)"

    - task: Cache@2
      inputs:
        key: 'owasp-dependency-check | data | "$(nvcLastChangeTime)"'
        path: '$(Pipeline.Workspace)/owasp-dependency-check-data'
        restoreKeys: 'owasp-dependency-check | data'

    - task: dependency-check-build-task@6
      displayName: OWASP Dependency Check
      retryCountOnTaskFailure: 1
      inputs:
        dependencyCheckVersion: $(dependencyCheckVersion)
        projectName: 'Update'
        scanPath: '$(Pipeline.Workspace)'
        additionalArguments: >
          --nvdApiKey $(nvdApiKey)
          --nvdApiDelay 6000
          --data "$(Pipeline.Workspace)/owasp-dependency-check-data"
          --ossIndexUsername $(ossIndexUsername)
          --ossIndexPassword $(ossIndexPassword)
          --updateonly

    - publish: $(Pipeline.Workspace)/owasp-dependency-check-data
      artifact: owasp-dependency-check-data

I then consume it with the following tasks (can't include the whole pipeline for IP reasons):

Declare the above pipeline as a resource:

resources:
  pipelines:
  - pipeline: OWASPResources
    source: OWASP Resource Download
    branch: main

I use variables for the CVSS score and the ODC version:

variables:
  failOnCVSS: 7 # More info -> https://www.recordedfuture.com/cvss-scores-guide/
  dependencyCheckVersion: latest

And the steps, using the --data parameter to point at the downloaded resource artifact, plus --noupdate:

          steps:
          - download: OWASPResources
            artifact: owasp-dependency-check-data
            displayName: Download OWASP Dependency Check Data

          - task: dependency-check-build-task@6
            displayName: OWASP Dependency Check
            inputs:
              dependencyCheckVersion: $(dependencyCheckVersion)
              projectName: '${{ parameters.release }}'
              scanPath: '$(Pipeline.Workspace)/${{ parameters.release }}Artifact/${{ coalesce(parameters.artifactName, parameters.product, ''drop'') }}'
              format: 'HTML, JUNIT'
              failOnCVSS: '$(failOnCVSS)'
              suppressionPath: '$(Pipeline.Workspace)\owasp-suppression.xml'
              enableExperimental: ${{ parameters.enableExperimental }}
              additionalArguments: >
                --data "$(Pipeline.Workspace)/OWASPResources/owasp-dependency-check-data"
                --noupdate

omgdota123 commented 7 months ago

Hello @cyberblast.

After downloading the NVD.zip, I'm trying to run an ODC scan with the Maven plugin by providing -DautoUpdate=false and -DdataDirectory=$(Pipeline.Workspace)/odc/data. But it keeps returning: NoDataException: Autoupdate is disabled and the database does not exist.

I also tried extracting the NVD zip to $(Pipeline.Workspace)/.m2/repository/org/owasp/dependency-check-data/9.0.2, but it's still not working.

If by any chance you have an idea ;)

cyberblast commented 7 months ago

Hi,

@thisjustin816 thanks for sharing. It contains some interesting aspects, though some of them probably depend on the usage scenario/environment. I'll also read up again on the artifact topic; I wasn't aware of those optimizations.

@omgdota123 you need to extract it to the data directory $(Pipeline.Workspace)/odc/data, as described here.
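Something along these lines (just a sketch, assuming NVD.zip was already downloaded to $(Pipeline.Workspace)/odc as in the template above; the -D flags are the ones from your comment):

- task: ExtractFiles@1
  displayName: Unpack NVD
  inputs:
    archiveFilePatterns: '$(Pipeline.Workspace)/odc/NVD.zip'
    destinationFolder: '$(Pipeline.Workspace)/odc/data'
    overwriteExistingFiles: true
- script: >
    mvn org.owasp:dependency-check-maven:check
    -DautoUpdate=false
    -DdataDirectory=$(Pipeline.Workspace)/odc/data
  displayName: ODC Scan (Maven)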