Events for artifact repositories

afrittoli commented 1 year ago

Today CDEvents only supports one event, artifact.published, which is meant to be produced by artifact repositories.

Repositories usually support more kinds of events, such as artifact.pulled, artifact.deleted and artifact.scanned; see the Harbor docs for an example.

We should consider extending the data model to include such events.

Design doc: https://hackmd.io/AfT-5D3JQZynKk5yDyAWeA

afrittoli commented 1 year ago

More details about the Harbor case: https://github.com/goharbor/community/pull/229

mekhanique commented 1 year ago

Suggestion: add event for vulnerability found and artifact quarantined. It should support CVE URL, SBOM URL, VeX URL, Vulnerability level (severe, etc)

afrittoli commented 10 months ago

Suggestion: add event for vulnerability found and artifact quarantined. It should support CVE URL, SBOM URL, VeX URL, Vulnerability level (severe, etc)

Thank you for the idea. Are you suggesting that we should model a vulnerability as a subject with its own predicates, so that such events could be produced by public registries of software vulnerabilities such as CVE? It could be an improvement compared to https://www.cve.org/ResourcesSupport/FAQs#pc_cve_list_basicscve_list_data_feeds

In terms of events produced by artifact registries (or build systems or test systems), I was thinking of adding an artifact.scanned event, which would include the result of the scan. For the SBOM URI, I have a PR up already that adds that field to artifact.packaged and artifact.published, although it could be part of an artifact.scanned event too.
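To make the artifact.scanned idea concrete, here is a minimal sketch of what such an event could carry. Everything in it is an assumption for illustration: the field names (`scanType`, `findings`, `sbomUri`), the version suffix in the type string, and the helper `make_artifact_scanned` are hypothetical and are not part of the CDEvents spec or any SDK. The event is built as a plain dict, loosely following a context/subject layout.

```python
import json
import uuid
from datetime import datetime, timezone


def make_artifact_scanned(artifact_id, source, scan_type, findings, sbom_uri=None):
    """Build a hypothetical artifact.scanned event as a plain dict."""
    content = {
        "scanType": scan_type,  # hypothetical field: kind of scan performed
        "findings": findings,   # hypothetical field: the scan result
    }
    if sbom_uri is not None:
        content["sbomUri"] = sbom_uri  # hypothetical field name for the SBOM URI
    return {
        "context": {
            "id": str(uuid.uuid4()),
            "source": source,
            # Hypothetical type string; no such event exists in the spec today.
            "type": "dev.cdevents.artifact.scanned.0.1.0",
            "timestamp": datetime.now(timezone.utc).isoformat(),
        },
        "subject": {"id": artifact_id, "type": "artifact", "content": content},
    }


if __name__ == "__main__":
    event = make_artifact_scanned(
        "pkg:oci/app@sha256:abc",   # placeholder artifact id
        "/example/registry",        # placeholder source
        "sca",
        [{"id": "CVE-0000-0000", "severity": "high"}],  # placeholder finding
        sbom_uri="https://example.com/sbom.json",
    )
    print(json.dumps(event, indent=2))
```

Keeping the SBOM URI optional mirrors the point above that it could live on artifact.packaged or artifact.published instead.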

xbcsmith commented 10 months ago

From the CDEvents Workgroup:

We discussed what scanning an artifact means and how it should be represented in events. It must be possible to express different scan types.

Do we need a moniker for the type of scan performed?
- Examples:
- artifact.scanned.sast
- artifact.scanned.sca
- artifact.scanned.oci
- artifact.scanned.dast
Do we separate our "source code" scans from "binary" scans? For example, SAST and SCA scans are typically source-code-only scans, whereas an OCI scan is a scan of the static container and a DAST scan is a scan of a running application in the container.
Use cases for scanning of artifacts:
- Source code scans : SAST and SCA
- The SCA scans are also used in compliance efforts. For example we use the SCA scans of the Source code to generate SBOMs (CycloneDX v1.4) for the customers. We also use the SCA scans to provide a Third Party Licensing report for the open source and proprietary third party software we ship with our products.
- Binary scans (source code built into binary): OCI (container scans), rpm scans, deb scans, etc. For example, an OCI scan looks at the external binaries pulled into the OCI container that are not part of the source code, checking them for version, license, and vulnerabilities.
- Application Scans : API security scans, DAST scans, and possibly Penetration testing results.
Scanning is done, for example, to find CVEs. 'binary.analyzed' might be an option, to also cater for other types of binary analyses.
Also source code, Docker base images and deployments are scanned at SAS. Would it be better to have a 'scan.performed' event instead, to not limit to binaries only?
At SAS we have events for Scanned and events for the Audit of the scan.
How do we link scan types to artifact types to preserve the chain of events in an artifact's lifecycle?
The links proposal should help with declaring which artifacts have passed a certain scanning step, etc.
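As a sketch of the moniker option discussed above, the snippet below parses names such as artifact.scanned.sast and maps each scan type to the kind of target it examines (source, binary, or application), following the grouping in the use cases. The mapping table and the function name `parse_scan_event` are assumptions for illustration; none of this is defined by the spec.

```python
# Hypothetical mapping from scan-type moniker to the kind of target scanned,
# following the source / binary / application grouping in the notes above.
SCAN_TARGETS = {
    "sast": "source",
    "sca": "source",
    "oci": "binary",
    "dast": "application",
}


def parse_scan_event(name):
    """Split 'artifact.scanned.<type>' into (scan_type, target_kind)."""
    parts = name.split(".")
    if len(parts) != 3 or parts[:2] != ["artifact", "scanned"]:
        raise ValueError(f"not an artifact.scanned.<type> name: {name!r}")
    scan_type = parts[2]
    if scan_type not in SCAN_TARGETS:
        raise ValueError(f"unknown scan type: {scan_type!r}")
    return scan_type, SCAN_TARGETS[scan_type]


if __name__ == "__main__":
    for name in ("artifact.scanned.sast", "artifact.scanned.oci", "artifact.scanned.dast"):
        print(name, "->", parse_scan_event(name))
```

One trade-off this makes visible: encoding the scan type in the event name means every new scan type needs a new event type, whereas a scan-type field inside a single artifact.scanned event would not.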

xbcsmith commented 10 months ago

I realize these may be supported elsewhere in the spec. Please disregard (or annotate) duplication.

Possible list of event types to support for artifacts:

Build (OCI, RPM, DEB, etc..)
- Source code has been built into a binary
Scan (SAST, SCA, binary, DAST)
- Source code has been scanned
- Binary has been scanned
- Application has been scanned
Audit (CVEs, CVSS Scores)
- Scan has been audited
Compliance (SBOM, VeX)
- Documents have been created
Test (Xunit)
- Tests have been completed
Publish (OCI Registry, RepoMD, Apt, Maven...)
- Artifact has been published

mekhanique commented 10 months ago

Apologies for the delayed response @afrittoli, I had some emergency health issues in the family that required my attention. Also, I may have missed some of these things in the spec already, so apologies if I re-hash covered ground.

Are you suggesting that we should model a vulnerability as a subject with its own predicates, so that such events could be produced by public registries of software vulnerabilities such as CVE?

Hadn't thought about it like that, but now that you mention it, yes -- I think that would be a great idea.

A moniker for the test type performed would definitely be worthwhile, as it can be used in a promotion pipeline, which would, IMO, necessitate the creation of an artifact.promoted event. This would be useful for overall pipeline automation.

I'd also suggest that the Test (Xunit) event type identified by @xbcsmith should be expanded to include additional categories such as the following:

Unit
Fuzz Test
Smoke / Build Verification Test
Integration Test
Functional Test
Regression Test
API Test
Penetration Test
Performance / Stress Load Test
Endurance Load Test
Chaos / Resiliency Test

Obviously not all test types apply to all artifacts. There may be others that should be here as well. I could see a fan-out/fan-in pipeline structure that supported different levels of potential artifact promotion based on success or failure of various test type completion results.
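A rough sketch of the fan-in step, assuming each test type reports a simple pass/fail result. The promotion levels, the gate definitions in `PROMOTION_GATES`, and the function `highest_promotion` are hypothetical; a real pipeline would define its own gates per artifact type.

```python
# Hypothetical gates: the test types that must pass before an artifact
# can be promoted to each level. Levels are listed from lowest to highest.
PROMOTION_LEVELS = ("staging", "production")
PROMOTION_GATES = {
    "staging": {"unit", "smoke", "integration"},
    "production": {"unit", "smoke", "integration", "regression", "performance"},
}


def highest_promotion(results):
    """Return the highest level whose required tests all passed, else None.

    `results` maps a test type to True (passed) or False (failed).
    Test types that were not run are treated as not passed.
    """
    passed = {test_type for test_type, ok in results.items() if ok}
    level = None
    for name in PROMOTION_LEVELS:
        if PROMOTION_GATES[name] <= passed:  # all required tests passed
            level = name
    return level


if __name__ == "__main__":
    fan_in = {"unit": True, "smoke": True, "integration": True, "regression": False}
    print(highest_promotion(fan_in))
```

The returned level is the kind of value an artifact.promoted event could carry.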

I'd also suggest that Compliance have the ability to indicate that a VeX has been updated, rather than just created, due to a later CVE finding. For example, continuous dependency monitoring systems like OWASP's Dependency Track can identify such changes and could potentially create such events in the longer term. This is also where a CVE subject could inform other systems of the increased vulnerability rating for running or released artifacts, possibly extending into the Cloud Events space. This would also seem to suggest, at least to me, that we should probably include a scan audit date to help ensure we know when each event occurs.

Spitballing -- What y'all think about having/creating a vulnerability level associated with an artifact or process / service running (possibly bubbling up to a Cloud Event)? If it's tied to the highest level of vulnerability found in scans in some manner then an update of said level could help initiate other actions.
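A minimal sketch of that idea, assuming every finding carries a severity drawn from an ordered scale. The scale in `SEVERITY_ORDER`, the finding shape, and both function names are assumptions for illustration only.

```python
# Assumed severity scale, ordered from lowest to highest.
SEVERITY_ORDER = ["none", "low", "medium", "high", "critical"]


def vulnerability_level(findings):
    """Return the highest severity among the findings, or "none" if empty.

    Raises ValueError if a finding uses a severity outside SEVERITY_ORDER.
    """
    return max(
        (finding["severity"] for finding in findings),
        key=SEVERITY_ORDER.index,
        default="none",
    )


def level_update(previous, findings):
    """Return a change record if the level differs from `previous`, else None."""
    current = vulnerability_level(findings)
    if current == previous:
        return None
    return {"previous": previous, "current": current}


if __name__ == "__main__":
    scan = [{"severity": "low"}, {"severity": "high"}, {"severity": "medium"}]
    print(vulnerability_level(scan))
    print(level_update("low", scan))
```

A non-None result from `level_update` is the point where an update event could be emitted to trigger other actions.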

Thanks for listening! I hope the information is helpful. Looking forward to hearing y'alls thoughts.

mekhanique commented 10 months ago

Separately, the artifact pulled event matches, I think, what I described as artifact.quarantined. The difference, at least to me, is that quarantined makes clear that the artifact exists but is not usable. Pulled, IMO, leaves its status a bit more murky and is potentially confusing as to its use. You wouldn't want to simply remove an artifact that has a security, legal or supportability issue; you would want to keep it to ensure it's not accidentally "re-deployed". By keeping the artifact namespace in place but making the artifact itself non-viable (i.e. quarantined), accidental issues are much less likely to occur (unless your artifact repository allows overwrites 🙅). Does that distinction make sense @afrittoli?
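Here is a small sketch of the quarantine semantics described above: the name stays reserved, the content is kept, but the artifact can no longer be retrieved or overwritten. The `ArtifactStore` class, its methods, and the returned event dict are hypothetical and do not reflect any real registry API.

```python
class ArtifactStore:
    """Toy in-memory repository illustrating quarantine versus deletion."""

    def __init__(self):
        # name -> {"data": bytes, "state": "published" or "quarantined"}
        self._artifacts = {}

    def publish(self, name, data):
        # No overwrites: a quarantined name stays reserved.
        if name in self._artifacts:
            raise ValueError(f"{name} already exists and cannot be overwritten")
        self._artifacts[name] = {"data": data, "state": "published"}

    def quarantine(self, name, reason):
        """Mark the artifact unusable but keep it; return a hypothetical event."""
        self._artifacts[name]["state"] = "quarantined"
        return {"type": "artifact.quarantined", "id": name, "reason": reason}

    def retrieve(self, name):
        entry = self._artifacts[name]
        if entry["state"] == "quarantined":
            raise PermissionError(f"{name} is quarantined")
        return entry["data"]


if __name__ == "__main__":
    store = ArtifactStore()
    store.publish("app:1.0", b"binary")
    print(store.quarantine("app:1.0", "security issue"))
```

After quarantine, both `retrieve("app:1.0")` and a second `publish("app:1.0", ...)` fail, which is the property that prevents an accidental re-deploy.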

e-backmark-ericsson commented 10 months ago

I have a feeling that this issue is taking off in many different directions, with many good and interesting proposals in it. We should probably streamline this discussion into multiple dedicated issues.

When it comes to artifact.pulled, it seems I got the wrong impression of that term. I thought it meant that some system had downloaded the artifact. But with @mekhanique's comment above, it seems the intention is to notify that an artifact should not be used anymore. That makes total sense to me, and then I believe that the term artifact.quarantined is easier to understand.

To add to the web of thoughts: a long time ago we discussed events declaring a confidence label (https://github.com/cdfoundation/sig-events/discussions/37), and we've also used the term maturity level in some chats. The ideas in this ticket about a vulnerability level seem related to that. My thought on a confidence label is that the same event could be used to declare the confidence for any kind of item in a CI/CD pipeline, be it a source change, an artifact, a deployment or something else. And with the soon-to-be-delivered connecting events (#139) we should have a good way of relating such a confidence label (or whatever the term will be) to such items.

mekhanique commented 10 months ago

@afrittoli corrected my understanding of artifact.pulled to mean that the artifact is downloaded (see https://github.com/cdevents/spec/issues/144#issuecomment-1841265062). I suggest artifact.downloaded rather than artifact.pulled, and that we keep artifact.quarantined for the purpose I described above. More details on how artifact.quarantined can work are in the above link for #144.

cdevents / spec

Events for artifact repositories #143