slsa-framework / slsa

Supply-chain Levels for Software Artifacts
https://slsa.dev

Formulation and Pedigree in the context of SLSA #378

Open lumjjb opened 2 years ago

lumjjb commented 2 years ago

Created based on discussions in https://github.com/slsa-framework/slsa/pull/376#discussion_r861836404 and https://github.com/slsa-framework/slsa/pull/376#discussion_r861843826

Definitions from OWASP page:

Pedigree: Data which describes the lineage for which software has been created or altered. Pedigree includes the ancestors, descendants, and variants which describe component lineage from any viewpoint and the commits, patches, and diffs which make a component unique. Maintaining accurate pedigree information is especially important with open source components whose source code is readily available, modifiable, and redistributable.

Formulation: Formulation describes how components were built, often including build system invocation and properties, SDK and compiler versions, compiler flags, and a comprehensive list of parallel and sequential steps that were taken to build, test, and deliver a component. Formulation and pedigree are complementary concepts and are often combined and referred to simply as pedigree.

Question

How do these map onto the SLSA context? And how, if at all, can SLSA fill any gaps that exist?

MarkLodato commented 2 years ago

You can add "provenance" to the list as well:

Provenance: A component’s provenance refers to the traceability of all authorship, build, release, packaging, and distribution across the entire supply chain. In physical supply chains this is referred to as the chain of custody. Provenance may include individual and community authorship of software components, manufacturers, suppliers, software repositories, and country of origin. For high assurance applications, provenance plays an important role in determining Foreign Ownership, Control, or Influence (FOCI).

What SLSA refers to simply as "provenance" encompasses all three OWASP concepts, at least partially. SLSA provenance is the full information of how a software artifact was produced:

SLSA provenance only covers modifications. It does not track "chain of custody", so to speak, because artifacts are immutably identified by hash. Where the artifact was copied from doesn't matter in SLSA's threat model.
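
A minimal sketch of what identification by hash means in practice, assuming a simplified in-toto-style statement whose `subject` entries carry a `sha256` digest. The function names and sample data are hypothetical, and a real verifier would also check the signature and the builder identity:

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_subject(artifact: bytes, statement: dict) -> bool:
    """Check whether the artifact's digest appears among the statement's subjects.

    Only the content hash is compared; nothing about where the bytes
    were downloaded or copied from is consulted.
    """
    digest = sha256_hex(artifact)
    return any(
        subject.get("digest", {}).get("sha256") == digest
        for subject in statement.get("subject", [])
    )


# Hypothetical artifact and a simplified statement describing it.
artifact = b"example artifact contents"
statement = {
    "subject": [
        {"name": "example.tar.gz", "digest": {"sha256": sha256_hex(artifact)}},
    ],
}

# A byte-identical copy obtained from a different location still matches,
# while a modified artifact does not.
copy_from_mirror = bytes(artifact)
tampered = artifact + b"!"

print(matches_subject(copy_from_mirror, statement))
print(matches_subject(tampered, statement))
```

Because the comparison uses only the digest, a copy fetched from any mirror is treated as the same artifact, which is why the copy location falls outside the threat model described above.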

stevespringett commented 1 year ago

I was recently interviewed by an external vendor at my employer as we are measuring the assurance of our software supply chain. The vendor chose to use SLSA. Since SLSA's definition of provenance does not align with the MITRE, OWASP, or English definition of the word, the interview was unproductive. The reviewer was unable to articulate what SLSA "provenance" meant in specific contexts, specifically how it relates to the globally accepted definitions of provenance, pedigree, and formulation.

Prior to the release of SLSA 1.0, I would highly encourage this group to align the spec's vocabulary with MITRE or OWASP. It will dramatically improve auditor/auditee productivity, and the vocabulary would be consistent with that of supply chain practitioners outside of the software industry.

BTW, to my understanding, the MITRE definition of pedigree includes formulation. In OWASP, formulation is a unique but related concept. This should be clarified by someone at MITRE, perhaps Bob Martin.

dlorenc commented 1 year ago

Here's one more definition set for the mix:

Provenance is the chronology of the origin, development, ownership, location, and changes to a system or system component and associated data. It may also include personnel and processes used to interact with or make modifications to the system, component, or associated data.

The validation of the internal composition and provenance of technologies, products, and services is referred to as the pedigree.

From NIST 800-53.

mlieberman85 commented 1 year ago

There are a few additional definitions NIST has: https://csrc.nist.gov/glossary/term/provenance

Metadata pertaining to the origination or source of specified data.

and

The records describing the possession of, and changes to, components, component processes, information, systems, organization, and organizational processes. Provenance enables all changes to the baselines of components, component processes, information, systems, organizations, and organizational processes, to be reported to specific actors, functions, locales, or activities.