guacsec / guac

GUAC aggregates software security metadata into a high fidelity graph database.
https://guac.sh
Apache License 2.0
1.26k stars 170 forks source link

[feature] Add proper DependencyType to isDependency node #1098

Open pxp928 opened 1 year ago

pxp928 commented 1 year ago

Is your feature request related to a problem? Please describe.

Currently, all the dependencies nodes are set to unknown for both SPDX and CDX. This makes it impossible to determine if the dependency is Direct or Indirect. If the SBOM tools generate an SBOM that provides this information accurately, the SPDX and CDX parsers need to be updated to capture this.

For SPDX we are only attaching every dependency to the top-level package node if it was determined via heuristics and does not have a proper relationship: https://github.com/guacsec/guac/blob/main/pkg/ingestor/parser/spdx/parse_spdx.go#L218-L222

For CDX we should do something similar also: https://github.com/guacsec/guac/blob/main/pkg/ingestor/parser/cyclonedx/parser_cyclonedx.go#L209-L216

Describe the solution you'd like Update the helper functions here: https://github.com/guacsec/guac/blob/main/pkg/ingestor/parser/common/helpers.go to provide the type of dependency.

And also update the topLevel dependency mapping for CDX to be similar to SPDX: https://github.com/guacsec/guac/blob/main/pkg/ingestor/parser/cyclonedx/parser_cyclonedx.go#L209-L216

More information about how cycloneDX handles this can be found on this issue: https://github.com/CycloneDX/specification/issues/33

lumjjb commented 1 year ago

even if it is a not a heuristic a dependency can be indirect or direct, I think until the tooling provides more insight into how the dependency was determined we don't have enough information to tell.

pxp928 commented 1 year ago

Opened this to keep track such that we do not lose track to update. Currently, it seems like sbom4python accurately captures this for SBOMs. We could use that as the benchmark to update. Examples can be found here: https://sbombenchmark.dev/

lumjjb commented 6 months ago

The maintainers had a discussion around this where from https://github.com/guacsec/guac/issues/966#issuecomment-2033265907, that IsDependency knowledge of direct or indirect has a reduced scope in its use. Initially, the use of direct/indirect dependency was to help determine how to traverse the dependency graph globally. However due to moving IsDependency from a global to local predicate (local to HasSBOM), the direct/indirect status of the edge serves mainly as a way to help filter the patch plan.

Therefore, based on the main use, we decided for now that Direct / Indirect should be added when there is indication from the source, and we should not use heuristics to determine that, since the cost of having that information or not for now is low - in the case of patch planning scenarios.