CycloneDX / cyclonedx-go

Go library to consume and produce CycloneDX Software Bill of Materials (SBOM)
https://cyclonedx.org/
Apache License 2.0

[Discussion] Use a git submodule to fetch valid and invalid CycloneDX BOM test files from the `github.com/cyclonedx/specification` repo #187

Open Nicolas-Peiffer opened 4 days ago

Nicolas-Peiffer commented 4 days ago

As of today, I noticed cyclonedx-go/testdata only provides valid-* sample test BOM files in XML and JSON. It provides no invalid-* files, and no protobuf files.

$ git clone https://github.com/CycloneDX/cyclonedx-go.git
$ cd cyclonedx-go/testdata
$ broot --cmd ":pt"

cyclonedx/cyclonedx-go/testdata
 ├──snapshots …
 ├──valid-annotation.json
 ├──valid-annotation.xml
 ├──valid-assembly.json
 ├──valid-assembly.xml
 ├──valid-bom.json
 ├──valid-bom.xml
 ├──valid-component-hashes.json
 ├──valid-component-hashes.xml
 ├──valid-component-omniborId.json
 ├──valid-component-omniborId.xml
 ├──valid-component-ref.json
 ├──valid-component-ref.xml
[...]

The files are also not grouped by CycloneDX specification version, which makes them harder to list and maintain.

However, the github.com/CycloneDX/specification repository provides both valid-* and invalid-* BOM sample files in XML, JSON and protobuf formats. These files are useful for testing CycloneDX implementations such as github.com/CycloneDX/cyclonedx-go.

$ git clone https://github.com/CycloneDX/specification.git
$ cd specification/tools/src/test/resources
$ broot --cmd ":pt"    # https://dystroy.org/broot/documentation/usage/#export-a-tree

cyclonedx/specification/tools/src/test/resources
 ├──1.0
 │  └──2 unlisted
 ├──1.1
 │  ├──invalid-component-ref-1.1.xml
 │  ├──invalid-component-type-1.1.xml
 │  ├──invalid-empty-component-1.1.xml
 │  ├──invalid-hash-alg-1.1.xml
 │  ├──invalid-hash-md5-1.1.xml
 │  ├──valid-bom-1.1.xml
 │  └──23 unlisted
 ├──1.2
 │  ├──invalid-bomformat-1.2.json
 │  ├──invalid-component-ref-1.2.json
 │  ├──invalid-component-ref-1.2.xml
 │  ├──invalid-component-swid-1.2.json
 │  ├──invalid-component-swid-1.2.xml
 │  ├──valid-assembly-1.2.json
 │  ├──valid-assembly-1.2.xml
 │  └──82 unlisted
 ├──1.3
 │  ├──invalid-bomformat-1.3.json
 │  ├──invalid-component-ref-1.3.json
 │  ├──invalid-component-ref-1.3.xml
 │  ├──invalid-component-swid-1.3.json
 │  ├──valid-assembly-1.3.json
 │  ├──valid-assembly-1.3.xml
 │  └──122 unlisted
 ├──1.4
 │  ├──invalid-bomformat-1.4.json
 │  ├──invalid-component-ref-1.4.json
 │  ├──invalid-component-ref-1.4.xml
 │  ├──invalid-component-swid-1.4.json
 │  ├──valid-assembly-1.4.json
 │  └──129 unlisted
 ├──1.5
 │  ├──invalid-bomformat-1.5.json
 │  ├──invalid-component-ref-1.5.json
 │  ├──invalid-component-ref-1.5.xml
 │  ├──invalid-component-swid-1.5.json
 │  ├──valid-annotation-1.5.json
 │  ├──valid-annotation-1.5.xml
 │  ├──valid-annotation-1.5.textproto
 │  └──149 unlisted
 ├──1.6
 │  ├──invalid-bomformat-1.6.json
 │  ├──invalid-component-ref-1.6.json
 │  ├──invalid-component-ref-1.6.xml
 │  ├──invalid-component-swid-1.6.json
 │  ├──valid-component-hashes-1.6.json
 │  ├──valid-component-hashes-1.6.xml
 │  ├──valid-component-ref-1.6.textproto
 │  └──178 unlisted
 └──ext
    ├──invalid-depgraph-1.0.xml
    ├──valid-component-depgraph-1.0.xml
    └──valid-depgraph-1.0.xml

I suggest using a git submodule to fetch test files from the CycloneDX specification repository.

git clone https://github.com/CycloneDX/cyclonedx-go
cd cyclonedx-go
git submodule add -f https://github.com/CycloneDX/specification specification

This would give access to the full list of valid-* and invalid-* BOM files in *.xml, *.json and *.textproto with no extra effort. They would be available at this path: cyclonedx-go/specification/tools/src/test/resources.

If some test files are specific to one language implementation (Go, Python, or Java), then either keep a separate folder for custom test files, or contribute them to the test files in the CycloneDX/specification repo.

All CycloneDX implementations (Python, Go, Java) could do the same

The same principle could be applied to any CycloneDX implementation:

                    ┌────────────────────────────────────────────────────┐
                    │  github.com/CycloneDX/specification commit 59e7d88 │
                    │      └── tools/src/test/resources (local)          │
                    └─────┬─────────────────┬───────────────────┬────────┘
                          │                 │                   │
                 fetch submodule    fetch submodule      fetch submodule
                          │                 │                   │
 ┌────────────────────────▼───────────────┐ │                   │
 │ github.com/CycloneDX/cyclonedx-go      │ │                   │
 │ └── specification @59e7d88 (submodule) │ │                   │
 └────────────────────────────────────────┘ │                   │
                                            │                   │
                  ┌─────────────────────────▼────────────────┐  │
                  │ github.com/CycloneDX/cyclonedx-core-java │  │
                  │  └── specification @59e7d88 (submodule)  │  │
                  └──────────────────────────────────────────┘  │
                                                                │
                                ┌───────────────────────────────▼──────────┐
                                │github.com/CycloneDX/cyclonedx-python-lib │
                                │   └── specification @59e7d88 (submodule) │
                                └──────────────────────────────────────────┘

Use the git submodule for Schema Files as well?

Once git submodules for CycloneDX/specification are available in cyclonedx-go, cyclonedx-core-java and cyclonedx-python-lib to fetch test files, they could also be used to fetch schema files (JSON Schema, XSD and protobuf).

This would prevent issues such as the following one, where I noticed that CycloneDX schema files differ from one CycloneDX implementation to another: https://github.com/CycloneDX/specification/pull/479#issuecomment-2180940956

Note: I am starting this discussion for the cyclonedx-go project, but we could open the same discussion in other CycloneDX implementations (Python and Java).

CC @jkowalleck & @nscuro who helped me figure out how to explain this idea :smile:

nscuro commented 4 days ago

@Nicolas-Peiffer Do you know how well Git submodules play with Go modules? For example, consider we want to embed schema files from the submodule via //go:embed. Will consumers of cyclonedx-go still be able to use the library (because Go automatically clones submodules), or will this cause issues (because Go doesn't automatically clone submodules)?

Nicolas-Peiffer commented 4 days ago

@nscuro you make a very good point that I completely missed. Thank you very much for this :+1: .

Indeed, after a quick search with ChatGPT4 and the web, I came to the conclusion that Go packages hosted on github.com do not support git submodules natively.

There is a somewhat similar issue on this topic, open since 2016: https://github.com/golang/go/issues/17522

And the workarounds based on git clone --recurse-submodules [...] are not satisfactory, since they are not supported by go install and go get.

Fetch files and commit them instead of using a submodule?

I think the only viable short-term solution (short of patching Go to add support for git submodules) is to mimic what is already done in the Python implementation: fetch the files and commit them to the repository.

What do you think?