anchore / syft

CLI tool and library for generating a Software Bill of Materials from container images and filesystems
Apache License 2.0
6.06k stars 561 forks

binary detection: openbsd OpenSSH and portable OpenSSH #3216

Open krysgor opened 3 weeks ago

krysgor commented 3 weeks ago

Hi,

(Not sure if I'm in the right place, because this is a contributor question and I'm not very familiar with Go.)

I would like to implement binary detection for OpenBSD OpenSSH and portable OpenSSH, with correct CPEs, in one classifier.

OpenBSD has two OpenSSH products with different CPEs:

I already have a regex to match the version: \x00OpenSSH_(?P<version>[0-9]+\.[0-9]+)(p[0-9])?\x00 (it also matches the optional portable suffix, e.g. p1).
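To illustrate what that pattern captures, here is a minimal Go sketch using only the standard `regexp` package. It is not syft code: the helper name `extractVersion` is hypothetical, and the second named group (`portable`) is an addition made here so that the optional suffix can be read back separately from the version.

```go
package main

import (
	"fmt"
	"regexp"
)

// opensshPattern is the regex from the comment above, with one assumption added:
// the optional suffix is given the name "portable" so it can be inspected.
var opensshPattern = regexp.MustCompile(
	`\x00OpenSSH_(?P<version>[0-9]+\.[0-9]+)(?P<portable>p[0-9])?\x00`)

// extractVersion (hypothetical helper) returns the base version and the
// portable suffix; the suffix is empty when the binary string has none.
func extractVersion(data []byte) (version, portable string, ok bool) {
	m := opensshPattern.FindSubmatch(data)
	if m == nil {
		return "", "", false
	}
	version = string(m[opensshPattern.SubexpIndex("version")])
	portable = string(m[opensshPattern.SubexpIndex("portable")])
	return version, portable, true
}

func main() {
	// Synthetic NUL-delimited strings standing in for real binary contents.
	for _, sample := range []string{
		"\x00OpenSSH_9.7\x00",
		"\x00OpenSSH_9.7p1\x00",
	} {
		v, p, ok := extractVersion([]byte(sample))
		fmt.Printf("matched=%t version=%q portable=%q\n", ok, v, p)
	}
}
```

Both samples match, and in both cases the `version` group holds only the base version; the `p1` part lands in the separate group, so it would have to be handled explicitly if it should end up in the package version or the CPE.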

The question is: how can I build these two different CPEs in one classifier? Is it possible to implement this with one classifier? If not, I will simply make two classifiers: openssh-binary and openssh-portable-binary.

Thanks

spiffcs commented 3 weeks ago

👋 Thanks for the issue and the question about classifiers, @krysgor!

I'll point you to a recent PR that just landed that has a few examples in it: https://github.com/anchore/syft/pull/3078

Note the classifiers being added here: https://github.com/anchore/syft/pull/3078/files#diff-962c0bf8d15912f4f2b27bb43392f4ec0ab0d535cd5848ec6da96e8c251f9017R450-R479

On the Classifier struct there is a field called CPEs. While this PR uses the singleCPE convenience function, that field is flexible enough to hold multiple CPEs, as long as the Class and Package are the same for all of them.

If you wanted two classifiers that result in two different packages, you would use a single CPE in each and do something like:

        {
            Class:   "openbsd-OpenSSH-binary",
            Package: "openSSH/openbsd",
            CPEs:    singleCPE("cpe:2.3:a:openbsd:openssh:9.6:-:*:*:*:*:*:*"),
        },
        {
            Class:   "portable-OpenSSH",
            Package: "openSSH/portable",
            CPEs:    singleCPE("cpe:2.3:a:openbsd:openssh:9.6:p1:*:*:*:*:*:*"),
        },

The other path is to do this as one classifier with one package, where you add multiple CPEs at the bottom of that struct.
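A rough sketch of that second path, one entry carrying both CPEs, might look like the following. This is self-contained illustration code, not syft's API: the `classifier` struct here is a hypothetical stand-in that uses plain strings for CPEs (syft's real field types may differ), and `opensshCPE` and `singleClassifier` are made-up helper names. The class and package names are likewise only examples.

```go
package main

import "fmt"

// classifier is a hypothetical stand-in for the shape discussed above;
// it is NOT syft's Classifier type.
type classifier struct {
	Class   string
	Package string
	CPEs    []string
}

// opensshCPE (hypothetical helper) formats a CPE 2.3 string for openbsd:openssh.
// update is "-" for the base release or a portable suffix such as "p1".
func opensshCPE(version, update string) string {
	return fmt.Sprintf("cpe:2.3:a:openbsd:openssh:%s:%s:*:*:*:*:*:*", version, update)
}

// singleClassifier returns one entry with one package and two CPEs,
// using the same example version (9.6) as the snippet above.
func singleClassifier() classifier {
	return classifier{
		Class:   "openssh-binary",
		Package: "openssh",
		CPEs: []string{
			opensshCPE("9.6", "-"),
			opensshCPE("9.6", "p1"),
		},
	}
}

func main() {
	c := singleClassifier()
	fmt.Println(c.Class, c.Package)
	for _, cpe := range c.CPEs {
		fmt.Println(cpe)
	}
}
```

The trade-off is that a single package then carries both CPEs regardless of which flavor of OpenSSH the binary actually is.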

Happy to talk more about this or review some code if you've already written something 👍

krysgor commented 2 weeks ago

Hi @spiffcs, I have already made a commit for it (778437f), so you can review it.

It ends up detecting only the main version (without the portable suffix). The reason for ignoring the portable version is that portable executables contain both version identifiers. For example, here is the output of the make add-snippet command:

Multiple string matches found in the binary:

1)  69432 OpenSSH_9.7p1
2)  78969 OpenSSH_9.7

Please select a match: 

So after implementing the two-classifier solution, syft matches both classifiers for the portable binary:

openssh                                                     9.7           binary
openssh                                                     9.7p1         binary

The non-portable binary always looks good:

openssh                                                     9.7           binary

I'm not sure what to do in this situation, but creating an SBOM that contains two entries for openssh is (probably) wrong. So I decided to match only the main version of the binary.

wagoodman commented 1 week ago

But creating an sbom that contains two entries for openssh is (probably) wrong

Agreed. Mind posting the code for the two regexes that were used in the dual-classifier approach? There might be more options, but I think you really need one classifier with multiple evidence matchers; see an example here:

https://github.com/anchore/syft/blob/d7005d7d8ca6d05f594f7bc1a140ae1e85bc0328/syft/pkg/cataloger/binary/classifiers.go#L13-L22

This way we would never be finding duplicate packages since there would be one classifier.
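The general idea of trying several patterns inside one detection step, and stopping at the first hit, can be sketched in plain Go as below. This is only an illustration of the ordering logic, not syft's evidence-matcher API: `detectVersion` and both pattern variables are hypothetical, and the sample data assumes the version strings are NUL-delimited as in the regex earlier in the thread.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// More specific pattern first: requires the portable suffix.
	portableRe = regexp.MustCompile(`\x00OpenSSH_(?P<version>[0-9]+\.[0-9]+p[0-9]+)\x00`)
	// Fallback pattern: base version only.
	baseRe = regexp.MustCompile(`\x00OpenSSH_(?P<version>[0-9]+\.[0-9]+)\x00`)
)

// detectVersion (hypothetical helper) tries the patterns in priority order and
// returns the first match, so one binary yields at most one version.
func detectVersion(data []byte) (string, bool) {
	for _, re := range []*regexp.Regexp{portableRe, baseRe} {
		if m := re.FindSubmatch(data); m != nil {
			return string(m[re.SubexpIndex("version")]), true
		}
	}
	return "", false
}

func main() {
	// Synthetic contents: a portable build carrying both identifiers,
	// and a non-portable build carrying only the base one.
	portable := []byte("junk\x00OpenSSH_9.7p1\x00more\x00OpenSSH_9.7\x00")
	base := []byte("junk\x00OpenSSH_9.7\x00")

	for _, data := range [][]byte{portable, base} {
		v, ok := detectVersion(data)
		fmt.Printf("found=%t version=%s\n", ok, v)
	}
}
```

With this ordering, the portable sample reports 9.7p1 only, even though the plain 9.7 string is also present, and the non-portable sample reports 9.7.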