test: PURL generation for language parsers

intel / cve-bin-tool

The CVE Binary Tool helps you determine if your system includes known vulnerabilities. You can scan binaries for over 200 common, vulnerable components (openssl, libpng, libxml2, expat and others), or if you know the components used, you can get a list of known vulnerabilities associated with an SBOM or a list of components and versions.

https://cve-bin-tool.readthedocs.io/en/latest/

GNU General Public License v3.0


test: PURL generation for language parsers #3961

Open terriko opened 4 months ago

terriko commented 4 months ago

We're in the process of adding PURL generation to our language parsers, but we aren't using it for anything yet. Eventually we will, as part of the planned GSoC project described in #3771. But while we're waiting for bigger things, we could definitely stand to have some unit tests!

To avoid wasting time parsing files twice, we may want to add the generate_purl tests into the existing test_language_package code, and do really basic unit tests where we throw both valid and invalid vendor, package, and version info into generate_purl.
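A minimal sketch of what such table-driven tests might look like. The `generate_purl` below is a hypothetical stand-in written only for illustration. The real function in cve-bin-tool may take different arguments and apply different validation and normalization rules, so the signature, the regular expressions, and the return-`None`-on-invalid convention are all assumptions.

```python
import re
from typing import Optional

# Hypothetical stand-in for a parser's generate_purl. The real cve-bin-tool
# function may differ in signature, validation, and normalization.
_NAME_OK = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")
_VERSION_OK = re.compile(r"[A-Za-z0-9][A-Za-z0-9.+_~-]*")


def generate_purl(
    pkg_type: str, product: str, vendor: str = "", version: str = ""
) -> Optional[str]:
    """Return a purl-like string, or None if any component looks invalid."""
    if not pkg_type or not _NAME_OK.fullmatch(product or ""):
        return None
    if vendor and not _NAME_OK.fullmatch(vendor):
        return None
    if version and not _VERSION_OK.fullmatch(version):
        return None
    purl = f"pkg:{pkg_type}/"
    if vendor:
        purl += f"{vendor}/"
    purl += product
    if version:
        purl += f"@{version}"
    return purl


# (arguments, expected result) pairs for inputs that should be accepted.
VALID_CASES = [
    (("pypi", "requests", "", "2.31.0"), "pkg:pypi/requests@2.31.0"),
    (("maven", "commons-io", "org.apache", "2.11.0"),
     "pkg:maven/org.apache/commons-io@2.11.0"),
    (("npm", "lodash", "", ""), "pkg:npm/lodash"),
]

# Inputs that should be rejected: empty product, bad product, bad vendor,
# bad version.
INVALID_CASES = [
    ("pypi", "", "", "1.0"),
    ("pypi", "bad name", "", "1.0"),
    ("npm", "lodash", "bad/vendor", "1.0"),
    ("npm", "lodash", "", "1.0 beta"),
]


def test_generate_purl_valid():
    for args, expected in VALID_CASES:
        assert generate_purl(*args) == expected


def test_generate_purl_invalid():
    for args in INVALID_CASES:
        assert generate_purl(*args) is None
```

Keeping the cases in plain lists makes it cheap to add another vendor, package, or version combination without writing a new test function.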

joydeep049 commented 4 months ago

I would like to work on this, as I'm already working on PURL generation. It would be a good learning experience. @terriko @anthonyharrison

joydeep049 commented 4 months ago

Also, would the generate_purl tests be aimed at the generate_purl function of the individual language parsers, or at the basic version written in __init__.py? And would we take normalization into consideration when testing the function? @terriko @anthonyharrison

joydeep049 commented 3 months ago

Eagerly waiting for a response @terriko :)

terriko commented 3 months ago

The goal is always for us to get as close as possible to 100% test coverage, so definitely both. If you've never worked with code coverage tools like codecov, they can help you figure out which parts of the code your tests cover and which parts they don't.

terriko commented 3 months ago

Note that I don't actually expect us to get to 100% across all of cve-bin-tool: it is asymptotically hard, some of the code has to be tested against real data, and some of the tests would take a long time. So we aim to hover around 80%, mostly as a balance of coverage vs. practicality. But for something like this, where you're literally just making a string and it'll execute in a few microseconds, there's no reason not to test Every Possible Code Path. Unless, of course, upon testing it you realize that a lot of them could be collapsed into a single code path, so you don't have to write as many tests. Refactoring is a valid way to improve code coverage too. 📈
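As a rough illustration of collapsing code paths, here is a sketch with two hypothetical helpers (neither name comes from cve-bin-tool). Both reject blank fields, but the first has three separate early returns that each need their own test for full line coverage, while the second funnels every field through one check.

```python
def is_valid_branchy(vendor: str, product: str, version: str) -> bool:
    # Three separate branches: each `return False` line is only executed
    # by a test that blanks out that specific field.
    if not vendor.strip():
        return False
    if not product.strip():
        return False
    if not version.strip():
        return False
    return True


def is_valid_collapsed(*fields: str) -> bool:
    # One code path: every field goes through the same check, so a single
    # passing and a single failing input already execute every line.
    return all(field.strip() for field in fields)
```

The two functions agree on string inputs, so the refactor changes how many tests are needed for full coverage, not the behavior being tested.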