Festo-se / cyclonedx-editor-validator

Tool for creating, modifying and validating CycloneDX SBOMs.
https://festo-se.github.io/cyclonedx-editor-validator/
GNU General Public License v3.0
18 stars 4 forks source link

feat: rework validation schema options #216

Closed mmarseu closed 2 months ago

mmarseu commented 3 months ago

The mutual exclusivity implemented through argparse's add_mutually_exclusive_group() method has a subtle and undocumented bug (or call it an unexpected behavior, if you like) that affects options with default values. If an option with a default value is part of such a group, the option doesn't count as "present on the command-line" if the user explicitly passed its default value.

So in our case, --schema-type and --schema-path were meant to be mutually exclusive but --schema-type has a default value of default. So, this invocation correctly raises an error:

cdx-ev validate --schema-type custom --schema-path myschema.json bom.json

but this invocation does not:

cdx-ev validate --schema-type default --schema-path myschema.json bom.json

because for options with default values, argparse can't tell the difference between

This PR fixes this minor problem by forgoing argparse's built-in default mechanism and replacing it with a hand-coded fallback to the default value in the invoke_validate() function.

It includes a few other minor housekeeping changes:

github-actions[bot] commented 3 months ago

Coverage

Coverage Report •
FileStmtsMissCoverMissing
__main__.py3212093%217–218, 235, 245, 655–656, 660–665, 667, 670, 680–683, 687, 854
error.py230100% 
validator
   helper.py43295%36, 38
   validate.py94792%35, 56, 75–76, 115, 131, 166
TOTAL16979694% 

Tests Skipped Failures Errors Time
299 2 :zzz: 0 :x: 0 :fire: 5.860s :stopwatch:
mmarseu commented 2 months ago

@mmarseu some observations from my side.

Further: If I provide a JSON-file, e.g. an SBOM as schema-path the program crashes with: image

Maybe a sanity check, e.g. with checking if something like $schema": "http://json-schema.org/draft-07/schema#" is within the provided schema (without the specific schema, so only json-schema.org), is a good idea here?

This must be what you meant when you said you found an error that has always been there. I was dumb enough to promise I'd fix it 🙄 Anyways, it's done. User-provided schema are now validated using the proper jsonschema function.

For some reason I started getting a whole lot of mypy errors in validate.py unrelated to my changes so I had to fix those, too. There is a single new mypy ignore comment which is due to an error in the types-jsonschema package. If I can get https://github.com/python/typeshed/pull/12484 merged, we can remove it.

mmarseu commented 2 months ago

@mmarseu after some testing, I found two "bugs", where I think this one is the most critical one: if the JSON-Schema is e.g. an SBOM without dependencies-array or just an empty JSON, i.e. {}, the validation is successful as jsonschema fallbacks to Draft7Validator if no $schema is found in the provided schema.

But as this is something for the lib itself,

That's not even a problem of the library, that's just the way JSON Schema is specified. A completely empty JSON object is a valid schema according to the spec. Also, there are no restrictions on additional properties, you can put anything in a schema (probably to support custom extensions). So, basically, almost any JSON object could be a valid schema. The only exception are properties with the same names as something found in the JSON Schema meta schema but which don't validate against it — for instance a field named dependencies which exists in both JSON Schema and CycloneDX.

I approve the MR. The only thing I could think of: Add a disclaimer in the documentation to be aware of this behavior. But I won't insist on this one, so I leave the decision to you: Merge or change - I can approve it again 😉

MERGE!!!

Just for documentation purposes: The other bug was that the check for the mutual exclusive usage of schema-type and schema-path only works if a valid choice for schema-type is provided, otherwise an error is thrown that schema-type is not one of (default, custom, strict). But this is an "issue" with argparse.

argparse is far from perfect. That may be of the areas where it could be slightly improved. On the other hand, I imagine it could be very hard to separate between multiple errors in the CLI where each one is legitimate in their own right and simple follow-up errors where you really only have to fix the first and the rest falls into place.