CycloneDX / cyclonedx-core-java

CycloneDX SBOM Model and Utils for Creating and Validating BOMs
https://cyclonedx.org/
Apache License 2.0
81 stars 59 forks source link

Bump com.networknt:json-schema-validator to 1.4.0 #393

Closed YanWittmann closed 4 months ago

YanWittmann commented 5 months ago

https://github.com/CycloneDX/cyclonedx-core-java/blob/a45240713b751d80d53e6298aa832285a4aba566/pom.xml#L186

It would be nice if the dependency to com.networknt:json-schema-validator:1.0.73 as found in the pom.xml could be upgraded to the latest 1.4.0, as one of our projects depends on features in this latest version and therefore conflicts with the dependency of this project.

This Pull Request implements the changes needed to use the latest version of the json schema validator. More specifically, it uses the new JSON Schema Mapper format for creating the JsonSchemaFactory instance:

final MapSchemaMapper offlineSchemaMapper = new MapSchemaMapper(offlineMappings);
JsonSchemaFactory factory = JsonSchemaFactory.builder(JsonSchemaFactory.getInstance(SpecVersionDetector.detect(schemaNode)))
  .jsonMapper(mapper)
  .schemaMappers(s -> s.add(offlineSchemaMapper))
  .build();

As can be seen in https://github.com/CycloneDX/cyclonedx-core-java/compare/master...YanWittmann:cyclonedx-core-java-bump-validator:master#diff-fc28290f55c35403fc95b45ee2714337621f54b48caba5e01f08d5760b54139a

Also see all the rejected PRs by dependabot to bump the version, most likely rejected because they introduced that new schema mapper which breaks the current usage.

YanWittmann commented 5 months ago

Unrelated, but why do you require the description to be wrapped at 72 characters? Just asking to understand.

https://github.com/CycloneDX/.github/blob/0b572a6f8fc922c7ecc268dacb2c2ef33218838f/CONTRIBUTING.md

mr-zepol commented 5 months ago

hey, @YanWittmann this is something we were checking, did you notice something about response time when testing? after testing we saw that after this change the unit test for the JSON validator schema went from 5 to 120 seconds, we are investigating what could be the reason

YanWittmann commented 5 months ago

I'm unsure how I didn't notice this before, but you're right. It takes much longer than before to build the project. I'm luckily pretty sure that I know what the issue is.

Please view the new implementation of the newJsonSchema method as internally called by the factory.getSchema method in the CycloneDxSchema.java class:

    protected JsonSchema newJsonSchema(final SchemaLocation schemaUri, final JsonNode schemaNode, final SchemaValidatorsConfig config) {
        final ValidationContext validationContext = createValidationContext(schemaNode, config);
        JsonSchema jsonSchema = doCreate(validationContext, getSchemaLocation(schemaUri),
                new JsonNodePath(validationContext.getConfig().getPathType()), schemaNode, null, false);
        if (config.isPreloadJsonSchema()) {
            try {
                /*
                 * Attempt to preload and resolve $refs for performance.
                 */
                jsonSchema.initializeValidators();
            } catch (Exception e) {
                /*
                 * Do nothing here to allow the schema to be cached even if the remote $ref
                 * cannot be resolved at this time. If the developer wants to ensure that all
                 * remote $refs are currently resolvable they need to call initializeValidators
                 * themselves.
                 */
            }
        }
        return jsonSchema;
    }

The if (config.isPreloadJsonSchema()) { ... } is new compared to the old version of the networknt library. Meaning, each time a new parser is created (each time a document is parsed) all of the related JSON Schemas are checked and loaded by default, since the default value of the preloadJsonSchema parameter is true. This setting did not exist before.

By now simply adding a

config.setPreloadJsonSchema(false);

to the class I edited, this issue is resolved; all tests still pass (as is to be expected) and the runtime is once again down to 2.5 seconds for the test cases on my machine for both versions.

I've pushed this change to my PR.

Thanks for taking the time to review this!

mr-zepol commented 4 months ago

I'm unsure how I didn't notice this before, but you're right. It takes much longer than before to build the project. I'm luckily pretty sure that I know what the issue is.

Please view the new implementation of the newJsonSchema method as internally called by the factory.getSchema method in the CycloneDxSchema.java class:

    protected JsonSchema newJsonSchema(final SchemaLocation schemaUri, final JsonNode schemaNode, final SchemaValidatorsConfig config) {
        final ValidationContext validationContext = createValidationContext(schemaNode, config);
        JsonSchema jsonSchema = doCreate(validationContext, getSchemaLocation(schemaUri),
                new JsonNodePath(validationContext.getConfig().getPathType()), schemaNode, null, false);
        if (config.isPreloadJsonSchema()) {
            try {
                /*
                 * Attempt to preload and resolve $refs for performance.
                 */
                jsonSchema.initializeValidators();
            } catch (Exception e) {
                /*
                 * Do nothing here to allow the schema to be cached even if the remote $ref
                 * cannot be resolved at this time. If the developer wants to ensure that all
                 * remote $refs are currently resolvable they need to call initializeValidators
                 * themselves.
                 */
            }
        }
        return jsonSchema;
    }

The if (config.isPreloadJsonSchema()) { ... } is new compared to the old version of the networknt library. Meaning, each time a new parser is created (each time a document is parsed) all of the related JSON Schemas are checked and loaded by default, since the default value of the preloadJsonSchema parameter is true. This setting did not exist before.

By now simply adding a

config.setPreloadJsonSchema(false);

to the class I edited, this issue is resolved; all tests still pass (as is to be expected) and the runtime is once again down to 2.5 seconds for the test cases on my machine for both versions.

I've pushed this change to my PR.

Thanks for taking the time to review this!

Hey @YanWittmann thanks for looking into it, I just checked, and indeed the test time is almost the same.

mr-zepol commented 4 months ago

+1 LGTM, thanks for contributing to the project