networknt / json-schema-validator

A fast Java JSON schema validator that supports draft V4, V6, V7, V2019-09 and V2020-12
Apache License 2.0

Add options to control caching of schemas #1018

Closed: justin-tay closed this 3 months ago

justin-tay commented 3 months ago

Closes #1016

This adds configuration options to control schema caching and preloading behavior.

txshtkckr commented 1 month ago

Any chance we could get a release with this option in it? I'm encountering this issue in production use: https://bitbucket.org/atlassian/adf-builder-java/issues/105

On a related note, looking at CachedSupplier, it holds on to its delegate indefinitely. I have not dug into the library enough to be sure if there are thread-safety concerns around making the delegate mutable and null-ing it out after using it, but that would allow it to release all the references captured in the lambda at the call point, which could greatly reduce the memory pressure that this is causing.

In a heap dump that I analyzed, the retained set from the RefValidator was 450M, and even a very sprawled eval path tree shouldn't have this kind of impact. But the supplier captures the validation context, among possibly other things, and the heap I'm looking at shows it having captured 90,000 of those. The impact of that is almost definitely not trivial.

txshtkckr commented 1 month ago

If thread-safety is an issue, an easy workaround would be to change the delegate field to volatile and write back a constant supplier after performing the calculation:

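    @Override
    public T get() {
        if (cache == null) {
            T value = delegate.get();
            cache = value;
            // Swap in a constant supplier so the original lambda, and everything it captured, can be collected.
            delegate = () -> value;
        }
        return cache;
    }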

This would release whatever garbage the supplier is holding while still providing the guarantee that the delegate is never null.
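
For anyone who wants to try the idea in isolation, here is a minimal, self-contained sketch. `ReleasingCachedSupplier` and the demo class are hypothetical names, not the library's actual `CachedSupplier` source, and marking both fields `volatile` is an assumption made for the sketch. Under a race the original delegate may still run more than once, which is harmless only if it is idempotent.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical stand-in for a memoizing supplier; not the library's real class. */
final class ReleasingCachedSupplier<T> implements Supplier<T> {
    // volatile (an assumption for this sketch) so that the replacement
    // supplier and the cached value are visible to other threads.
    private volatile Supplier<T> delegate;
    private volatile T cache;

    ReleasingCachedSupplier(Supplier<T> delegate) {
        this.delegate = Objects.requireNonNull(delegate, "delegate");
    }

    @Override
    public T get() {
        T result = cache;
        if (result == null) {
            result = delegate.get();
            final T value = result;
            cache = value;
            // The new lambda captures only 'value', so the original delegate
            // and whatever it captured are no longer referenced from here.
            delegate = () -> value;
        }
        return result;
    }
}

final class ReleasingCachedSupplierDemo {
    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        Supplier<String> supplier = new ReleasingCachedSupplier<>(() -> {
            calls.incrementAndGet(); // count how often the expensive delegate runs
            return "resolved-schema";
        });
        supplier.get();
        supplier.get();
        System.out.println("delegate calls = " + calls.get()); // prints "delegate calls = 1"
    }
}
```

As in the snippet above, a delegate that returns null is never cached; it is simply replaced by a constant supplier that returns null, so the expensive computation still runs only once in the single-threaded case.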