Stranger6667 / jsonschema-rs

JSON Schema validation library
https://docs.rs/jsonschema
MIT License

Custom Validators #379

Closed tamasfe closed 5 months ago

tamasfe commented 2 years ago

I got a feature request, in a project that uses this library, to validate file paths in a document. It would make sense to simply add a validator that could do this, but that is currently not possible. The closest I can get is format validators, but they lack any context.

Would it be desirable to expose a way to hook into the compilation process and add entirely custom validators?

This might not be a good idea because:

However, it would be very convenient for my use case.

If this is desirable I imagine in order to do this we'd need to:

samgqroberts commented 1 year ago

I'd just like to add my own use case here, as additional information about how people are using the project and working around this particular issue (and #245).

I use the Python binding, but I have a custom string format called "currency" which means I can't use this library for all of my validation. The "currency" format requires that a string has some amount of digit characters, followed by a period, followed by exactly two digit characters. It is meant to work around the fact that JSON does not have a fixed-point decimal type, so currency figures are stored in strings to maintain precision / avoid floating point nonsense.
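For illustration, the format described above could be checked with something like the following minimal sketch. The pattern and the name `is_currency` are assumptions made for this example, not the exact code in use; in particular it assumes ASCII digits only and at least one digit before the period.

```python
import re

# Assumed pattern: one or more ASCII digits, a period, exactly two ASCII digits.
CURRENCY_RE = re.compile(r"[0-9]+\.[0-9]{2}")


def is_currency(value: object) -> bool:
    """Return True if value is a string in the "currency" format."""
    # fullmatch requires the whole string to match, so trailing text is rejected.
    return isinstance(value, str) and CURRENCY_RE.fullmatch(value) is not None
```

Here `is_currency("12.34")` is accepted, while `"12.3"`, `"12.345"`, and the float `12.34` are rejected.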

I have a wrapper function around the validation that first checks whether the provided schema contains (recursively) any "currency" format:

from typing import Any


def detect_currency_format(schema: Any) -> bool:
    """Return True if any nested subschema declares format "currency"."""
    if isinstance(schema, list):
        return any(detect_currency_format(x) for x in schema)
    if isinstance(schema, dict):
        if schema.get("format") == "currency":
            return True
        return any(detect_currency_format(v) for v in schema.values())
    return False

If the schema does contain the format, I know I need to use another library to validate that entire schema. The other library I use is jsonschema. There I can add custom formats, which I do for "currency", and provide a lambda for the validation which just calls out to a regex.
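The routing described above might look roughly like the sketch below. The `fast_validate` and `slow_validate` callables are hypothetical placeholders for whatever entry points the two libraries provide; they are injected as parameters so the sketch does not assume any particular API.

```python
from typing import Any, Callable

Validate = Callable[[Any, Any], None]


def detect_currency_format(schema: Any) -> bool:
    """Return True if any nested subschema declares format "currency"."""
    if isinstance(schema, list):
        return any(detect_currency_format(x) for x in schema)
    if isinstance(schema, dict):
        if schema.get("format") == "currency":
            return True
        return any(detect_currency_format(v) for v in schema.values())
    return False


def validate(
    schema: Any,
    instance: Any,
    fast_validate: Validate,  # hypothetical: wraps the jsonschema-rs binding
    slow_validate: Validate,  # hypothetical: wraps the pure-Python library
) -> None:
    """Use the slow validator only when the schema needs the custom format."""
    if detect_currency_format(schema):
        slow_validate(schema, instance)
    else:
        fast_validate(schema, instance)
```

The decision is made once per schema, so a single "currency" occurrence anywhere in the schema sends the whole validation down the slow path.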

Furthermore, the other library allows for overriding keyword validation in general. I use that to override default behavior for the numeric modifiers "multipleOf", "minimum", "maximum", "exclusiveMinimum", and "exclusiveMaximum" so that they apply to currency strings as if they were any other numeric type. From the perspective of designing the solution to this issue I wanted to show via my use case that not only would custom validation on new keywords be useful, but so would overriding behavior on existing keywords (to me, at least).
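To make the keyword-override idea concrete, here is a minimal standalone sketch of applying the numeric keywords to a currency string by comparing it as a `Decimal`. It does not use either library's extension API; the function name and the error-list return shape are assumptions for illustration, and it assumes the draft-06+ meaning of "exclusiveMinimum" / "exclusiveMaximum" (numeric bounds rather than booleans).

```python
from decimal import Decimal
from typing import Any


def currency_keyword_errors(value: str, schema: dict[str, Any]) -> list[str]:
    """Return the names of numeric keywords that a currency string violates."""
    amount = Decimal(value)  # exact decimal, no float rounding

    def bound(keyword: str) -> Decimal:
        # Go through str() so a float such as 0.05 becomes Decimal("0.05").
        return Decimal(str(schema[keyword]))

    errors = []
    if "minimum" in schema and amount < bound("minimum"):
        errors.append("minimum")
    if "maximum" in schema and amount > bound("maximum"):
        errors.append("maximum")
    if "exclusiveMinimum" in schema and amount <= bound("exclusiveMinimum"):
        errors.append("exclusiveMinimum")
    if "exclusiveMaximum" in schema and amount >= bound("exclusiveMaximum"):
        errors.append("exclusiveMaximum")
    if "multipleOf" in schema and amount % bound("multipleOf") != 0:
        errors.append("multipleOf")
    return errors
```

With this, `"10.10"` passes `{"minimum": 5, "multipleOf": 0.05}`, while `"10.12"` fails `"multipleOf"` against the same schema.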

So, in effect, I only use jsonschema-rs to speed up validation where I can, which is for any schema that doesn't have a "currency" string. It would take bringing custom format validation to the Python binding (#245) as well as custom validation / keyword behavior overriding (this issue) to let me drop my dependency on the much, much slower Python-only jsonschema library. The alternative is convincing my team to implement this particular module in Rust, which would remove my need for #245 but would still require this issue.

Let me just make it clear, too: the performance benefit of plugging in jsonschema-rs when I can is definitely worth it :). Love this library to death, and thank you @Stranger6667!