networknt / json-schema-validator

A fast Java JSON schema validator that supports draft V4, V6, V7, V2019-09 and V2020-12
Apache License 2.0
836 stars 324 forks source link

Walk in ItemsValidator202012 uses different path than ItemValidator #989

Closed bartoszm closed 6 months ago

bartoszm commented 6 months ago

Just to confirm as maybe this difference is intentional.

When you perform walk with null instance the behavior differs between (1) com.networknt.schema.ItemsValidator (2) com.networknt.schema.ItemsValidator202012

(1) appends 0 to the instance path whereas (2) does not. In addition (2) addends id in case instance is not null and does not otherwise.

Simple fix is to add index 0 com.networknt.schema.ItemsValidator202012:135 either unconditionally or in case node is null.

Please let me know your thoughts on this. I am able to provide fix if you confirm difference in behavior is unintentional.

justin-tay commented 6 months ago

So in general passing null as instance data is not valid since the walk actually walks the instance data, it doesn't walk the schema. So null means that there's nothing to walk. The walk listeners can end up with null as the instance node, typically when processing items or properties, this is more so that they can potentially set the node value. If the null instance ends up being passed to the validation functions this is going to throw as they generally don't handle that scenario.

If you are referring to https://github.com/networknt/json-schema-validator/blob/bc93b4439adcd64b75d8a432c8e9a050c451a43d/src/main/java/com/networknt/schema/ItemsValidator202012.java#L134-L137 then that looks intentional and it's more to handle the case where the node is not an ArrayNode as opposed to handling when it's null. So it could be that ItemsValidator should be changed to make it consistent but I'm hesitant to make any changes without knowing whether the actual use case is valid since you mentioned that the instance is null and not that the ArrayNode doesn't have children.

bartoszm commented 6 months ago

My use case is to walk the schema for different purposes than validation - I know this is not the main use case for your library :)

I have created a minimal use case I was trying to explain above: https://gist.github.com/bartoszm/5bdcebe01c59742ad40165c1a78a8966

What the use case demonstrates is how the walk works when data instance is provided and when data is not provided. As you can see there is a glitch in the Matrix when you use V202012.

The result for VersionFlag.V201909: -- with data -- $.name $.children[0].name $.children[0] $.children -- with no data -- $.name $.children[0].name $.children[0] $.children

The result for VersionFlag.V202012: -- with data -- $.name $.children[0].name $.children[0] $.children -- with no data -- $.name $.children.name $.children $.children

justin-tay commented 6 months ago

As I highlighted, the walk is walking the instance data, it doesn't walk the schema, so passing null and expecting it to walk the schema just wouldn't work. For instance a ref could make a schema cyclic, and also cases where a tuple schema is used eg. with prefixItems. It only appears to work due to your specific schema and data by coincidence. There's also discrepancies in behavior when passing null and with validation turned on to walk for instance.

If there's anything to fix it would be to make the V202012 processing to be the same as the processing with V201909. Would this solve your issue, because it sounds like what you want is the opposite.

bartoszm commented 6 months ago

@justin-tay having the V202012 processing to be the same as the processing with V201909 would solve my issue.