davishmcclurg / json_schemer

JSON Schema validator. Supports drafts 4, 6, 7, 2019-09, 2020-12, OpenAPI 3.0, and OpenAPI 3.1.
MIT License
388 stars 65 forks source link

Schemas with URI::File references gets overwritten when storing in resources[:lexical|:dynamic] #188

Open kjeldahl opened 1 week ago

kjeldahl commented 1 week ago

It is because of this little fun fact:

URI("file:///C:/Users/Public/schema.json") == URI("file:///C:/Users/Public/schema.json#fragment") #=> true

URI("http://host.domain/file.json") == URI("http://host.domain/file.json#fragment") #=> false

If the schemas are stored using a String version of the URI it seems to be working.

davishmcclurg commented 1 week ago

Damn that is a fun little fact—good find! Using strings instead seems reasonable. We do pass URI objects to ref_resolver, so I'll have to think if there are any implications there (or elsewhere). It's interesting that file URIs are allowed to have a fragment but are still considered equal when they have different ones:

>> URI("file:///C:/Users/Public/schema.json#fragment").fragment
=> "fragment"
>> URI("file:///C:/Users/Public/schema.json").fragment
=> nil
>> URI("file:///C:/Users/Public/schema.json") == URI("file:///C:/Users/Public/schema.json#fragment")
=> true
>> URI("file:///C:/Users/Public/schema.json").fragment == URI("file:///C:/Users/Public/schema.json#fragment").fragment
=> false

Looks like these are the only things considered in equality for file URIs: https://github.com/ruby/ruby/blob/c4baf3b3c0953c8df24a87db2059f831f45f19ab/lib/uri/file.rb#L17-L21

And RFC 8089 doesn't include a fragment: https://datatracker.ietf.org/doc/html/rfc8089

Thanks for opening this! I'll probably work on a fix this week. Let me know if you're interested in contributing.

kjeldahl commented 6 days ago

@davishmcclurg I can open a PR with my changes. Which boils down to just calling to_s on the URI when storing them and when making the lookup. But that is in to different places so when storing a URI is used but when fetching a string is used. So I guess changing the resources method to return an object capable of doing the registration and the look up using URI's in its public interface would be a cleaner way to do it.

kjeldahl commented 6 days ago

I added a check for whether a schema was already registered in Resources#register but it caused a couple of tests to fail. I have added it as code suggestion in a review on the PR so you can check whether it is needed.