inkandswitch / cambria-project

Schema evolution with bi-directional lenses.
MIT License
611 stars 24 forks source link

Support for subschemas and references #5

Open jgaehring opened 3 years ago

jgaehring commented 3 years ago

I ran into some errors trying to update a schema that had subschemas using the allOf keyword. The schema I was trying to update can be seen here.

I've implemented a naive fix that strictly addresses this problem for my use case, but then I run into the problem that $ref's are not yet supported. I dug around a bit and found a mention of this issue in https://github.com/inkandswitch/cambria/pull/1#discussion_r458851944, and I assume this is still the status, as @geoffreylitt mentioned there:

We could start adding support for refs everywhere, but this seems like the right moment to pause and consider Peter's suggestion of finding a better way of working with json schemas -- either find a helper lib or roll our own functions to help navigate a schema.

So I guess I'm wondering if there have been any further decisions or discussion about how this will move forward, and whether contributions would be considered. If so, I'd be happy to lend a hand.

For my own use case, it's fairly trivial to just pull out the only allOf schema that's really significant (frankly, it's annoying it's there at all), and then "dereference" my schema definitions as normal properties. I'm already doing so in my demo repo. So I have a workaround. But I figured it could be useful to document here, and I appreciate any updates if and when there is progress on these features.

geoffreylitt commented 3 years ago

Thanks for the report! I think there may be some fundamental issues here with using JSON Schema as a schema definition language. We chose it because it was convenient to use an existing standard with tooling, but I'm not sure the needs always align with what we need in Cambria.

I think your proposed fix seems reasonable, especially since it's similar to how we handle anyOf. But what about a case where multiple subschemas match because they contain overlapping parts? Might it make sense to update more than one? I'm not sure there's really an obvious answer.

As a short term thing, I'm open to adding more functionality that handles these cases better, and hopefully we can also do a better job documenting too. One tool that seems potentially very helpful is this library from cloudflare, which handles 1) dereferencing refs, 2) collapsing allOf into a single schema. That seems like it might take care of your immediate issues -- would you have any interest in trying out that tooling? It seems like potentially you could just preprocess the schema through that, and then use cambria as-is.

Bigger picture, I think it could be reasonable to designate a subset of JSON Schema supported by Cambria. I think the core data definition parts are good, and features like refs probably make sense, but not sure that allOf is really a feature that makes sense for this use case.

jgaehring commented 3 years ago

I think there may be some fundamental issues here with using JSON Schema as a schema definition language. We chose it because it was convenient to use an existing standard with tooling, but I'm not sure the needs always align with what we need in Cambria.

Ah, good to know. So I take it you're much more interested in deriving type declarations for TypeScript? For us, JSON Schema is a core requirement, because it's how we determine the schemas supported by the server, and we need that in some serialized format that can be determined at runtime. Although I'd be open to exploring alternatives to JSON Schema.

But what about a case where multiple subschemas match because they contain overlapping parts? Might it make sense to update more than one? I'm not sure there's really an obvious answer.

Exactly, there's a lot of complexity I haven't even begun to touch. Which is why I'm not sure I want to pursue this "naive" approach much further than I have so far.

It seems like potentially you could just preprocess the schema through that, and then use cambria as-is.

Oh great, that looks like a promising library. And yea, I think it could make most sense to preprocess the schemas like you say. I think we can safely make some assumptions that should keep it simple enough, since like I said, there's really just one schema in that allOf block that we really need to care about (again, it's so annoying it's there in the first place, but probably not worth fixing upstream). And as for the $ref's I can definitely try out that library and report back on my findings, if it's something you might want to incorporate back into Cambria.

As for bigger picture, that all sounds good to me. Good to know it's probably not in scope for Cambria to provide full support for the entire JSON Schema spec, and I think we can work around that.