Investigate JSONPath as an alternative to JMESPath

bluesky / tiled

API to structured data

https://blueskyproject.io/tiled

BSD 3-Clause "New" or "Revised" License

Investigate JSONPath as an alternative to JMESPath #217

Open danielballan opened 2 years ago

danielballan commented 2 years ago

I learned about JSONPath when I noticed that Kubernetes uses it. It looks like an alternative to JMESPath. I am not sure if it is as formally specified or what other tradeoffs there might be, but its use by Kubernetes provides some basis to believe it will stick around. It would be worth understanding how they compare.

danielballan commented 2 years ago

Interesting arguments that JMESPath is the more robust choice: https://github.com/OAI/OpenAPI-Specification/discussions/2556#discussioncomment-678802

danielballan commented 2 years ago

I think JMESPath is the clear winner. Closing, but open to arguments if anyone wants to revive this in the future.

danielballan commented 1 year ago

Postgres implements JSONPath, and it looks like SQLite followed suit. If we want predicate push-down, that is the language we will need.

However, the points raised in the linked post above stand. JMESPath is much better specified. The question is whether to expose JMESPath in the HTTP API and convert or to expose JSONPath.

danielballan commented 1 year ago

As of September it looks like JSONPath is on track to be formally specified. https://datatracker.ietf.org/wg/jsonpath/about/

danielballan commented 8 months ago

The RFC is dormant, but there is nonetheless an actively maintained Python implementation of the proposal.

https://pypi.org/project/jsonpath-ng/

For use cases where we want to fish just a couple fields (e.g. some columns in a table of search results) out of some sprawling JSON metadata, we can get large performance wins from predicate push-down.

SQLite supports the basics: https://www.sqlite.org/json1.html#jex

And Postgres supports quite a lot, perhaps more than we would need to expose, at least at first: https://www.postgresql.org/docs/current/functions-json.html#FUNCTIONS-SQLJSON-PATH
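
To make the push-down idea concrete, here is a minimal sketch using SQLite's `json_extract` from Python's standard `sqlite3` module. The `nodes` table, its columns, and the sample metadata are hypothetical and are not Tiled's actual schema; the sketch also assumes the SQLite build bundled with Python includes the JSON functions, which is typical but not guaranteed.

```python
import json
import sqlite3

# Hypothetical table and metadata, for illustration only (not Tiled's schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (key TEXT PRIMARY KEY, metadata TEXT)")

docs = {
    "a": {"start": {"uid": "abc123", "scan_id": 1, "notes": "x" * 1000}},
    "b": {"start": {"uid": "def456", "scan_id": 2, "notes": "y" * 1000}},
}
conn.executemany(
    "INSERT INTO nodes (key, metadata) VALUES (?, ?)",
    [(key, json.dumps(doc)) for key, doc in docs.items()],
)

# The database extracts only the two requested fields, so the large
# "notes" values never leave SQLite or get parsed in Python.
rows = conn.execute(
    "SELECT key, json_extract(metadata, ?), json_extract(metadata, ?) "
    "FROM nodes ORDER BY key",
    ("$.start.uid", "$.start.scan_id"),
).fetchall()

for row in rows:
    print(row)
```

The same query shape should carry over to Postgres with its own path functions, though the syntax differs and is not shown here.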

danielballan commented 5 months ago

Latest thinking on this, per chat with @pbeaucage:

Deprecate select_metadata and our support for JMESPath, as JMESPath does too much: it is too complicated, and it may expose too much power to clients to put load on the server.
Add select to /search and /metadata, accepting JSONPath. (Adding it to /search is the important one, since the savings on a whole page of results are the most significant.) The parameter should be typed List[str], as in ?select=start.uid&select=start.scan_id. The returned metadata should always be a flat JSON object, keyed like {"start.uid": ..., "start.scan_id": ...}.
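
A rough sketch of the proposed behavior, using only the standard library. The function names `parse_select` and `flatten_selection` are hypothetical, and the sketch handles only plain dotted keys like `start.uid`, not full JSONPath. Returning `None` for a missing path is an assumption; the proposal above does not say what should happen in that case.

```python
from typing import Any, Dict, List
from urllib.parse import parse_qs, urlsplit


def parse_select(url: str) -> List[str]:
    """Collect repeated ?select=... query parameters into a list."""
    return parse_qs(urlsplit(url).query).get("select", [])


def flatten_selection(metadata: Dict[str, Any], paths: List[str]) -> Dict[str, Any]:
    """Return a flat dict keyed by each requested dotted path.

    Hypothetical helper: supports only dotted keys, not full JSONPath.
    Missing paths map to None (an assumption, not part of the proposal).
    """
    result: Dict[str, Any] = {}
    for path in paths:
        value: Any = metadata
        for part in path.split("."):
            if isinstance(value, dict) and part in value:
                value = value[part]
            else:
                value = None
                break
        result[path] = value
    return result


metadata = {"start": {"uid": "abc123", "scan_id": 5, "sample": {"name": "Cu"}}}
paths = parse_select("/search?select=start.uid&select=start.scan_id")
selected = flatten_selection(metadata, paths)
print(selected)
```

In a real implementation the path lookup would presumably be delegated to the database or to a JSONPath library rather than walked by hand like this.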