hiltontj / serde_json_path

Query serde_json Values with JSONPath
https://serdejsonpath.live/
MIT License
50 stars 3 forks source link

Refine the `NodeList` API #11

Closed hiltontj closed 1 year ago

hiltontj commented 1 year ago

The NodeList type's API could use some refinement.

Currently, NodeList has two methods: one() and all(), whose APIs are, respectively:

fn one(self) -> Option<&'a Value>
fn all(self) -> Vec<&'a Value>

I'll discuss some ideas for each below. This has nothing to do with The Three Musketeers.

Fallibility of one()

Currently, one() will return Some i.f.f. the NodeList contains a single value.

This was loosely inspired by sqlx's API (see the fetch_one and fetch_optional methods in the sqlx docs).

Suggested Change

In the spirit of being explicit, I suggest altering the API to the following set of methods:

fn one(self) -> Result<&'a Value, serde_json_path::Error>
fn one_opt(self) -> Option<&'a Value>

In other words, making one() fallible enforces the notion that the caller expects their query to return a single node, whereas, one_opt() would provide a more lenient interface. Similar to the current one(), they would produce Result::Ok and Option::Some, respectively, i.f.f. there is a single node in the NodeList.

For the fallible one(), variants could be added to serde_json_path::Error to express both failure cases: when no nodes were returned by the query, or when more than one node was returned by the query.

The need for all()

I'm questioning the need for all(). In sqlx's case, having each fetch_* method makes sense, in part, because those methods are also invoking the actual asynchronous query execution. In serde_json_path's case, the query is already done, and the NodeList is already populated, so all() is just giving access to the inner Vec.

This is convenient, as we get all the functionality and trait implementations of a Vec<&'a serde_json::Value>. But all of that convenience could be implemented directly on NodeList, and eliminate the need for the all() method.

I think the need still exists for one() and one_opt(), as they represent an additional processing step on the query result.

hiltontj commented 1 year ago

This was originally brought up in #9

Dav1dde commented 1 year ago

For the fallible one(), variants could be added to serde_json_path::Error to express both failure cases: when no nodes were returned by the query, or when more than one node was returned by the query.

I feel like this should be a separate Error, something like:

enum SomeError {
    Empty,
    Multiple,
}

Maybe relevant, itertools has an exactly_one: https://docs.rs/itertools/latest/itertools/trait.Itertools.html#method.exactly_one (and at most one).

I'm questioning the need for all(). In sqlx's case, having each fetch_* method makes sense, in part, because those methods are also invoking the actual asynchronous query execution. In serde_json_path's case, the query is already done, and the NodeList is already populated, so all() is just giving access to the inner Vec.

If there is a .one(), I think an .all() makes sense, simply because since there already is a wrapper type (NodeList) and that would be the logical opposite to .one().

Random nit: .all() should probably return a slice not a &Vec.

I don't think copying the interface of a Vec to a NodeList is a good idea, but alternatively maybe a impl From<NodeList> for Vec and IntoIterator makes sense.

There is also possibility of implementing Deref and DerefMut for NodeList and deref to Vec, but this is often considered bad practice.

hiltontj commented 1 year ago

I feel like this should be a separate Error

This would be good separation of concerns, i.e., separating this from parsing-related errors, which should be expanded in future.

Maybe relevant, itertools has an exactly_one: https://docs.rs/itertools/latest/itertools/trait.Itertools.html#method.exactly_one (and at most one).

An excellent example, and worth copying.

Random nit: .all() should probably return a slice not a &Vec.

This is a great point, and I think is correct.

I don't think copying the interface of a Vec to a NodeList is a good idea, but alternatively maybe a impl From for Vec and IntoIterator makes sense.

Indeed. NodeList already has:

impl<'a> IntoIterator for NodeList<'a> {
    type Item = &'a Value;
    /* ... */
}

As well as a impl From<Vec<&Value>> for NodeList. An impl From<NodeList> for Vec<&Value> could be useful as well.

hiltontj commented 1 year ago

Random nit: .all() should probably return a slice not a &Vec.

This may not be possible. The NodeList is meant to be ephemeral. It holds onto a Vec<&'a Value>, where 'a is the lifetime of the original Value that was queried.

The current API is this:

impl<'a> NodeList<'a> {
    /* ... */
    pub fn all(self) -> Vec<&'a Value> {
        self.nodes
    }
    /* .. */
}

It consumes self to get the inner Vec. To return a slice, I think the function would need to change to (perhaps there is an alternative?):

pub fn all(&self) -> &[&Value] {
    &self.nodes // or self.nodes.as_slice()
}

This presents a challenge, because, the following will then not compile:

let value = json!({"foo": [1, 2, 3]});
let path = JsonPath::parse("$.foo.*")?;
let nodes = path.query(&value).all();

Due to the error:

error[E0716]: temporary value dropped while borrowed
   |
   |     let nodes = path.query(&value).all();
   |                 ^^^^^^^^^^^^^^^^^^      - temporary value is freed at the end of this statement
   |                 |
   |                 creates a temporary value which is freed while still in use

In other words, you need to do:

let value = json!({"foo": [1, 2, 3]});
let path = JsonPath::parse("$.foo.*")?;
let query = path.query(&value);
let nodes = query.all();

Unless there is a case for having the query result live beyond the access, i.e., the call to all(), exactly_one(), etc., then my preference is to keep returning a Vec.

Dav1dde commented 1 year ago

This may not be possible. The NodeList is meant to be ephemeral. It holds onto a Vec<&'a Value>, where 'a is the lifetime of the original Value that was queried.

Ah, my bad, I misread the signature and thought it returns a &Vec. This also makes the mentioned From impl less important.

hiltontj commented 1 year ago

This is addressed in #13

hiltontj commented 1 year ago

This should be settled with the 0.5.1 release.