hiltontj / serde_json_path

Query serde_json Values with JSONPath
https://serdejsonpath.live/
MIT License
51 stars 4 forks source link

Query satisfied with value from unrelated key #49

Closed silverjam closed 1 year ago

silverjam commented 1 year ago

Hello! Thanks for the amazing work on this crate! I'm super excited to be able to use this crate in one of my projects.

In trying to utilize the crate though I think I've found a bug with how queries work. I believe simple queries such as $[?(@.msg_type == 74)] will be "satisfied" by values from unrelated fields.

For example, see the following gist: https://gist.github.com/silverjam/b73c6c8ff3f6f5a20e59ad17e234346f

This runs the following code:

fn main() {
    let val = serde_json::json!({"msg_type": 75, "sender": 74});
    let q = serde_json_path::JsonPath::parse("$[?(@.msg_type == 74)]").unwrap();
    let res = q.query(&val).all();
    assert!(res.is_empty(), "query should be return no results: {res:?}");
}

You'll note that the query finds the 74 value from the sender key even though the query should only be against the msg_type. I'm very new to JSONPath, but I don't think this is intended behavior.

The following modification works though (that is, changing the query to $.msg_type[?(@ == 74)]):

fn main() {
    let val = serde_json::json!({"msg_type": 75, "sender": 74});
    let q = serde_json_path::JsonPath::parse("$.msg_type[?(@ == 74)]").unwrap();
    let res = q.query(&val).all();
    assert!(res.is_empty(), "query should be return no results: {res:?}");
}

I'd love to be able to help fix this bug if you don't have bandwidth, so if you can provide pointers on where to look I'd be happy to help (if I'm able).

silverjam commented 1 year ago

I'm able to work around this issue by wrapping the value in a list, so maybe I'm doing something that JSONPath doesn't expect:

fn main() {
    let val = serde_json::json!([{"msg_type": 75, "sender": 74}]);
    let q = serde_json_path::JsonPath::parse("$[?(@.msg_type == 74)]").unwrap();
    let res = q.query(&val).all();
    assert!(res.is_empty(), "query should be return no results: {res:?}");
}
hiltontj commented 1 year ago

Hey @silverjam - thank you for raising the issue! That certainly looks like a bug and I am able to replicate it in the sandbox. I am open to pull requests; however, I haven't set up a CONTRIBUTING.md yet to help people get started. Either way, I think I can get some time later this week or weekend to dig into this.

As far as my knowledge of JSONPath is concerned, given the input:

{"msg_type": 75, "sender": 74}

The query string $[?(@.msg_type == 74)] should produce an empty node list.

I would argue, given the structure of the JSON, that the query is malformed, but that does not negate the fact that serde_json_path is producing false positives here. I'll try to shed a bit of light on JSONPath and what is going on with the queries you have tried...


In this first case producing the bug, we are applying the filter ?(@.msg_type == 74) to the root node $ of the object. As per the JSONPath Spec, when a filter is applied to an object node, it will test against each object member value. So, the filter will be tested against two nodes here, for which the current node specifier @ will refer to the values 75, and then 74, respectively; in either case, neither is an object with the key msg_type, so the filter should not produce anything.

In your second example, the query $.msg_type[?(@ == 74)] is applying the filter ?(@ == 74) to the node located in the msg_type field of the root object. In this case, that node is the JSON number 75. Since filters only work on nodes that are JSON arrays, or JSON objects, then this will result in an empty node list.

Unfortunately, in the above two examples, the spirit of JSONPath is not to produce helpful error messages or warnings for malformed queries, but to just produce an empty result.

In your final example, when you change the input to an array, i.e.,

[{"msg_type": 75, "sender": 74}]

Then the query works as you might expect. Now, the root node $ is the array, and the filter is being applied against each array element; and indeed, each array element is an object with the key msg_type. So that is re-assuring.

That final case is a more typical use case for JSONPath filters: you have an array of similar objects somewhere in the root node and you want to filter them down based on some criteria. If you want to elaborate more on your particular use case, and why you are using JSONPath, I'd be happy to discuss here.

In the meantime, I will go looking into why it is producing false positives, and see if I can get a fix in.

hiltontj commented 1 year ago

Hey @silverjam - I believe I have the fix in #50. All tests are green, so I should be able to release before the week is out, but feel free to take a look / review if you'd like.

Thank you again for raising the issue, and for putting in the time to share all the details that you did.

hiltontj commented 1 year ago

@silverjam - v0.6.2 was just released with the fix 🍻

silverjam commented 1 year ago

Cheers! 🙌