tidwall / gjson

Get JSON values quickly - JSON parser for Go
MIT License

Issue with Query #315

Closed conikeec closed 1 year ago

conikeec commented 1 year ago

Hi,

I am trying to extract a sibling node's value based on a wildcard search, but I can't get it to work. I'm wondering whether my query syntax is the issue.

fn main() {

    let j = r#"{
    "categories": [
        {
            "category": "Sales",
            "skucode": 4567,
            "subCategory": [
                "CRM Software",
                "Sales Compensation Software",
                "AI Sales Assistant Software",
                "Auto Dialer Software",
                "Contract Analytics Software",
                "Contract Lifecycle Management (CLM) Software",
                "Contract Management Software",
                "Customer Revenue Optimization (CRO) Software",
                "E-Signature Software",
                "Field Sales Software",
                "Other Sales Software",
                "Partner Ecosystem Platforms Software",
                "Partner Management Software",
                "Presales Demo Automation Software",
                "PreSales Management Software",
                "Quote : Management CPQ Software",
                "Quote : Management Pricing Software",
                "Quote : Management Proposal Software",
                "Quote : Management Quote-to-Cash Software",
                "Quote : Management Visual Configuration Software",
                "Revenue Operations & Intelligence (RO&I) Software",
                "Sales Acceleration : Conversation Intelligence Software",
                "Sales Acceleration : Digital Sales Room Software",
                "Sales Acceleration : Email Tracking Software",
                "Sales Acceleration : Lead-to-Account Matching and Routing Software",
                "Sales Acceleration : Outbound Call Tracking Software",
                "Sales Acceleration : Sales Coaching Software",
                "Sales Acceleration : Sales Enablement Software",
                "Sales Acceleration : Sales Engagement Software",
                "Sales Acceleration : Salesforce CRM Document Generation Software",
                "Sales Acceleration : Sales Performance Management Software",
                "Sales Acceleration :  Planning Software",
                "Sales Acceleration :  Training and Onboarding Software",
                "Sales Acceleration Platforms",
                "Sales Analytics Software",
                "Sales Gamification Software",
                "Sales Intelligence Software",
                "Sales Platforms Software"
            ]
        }]}"#;

    let search_string = r#"categories.#.subCategory.#(%"CRM*")"#; // works

    // return category value based on wildcard search of subcategory
    let search_string = r#"categories.#.subCategory.#(%"CRM*").category"#; // does not work 

    let result = gjson::get(&j, search_string);
    println!("Result is : {}", result);
}

volans- commented 1 year ago

@conikeec, to filter on a sibling you have to query at a higher level so that you get the parent object. Queries can be nested.

So in your case you have to query for category items that have a subCategory key with at least one item starting with CRM. You also need to add a # at the end of the outermost query to return all matching items, not just the first. See also [1].

That returns a list of the category items that match your criteria.

Only at that point can you get the category field of each item, resulting in a list of categories.

So with query:

categories.#(subCategory.#(%"CRM*"))#.category

You get:

["Sales"]

I hope that helps.

[1] https://github.com/tidwall/gjson/blob/master/SYNTAX.md#queries
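
To make the semantics of the nested query concrete, here is a minimal sketch of the same selection in plain Rust, without gjson. The `Category` struct and the `categories_matching_prefix` helper are hypothetical names used only for illustration, and `starts_with` stands in for the `%"CRM*"` pattern match in this specific case; it is not how gjson evaluates the path internally.

```rust
// Hypothetical stand-in for one element of the "categories" array.
struct Category {
    category: &'static str,
    sub_category: Vec<&'static str>,
}

// Plain-Rust analogue of: categories.#(subCategory.#(%"CRM*"))#.category
fn categories_matching_prefix(categories: &[Category], prefix: &str) -> Vec<&'static str> {
    categories
        .iter()
        // Inner query: keep the parent object if any subCategory item starts with the prefix.
        .filter(|c| c.sub_category.iter().any(|s| s.starts_with(prefix)))
        // Trailing ".category": read the sibling field from each parent that was kept.
        .map(|c| c.category)
        .collect()
}

fn main() {
    let categories = vec![
        Category {
            category: "Sales",
            sub_category: vec!["CRM Software", "Sales Analytics Software"],
        },
        Category {
            category: "Excluded",
            sub_category: vec!["NOT CRM Software"],
        },
    ];
    let result = categories_matching_prefix(&categories, "CRM");
    println!("{:?}", result); // prints ["Sales"]
}
```

The filter runs on the parent objects rather than on the subCategory strings, which is why the sibling field is still reachable afterwards.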

conikeec commented 1 year ago

Thank you very much @volans- . It works. Question: is there a provision for case-insensitive search?

volans- commented 1 year ago

Not that I know of. I don't think there is regex support either, judging from issue #135.

But if the possible variants are limited you could achieve the same results using multipaths [1] combined with the @flatten modifier [2] to get a flat list:

[categories.#(subCategory.#(%"CRM*"))#.category,categories.#(subCategory.#(%"crm*"))#.category].@flatten

This might lead to duplicate results if an item matches both queries.

Another alternative could be to create your own custom modifier [3], or to run multiple queries and merge the results in your code.

[1] https://github.com/tidwall/gjson/blob/master/SYNTAX.md#multipaths
[2] https://github.com/tidwall/gjson/blob/master/SYNTAX.md#modifiers
[3] https://github.com/tidwall/gjson/blob/master/SYNTAX.md#custom-modifiers
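
If the flattened multipath result does contain duplicates, they can be removed in the calling code. The following is a minimal sketch that assumes the result has already been converted to a `Vec<String>`; `dedup_preserving_order` is a hypothetical helper, not part of gjson.

```rust
use std::collections::HashSet;

// Removes repeated values while keeping the first occurrence of each, in order.
fn dedup_preserving_order(items: Vec<String>) -> Vec<String> {
    let mut seen = HashSet::new();
    items
        .into_iter()
        // HashSet::insert returns false when the value was already present.
        .filter(|item| seen.insert(item.clone()))
        .collect()
}

fn main() {
    // Hypothetical flattened result in which "Sales" matched both sub-queries.
    let flattened = vec![
        "Sales".to_string(),
        "Marketing".to_string(),
        "Sales".to_string(),
    ];
    println!("{:?}", dedup_preserving_order(flattened)); // prints ["Sales", "Marketing"]
}
```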

volans- commented 1 year ago

@conikeec sorry for the bad formatting, apparently GitHub email reply support is fairly limited, I can't even edit the replies to add markdown from the browser now. I hope it's still understandable.

conikeec commented 1 year ago

Thanks again @volans- . I figured it's more optimal to lowercase the loaded JSON and search string. 😄
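
The lowercasing approach can be sketched as below, using only the standard library. `lowercase_for_search` is a hypothetical helper name. Note the assumption it relies on: lowercasing the whole document also lowercases the keys (subCategory becomes subcategory) and the returned values (Sales becomes sales), so the query path has to be lowercased as well.

```rust
// Lowercases both the JSON text and the query path so they stay aligned.
fn lowercase_for_search(json: &str, query: &str) -> (String, String) {
    (json.to_lowercase(), query.to_lowercase())
}

fn main() {
    let json = r#"{"categories":[{"category":"Sales","subCategory":["CRM Software"]}]}"#;
    let query = r#"categories.#(subCategory.#(%"CRM*"))#.category"#;

    let (lowered_json, lowered_query) = lowercase_for_search(json, query);

    // Keys, values and the pattern are now all lowercase.
    println!("{}", lowered_json);
    println!("{}", lowered_query);

    // The lowered strings would then be passed to gjson::get in place of the originals.
}
```

If the original casing of the returned values matters, the match can instead be run on a lowered copy and the result looked up again in the untouched document.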

Another quick question: I revised the JSON above to add another field, skucode. Using the same query, can I fetch both siblings, category and skucode, when subCategory matches?

volans- commented 1 year ago

@conikeec yes, you can use multipaths again. They can be used anywhere in the query and apply to the current object.

So with an input of:

{
    "categories": [
        {
            "category": "Sales",
            "skucode": "A001",
            "subCategory": [
                "CRM Software",
                "Another category"
            ]
        },
        {
            "category": "Excluded",
            "skucode": "A002",
            "subCategory": [
                "NOT CRM Software",
                "Another category"
            ]
        },
        {
            "category": "Communication",
            "skucode": "A003",
            "subCategory": [
                "CRM Software",
                "Another category"
            ]
        }
    ]
}

You can get a list of objects with the query:

categories.#(subCategory.#(%"CRM*"))#.{category,skucode}

that returns:

[{"category":"Sales","skucode":"A001"},{"category":"Communication","skucode":"A003"}]

You can customize the key names with:

categories.#(subCategory.#(%"CRM*"))#.{"cat":category,"sku":skucode}

that returns:

[{"cat":"Sales","sku":"A001"},{"cat":"Communication","sku":"A003"}]

Or get a list of two-item lists with:

categories.#(subCategory.#(%"CRM*"))#.[category,skucode]

that returns:

[["Sales","A001"],["Communication","A003"]]

conikeec commented 1 year ago

@volans- I appreciate the detailed reply. Again, thank you; this is what I was looking for.

Closing the issue