tidwall / gjson

Get JSON values quickly - JSON parser for Go
MIT License
14.12k stars, 847 forks

Join objects in nested array with another field #169

Open hackery opened 4 years ago

hackery commented 4 years ago

I've been struggling with @join and multipaths, and I'm not sure whether either can do this. From the Murphys example, emit:

[
  { "movie": "Deer Hunter", "friend": "Dale" },
  { "movie": "Deer Hunter", "friend": "Roger" },
  { "movie": "Deer Hunter", "friend": "Jane" }
]

In my case, I have a timestamp and other metadata at the top level, and I want to fold those into each item of a series of results. This is via Telegraf's JSON parser, so it would have to be just a query string rather than code written against the API.

Is this possible? (Maybe it requires backtracking.)

volans- commented 2 years ago

I did play a bit with it and I think it might not be possible, but it is an interesting case in my opinion.

Given that with {"movie":[fav\.movie],"friend":friends.#.first}.@group it is possible to get (indentation is mine):

[
    {"movie":"Deer Hunter","friend":"Dale"},
    {"friend":"Roger"},
    {"friend":"Jane"}
]

I was wondering whether it would be possible to support this case: if one of the values to group is not an array, it gets copied into every grouped item. In other words, {"movie":fav\.movie,"friend":friends.#.first}.@group (without the square brackets around fav\.movie) could then generate the output requested here.

I'm not that familiar with GJSON internals though, so this might not be an option, or it could conflict with other existing syntax features.
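To make the proposed behavior concrete, here is a minimal sketch in plain Go using only the standard library. It is not GJSON code and does not reflect how @group is implemented; the `field` type and the `broadcastGroup` function are hypothetical names invented for illustration. It assumes that scalar fields are copied into every output object and that list fields contribute their i-th element.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// field is one named input to the hypothetical broadcasting group:
// either a single scalar value or a list of values.
type field struct {
	name   string
	values []any // used when isList is true
	scalar any   // used when isList is false
	isList bool
}

// broadcastGroup builds one object per index of the longest list field.
// Scalar fields are copied into every object; list fields contribute
// their i-th element when they have one.
func broadcastGroup(fields []field) []map[string]any {
	n := 0
	for _, f := range fields {
		if f.isList && len(f.values) > n {
			n = len(f.values)
		}
	}
	out := make([]map[string]any, n)
	for i := range out {
		obj := make(map[string]any)
		for _, f := range fields {
			switch {
			case !f.isList:
				obj[f.name] = f.scalar
			case i < len(f.values):
				obj[f.name] = f.values[i]
			}
		}
		out[i] = obj
	}
	return out
}

func main() {
	rows := broadcastGroup([]field{
		{name: "movie", scalar: "Deer Hunter"},
		{name: "friend", values: []any{"Dale", "Roger", "Jane"}, isList: true},
	})
	b, err := json.Marshal(rows)
	if err != nil {
		panic(err)
	}
	// Keys appear in alphabetical order because encoding/json sorts map keys.
	fmt.Println(string(b))
}
```

With the inputs from this thread, every one of the three objects carries both "movie" and "friend", which is the shape requested in the original question.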

tidwall commented 2 years ago

Sadly, I don't know of any easy way without adding a new modifier.

Here's one I named @fill, which fills the last value in the array (which must be an array of objects) with the data from the previous values. Keys that an object already has are left untouched.

[{movie:fav\.movie},friends].@fill
[
  {"movie": "Deer Hunter", "first": "Dale", "last": "Murphy", "age": 44, "nets": ["ig", "fb", "tw"]},
  {"movie": "Deer Hunter", "first": "Roger", "last": "Craig", "age": 68, "nets": ["fb", "tw"]},
  {"movie": "Deer Hunter", "first": "Jane","last": "Murphy", "age": 47, "nets": ["ig", "tw"]}
]

Here's the modifier (it needs the sort and strings packages imported alongside gjson):

gjson.AddModifier("fill", func(json, arg string) string {
    arr := gjson.Parse(json)
    if !arr.IsArray() {
        return ""
    }
    var last gjson.Result
    items := make(map[string][2]gjson.Result)
    arr.ForEach(func(_, value gjson.Result) bool {
        if value.IsObject() {
            value.ForEach(func(key, value gjson.Result) bool {
                items[key.String()] = [2]gjson.Result{key, value}
                return true
            })
        } else {
            last = value
        }
        return true
    })
    if !last.IsArray() {
        return ""
    }
    oitems := make([][2]gjson.Result, 0, len(items))
    for _, item := range items {
        oitems = append(oitems, item)
    }
    sort.Slice(oitems, func(i, j int) bool {
        return oitems[i][0].Less(oitems[j][0], false)
    })
    var out []byte
    out = append(out, '[')
    var i int
    last.ForEach(func(_, value gjson.Result) bool {
        if !value.IsObject() {
            // Skip non-objects
            return true
        }
        if i > 0 {
            out = append(out, ',')
        }
        out = append(out, '{')
        var j int
        for _, item := range oitems {
            var found bool
            // Only replace values that do not already exist.
            value.ForEach(func(key, _ gjson.Result) bool {
                if key.String() == item[0].String() {
                    found = true
                    return false
                }
                return true
            })
            if found {
                continue
            }
            if j > 0 {
                out = append(out, ',')
            }
            out = append(out, item[0].Raw...)
            out = append(out, ':')
            out = append(out, item[1].Raw...)
            j++
        }
        suffix := strings.TrimSpace(value.Raw[1:])
        if len(suffix) > 0 && suffix[0] != '}' && j > 0 {
            out = append(out, ',')
        }
        out = append(out, suffix...)
        i++
        return true
    })
    out = append(out, ']')
    return string(out)
})

tidwall commented 2 years ago

For this JSON:

{
  "first": "Andy",
  "fav.movie": "Deer Hunter",
  "city": "Phoenix",
  "friends": [
    {"first": "Dale"},
    {"movie":"Overboard","first": "Roger"},
    {"city":"Anchorage","first": "Jane", "last": "Murphy"}
  ]
}

[{movie:fav\.movie,city},friends].@fill
[
  {"city":"Phoenix","movie":"Deer Hunter","first": "Dale"},
  {"city":"Phoenix","movie":"Overboard","first": "Roger"},
  {"movie":"Deer Hunter","city":"Anchorage","first": "Jane", "last": "Murphy"}
]
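
The precedence rule visible in this output (a friend's own "movie" or "city" wins over the top-level value) can be sketched without GJSON. The following uses only encoding/json; `fillDefaults` is a hypothetical helper name, not part of any library. Unlike the @fill modifier above, this sketch does not preserve key order, because it goes through Go maps.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// fillDefaults returns a copy of rows in which every object also carries
// the keys from defaults, except where the object already defines that key.
// The input rows are not modified.
func fillDefaults(defaults map[string]any, rows []map[string]any) []map[string]any {
	out := make([]map[string]any, len(rows))
	for i, row := range rows {
		merged := make(map[string]any, len(row)+len(defaults))
		for k, v := range defaults {
			merged[k] = v
		}
		for k, v := range row {
			merged[k] = v // the row's own value wins over the default
		}
		out[i] = merged
	}
	return out
}

func main() {
	defaults := map[string]any{"movie": "Deer Hunter", "city": "Phoenix"}

	var rows []map[string]any
	err := json.Unmarshal([]byte(`[
		{"first": "Dale"},
		{"movie": "Overboard", "first": "Roger"},
		{"city": "Anchorage", "first": "Jane", "last": "Murphy"}
	]`), &rows)
	if err != nil {
		panic(err)
	}

	b, err := json.Marshal(fillDefaults(defaults, rows))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```

Roger keeps "Overboard" and Jane keeps "Anchorage", while Dale receives both top-level values, matching the result shown above apart from key order.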