jqlang / jq

Command-line JSON processor
https://jqlang.github.io/jq/

how to define variables recursively? #1670

Closed quchunguang closed 6 years ago

quchunguang commented 6 years ago

How can I define variables recursively? Any ideas?

I want to run this query:

.[]|{"name": .basic.name, "v1": .notinjson.v1}

Here .notinjson.v1 is not present in the input, but it can be calculated by:

.[]|.notinjson.v2+1

Here .notinjson.v2 is not present in the input either, but it can be calculated by:

.[]|.basic.v2+1

How can I combine these three steps into one command? Thanks.

pkoppstein commented 6 years ago

The following pipeline seems to meet the requirements, but without knowing much about the input and the expected output, it's hard to say:

.[]
| .notinjson.v2 = (.basic.v2+1)
| .notinjson.v1 = (.notinjson.v2+1)
| {"name": .basic.name, "v1": .notinjson.v1}

With a bit of work, this could be done using a recursively-defined jq function up to some arbitrary depth.
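
For readers who think more easily in Python, the three steps of the pipeline above can be sketched as follows. This is only a rough model, not jq itself: the input records are hypothetical (the thread never shows the real data for this question), and `derive` is a name made up for illustration.

```python
# Hypothetical input; the real data for this question is not shown in the thread.
records = [
    {"basic": {"name": "alice", "v2": 1}},
    {"basic": {"name": "bob", "v2": 10}},
]


def derive(record):
    """Mirror the three jq steps for one array element."""
    v2 = record["basic"]["v2"] + 1  # .notinjson.v2 = (.basic.v2+1)
    v1 = v2 + 1                     # .notinjson.v1 = (.notinjson.v2+1)
    return {"name": record["basic"]["name"], "v1": v1}


# `.[]` visits each element of the input array in turn.
results = [derive(r) for r in records]
print(results)
```

Each intermediate value is computed before the step that needs it, which is the same ordering the jq pipeline relies on.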

For future reference, please ask usage questions at stackoverflow.com using the jq tag. Please also follow the https://stackoverflow.com/help/mcve guidelines as applicable, both here and there.

quchunguang commented 6 years ago

thanks.

quchunguang commented 6 years ago

This might be a bug, so I am posting here again.

Running jq '.[0]' csv2json_output.json gives:

{
  "basic": {
    "name": "Kevin S. Sa",
    "class": "class Q2018",
    "phone": "1834566"
  },
  "status": {
    "current": 0.375,
    "plan": 0.787
  },
  "lastlogin": "2018-04-25 16:11:59"
}

The order of the definitions seems to matter. Running ./jq '[.[0] | .calc.class.v1=(.status.current+1)|.calc.class.v2=(.status.plan + .calc.class.v1) | {calc: {"v3": (.calc.class.v1 + .calc.class.v2)}}]' csv2json_output.json gives the right answer:

[
  {
    "calc": {
      "v3": 3.537
    }
  }
]

But with the first two assignments swapped, it seems that .calc.class.v2 = .status.plan + 0. Running ./jq '[.[0] | .calc.class.v2=(.status.plan + .calc.class.v1) | .calc.class.v1=(.status.current+1)| {calc: {"v3": (.calc.class.v1 + .calc.class.v2)}}]' csv2json_output.json gives:

[
  {
    "calc": {
      "v3": 2.162
    }
  }
]

Must I handle the dependencies myself? That doesn't seem right.

wtlangford commented 6 years ago

The order is important because you're defining a series of steps to take.

1) set .calc.class.v1 = .status.current + 1
2) set .calc.class.v2 = .status.plan + .calc.class.v1
3) set .calc.class.v3 = .calc.class.v1 + .calc.class.v2

If I swap steps 1 and 2, then .calc.class.v1 is still unset when .calc.class.v2 is computed. Unset values are null (which acts like 0 for the purposes of addition!), so clearly .calc.class.v2 will be .status.plan + 0.

This is not a bug.
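
The arithmetic behind both results can be reproduced with a small Python model. It is a sketch under stated assumptions rather than a description of jq's internals: `jq_add`, `set_v1`, `set_v2`, and `v3` are hypothetical helpers, `None` stands in for jq's null, and the status values are taken from the sample record earlier in the thread.

```python
STATUS = {"current": 0.375, "plan": 0.787}  # from the sample record in the thread


def jq_add(a, b):
    """Model jq's `+`, where null (None here) is the identity: null + x == x."""
    if a is None:
        return b
    if b is None:
        return a
    return a + b


def set_v1(calc):
    calc["v1"] = jq_add(STATUS["current"], 1)


def set_v2(calc):
    # calc.get("v1") is None if v1 has not been assigned yet, like an unset path in jq.
    calc["v2"] = jq_add(STATUS["plan"], calc.get("v1"))


def v3(steps):
    """Run the assignments in the given order, then compute v1 + v2."""
    calc = {}
    for step in steps:
        step(calc)
    return round(jq_add(calc.get("v1"), calc.get("v2")), 3)


print(v3([set_v1, set_v2]))  # 3.537
print(v3([set_v2, set_v1]))  # 2.162
```

In the second ordering v2 is computed while v1 is still missing, so v2 equals .status.plan alone and the final sum drops by exactly the value v1 would have contributed.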

quchunguang commented 6 years ago

I think this is a design problem. The dependencies can be viewed as a DAG, so the order of the definitions need not matter. I wrote a small Python demo that handles this. Anyway, thanks.

import re
from collections import defaultdict

dic = {
    '.calc.aa': '[] | {"a": .calc.dd, "b": (.calc.bb/.calc.cc)}',
    '.calc.bb': '[] | {"a": .calc.dd, "b": (.calc.dd/.calc.dd)}',
    '.calc.ee': '[] | {"a": .calc.bb, "b": (.ee/.dd)}',
    '.calc.ff': '[] | {"a": .calc.aa, "b": (.ee/.dd)}',
    '.calc.cc': '[] | {"a": .calc.bb, "b": (.ee/.dd)}',
    '.calc.dd': '[] | {"a": .dd, "b": (.ee/.dd)}',
}

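# Example shape: graph = {'A': ['B', 'C', 'D'], 'B': ['D'], 'C': ['B'], 'D': []}
graph = defaultdict(list)  # node -> list of .calc.* nodes it depends on
ll = []  # nodes in dependency order (dependencies first)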

def gen_graph(start_node):
    if start_node in graph:
        return

    if start_node not in dic:
        raise Exception(start_node + " not in dict")

    pattern = r"(\.calc\.[a-zA-Z][\.a-zA-Z0-9_]*)"
    graph[start_node] = list(set(re.findall(pattern, dic[start_node])))

    for key in graph[start_node]:
        gen_graph(key)

def is_leaf(deps):
    for node in deps:
        if node not in ll:
            return False
    return True

def topological_sort(start_node):
    deps = graph[start_node]

    for node in deps:
        if node in ll:
            continue
        topological_sort(node)

    if is_leaf(deps):
        print(start_node)
        ll.append(start_node)

if __name__ == '__main__':
    try:
        gen_graph('.calc.aa')
    except Exception as e:
        print(e)
        exit()
    topological_sort('.calc.aa')

# OUTPUT:
#.calc.dd
#.calc.bb
#.calc.cc
#.calc.aa
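
The same ordering can probably be obtained with less code from the standard library's graphlib module (Python 3.9 or later). The sketch below is an alternative to the demo above, not a replacement for it: it assumes the same `.calc.*` naming convention, keeps only the four definitions reachable from `.calc.aa`, and orders every definition in the mapping instead of walking from a start node. The names `definitions`, `CALC_REF`, and `order` are made up for illustration.

```python
import re
from graphlib import TopologicalSorter  # standard library since Python 3.9

# Each target maps to the expression that defines it (subset of the demo's `dic`).
definitions = {
    ".calc.aa": '[] | {"a": .calc.dd, "b": (.calc.bb/.calc.cc)}',
    ".calc.bb": '[] | {"a": .calc.dd, "b": (.calc.dd/.calc.dd)}',
    ".calc.cc": '[] | {"a": .calc.bb, "b": (.ee/.dd)}',
    ".calc.dd": '[] | {"a": .dd, "b": (.ee/.dd)}',
}

# Matches references such as .calc.bb inside an expression.
CALC_REF = re.compile(r"\.calc\.[A-Za-z][A-Za-z0-9_]*")

# Map each target to the set of .calc.* names its expression mentions.
graph = {name: set(CALC_REF.findall(expr)) for name, expr in definitions.items()}

# static_order() yields dependencies before dependents
# and raises graphlib.CycleError if the definitions are circular.
order = list(TopologicalSorter(graph).static_order())
print(order)  # ['.calc.dd', '.calc.bb', '.calc.cc', '.calc.aa']
```

Because these four definitions form a single chain (dd, then bb, then cc, then aa), there is only one valid order, and it matches the output of the demo.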

nicowilliams commented 6 years ago

Works for me with jq 1.4, 1.5, and master:

EDIT: Er, never mind, I mis-copy/pasted.

nicowilliams commented 6 years ago

Yes, in jq things happen in order. It's very sequential, though it does have backtracking.