noborus / trdsql

CLI tool that can execute SQL queries on CSV, LTSV, JSON, YAML and TBLN. Can output to various formats.
https://noborus.github.io/trdsql/
MIT License
2.01k stars 77 forks source link

Grouping entries with same key inside json output #102

Open sbeaupre opened 4 years ago

sbeaupre commented 4 years ago

This one is related to a comment in issue #100

When targetting json, a flat array with objects are returned, but when using joins, it would be interesting to return an array with a list of objects, each containing an array of objects coming from the joined tables.

Example:

❯ docker run --rm -it -v $(pwd):/tmp noborus/trdsql -ojson "SELECT u.c1 as id,u.c2 as name,h.c2 as date FROM /tmp/user.csv as u LEFT JOIN /tmp/hist.csv as h ON(u.c1=h.c1)" > out.json

results in:

❯ cat out.json
[
  {
    "date": "2017-7-10",
    "id": "1",
    "name": "userA"
  },
  {
    "date": "2017-7-10",
    "id": "2",
    "name": "userB"
  },
  {
    "date": "2017-7-11",
    "id": "2",
    "name": "userB"
  }
]

But it would be cool to have the following output, using the filenames or maybe aliases ('h' as alias for 'hist.csv') as object keys for the nested objects:

[
  {
    "date": "2017-7-10",
    "id": "1",
    "name": "userA"
  },
  {
    "id": "2",
    "name": "userB",
    "h": [
      {
        "date": "2017-7-10"
      },
      {
        "date": "2017-7-11"
      }
    ]
  }
]

Having multiple left joins gives more nested objects on the same level as 'h' above.

noborus commented 4 years ago

Thank you for a good suggestion.

But for now, I have no idea how to implement it. I'm thinking about using SQL JSON functions to combine nested outputs.

sbeaupre commented 4 years ago

You're right, it is probably easier to do this (when using column names in first row):

-oraw -ih "SELECT json_object('id', u.id, 'name', u.name, 'dates',json_group_array(h.date)) FROM ./user.csv AS u LEFT JOIN ./hist.csv as h  ON (u.id=h.id) GROUP BY u.id"

which results in

{"id":"1","name":"userA","dates":["2017-7-10"]}
{"id":"2","name":"userB","dates":["2017-7-10","2017-7-11"]}