Open c-nv-s opened 1 year ago
@c-nv-s Interesting idea. Do you have a concept in your mind of what the syntax for that might look like?
whatever keeps it most transparent for the end user that you can think of.
In postgres if you have a table called the_customer_orders_table
and it has a jsonb column called thedata
you reference a field something like this:
select thedata->'items'->'product_name' from the_customer_orders_table ;
in sq this could look something like this:
sq '.the_customer_orders_table | .thedata::jsonb.items.product_name, .thedata::jsonb.items.product_quantity, etc '
but if you think that could get very verbose with all the selectors, you could go fully declarative like this guy haha https://github.com/xcite-db/Unquery/blob/main/TUTORIAL.adoc
but if you think that could get very verbose with all the selectors, you could go fully declarative like this guy haha https://github.com/xcite-db/Unquery/blob/main/TUTORIAL.adoc
I'd never seen that tool before. There are dozens of us!
whatever keeps it most transparent for the end user that you can think of. In postgres if you have a table called
the_customer_orders_table
and it has a jsonb column calledthedata
you reference a field something like this:select thedata->'items'->'product_name' from the_customer_orders_table ;
in sq this could look something like this:
sq '.the_customer_orders_table | .thedata::jsonb.items.product_name, .thedata::jsonb.items.product_quantity, etc '
Thanks for the input @c-nv-s . The latter suggestion with the double colons (.thedata::jsonb...
) prob not going to be viable, because the recently released column aliases feature uses the colon, e.g.
$ sq '.actor | .first_name:given_name'
The postgres "pointer style" might work, e.g.
$ sq '.actor.jsonblob->age'.
That would combine with column aliases to look like:
$ sq '.actor.jsonblob->age:how_old'
That doesn't seem too bad. Now, to combine it with whitespace names:
$ sq '.actor."json blob"->"person age":how_old'
Again, that could work.
For further nested JSON:
$ sq '.actor.jsonblob->person.age'
That seems pretty ok.
I'll have to give it some more noodling. Let me know if you have additional thoughts.
Or maybeeee something like:
$ sq .actor.jsonblob[person.age]
Although I'm not sure I like that as much.
One potential downside to the arrow is that I was considering using it to incorporate insertion into the query language.
Right now you can do:
$ sq .actor --insert @othersrc.table
The theory being that MAYBE the insert could be implemented as:
$ sq '.actor -> @othersrc.table'
But I'm pretty lukewarm about it. There's other options too, like
$ sq '.actor | insert(@othersrc.table)'
Which might be clearer. Again, not sure if we'll take this route at all, or if the query language should stay exclusively for querying.
Another thing that drew me to sq was the prospect of having a tool that work similarly to jq which eases the brain tax of having to learn multiple languages and thier individual quirks. so maybe yoou could consider the question "how can I keep this feature most jq-like?" other than that i'm pretty easy-going on what you choose. https://github.com/itchyny/gojq
fyi (just in case it wasn't clear) I mentioned this because I would say that your last example of the possible insert
command is definitely the more clear and jq
-like solution
$ sq '.actor | insert(@othersrc.table)'
any more thoughts on this request? this feature and inline datasources are the final puzzle piece for me with this tool.
@c-nv-s I'm finally investigating JSON support (can you tell I have thanksgiving week off? 😉)...
Quick question for you: do you use the "native" pg JSON operators in your queries, or jsonpath queries?
E.g. I've got a test table with info about GitHub repos (just because their API is a good source of JSON), looking like this:
$ sq inspect .github_json_awk5z6e6 -v
NAME TYPE ROWS COLS NAME TYPE PK
github_json_awk5z6e6 table 2 3 repo_name text
repo json
issues json
Do you do this?
select gh.repo::json->>'node_id'
from github_json_awk5z6e6 gh;
Or this?
select jsonb_path_query(gh.repo::jsonb, '$.node_id')->>0
from github_json_awk5z6e6 gh;
At this point, I'm leaning towards implementing the feature syntax via jsonpath, largely because that seems more portable across different databases, e.g. MySQL, SQL Server, etc. So, right now, the sq
syntax might end up looking like:
$ sq '.github_json_awk5z6e6 | .repo->>$.node_id'
But, I'm still in the investigation phase. What would be really helpful is if you could provide some sample queries that you use in your workflow 🙏. I need a feel for how complex they might be.
And any other early feedback is very welcome too.
Although, being that jsonpath is effectively another language to learn, maybe I'll wrap a jq veneer over the jsonpath. E.g. given this:
select jsonb_path_query(issues::jsonb, '$[*].title')->>0
from github_json_awk5z6e6;
In jsonpath style, that would be:
$ sq '.github_json_awk5z6e6 | .issues->$[*].title'
But, in jq style, it would be:
$ sq '.github_json_awk5z6e6 | .issues->.[].title'
hey neil, that jq style veneer does look nice and intuitive.
To answer your first question, I use the first mentioned query style, however my intention is to to get json responses as quickly as I can out of postgres and into jq, so my queries to postgres would usually aim for the most sensible top level node of the json and then the rest of the data manipulation would be done with jq.
psql -h localhost -p 5432 -U mypostgresuser -d shoppingappdb -tc "select jsonb_agg(data->'items'->'product') from theCollectionOfOrders " | jq ' ... '
I'll see if I can paste a snippet from the little postgres json cheatsheet I collated from various online sources
some snippets from my cheatsheet sourced online (was never intended for public viewing but what the hell)
CREATE TABLE theOrdersTable (
id integer NOT NULL PRIMARY KEY,
data jsonb NOT NULL
);
INSERT INTO theOrdersTable (data)
VALUES('{ "Id": 53635, "customer": "Lily Bush", "items": {"product": "Diaper","qty": 24}}'),
('{ "Id": 87644, "customer": "Josh William", "items": {"product": "Toy Car","qty": 1}}'),
('{ "Id": 44487, "customer": "Mary Clark", "items": {"product": "Toy Train","qty": 2}}');
# three ways to get whole table as a json array
select to_jsonb(theOrdersTable) from theOrdersTable ;
select jsonb_agg(theOrdersTable) from theOrdersTable ;
select row_to_json(theOrdersTable) from theOrdersTable ;
# get all the json objects from the data column with pretty or regular json output
select jsonb_pretty(data) from theOrdersTable ;
select data::jsonb from theOrdersTable ;
# get a specific field in the json objects
select data->'items'->'product' from theOrdersTable ;
# build a json response object yourself manually
select jsonb_pretty( jsonb_agg( jsonb_build_object('id', id, 'items', data->'items'))) from theOrdersTables ;
# get the total number of objects which match a given json path (filter)
select * from theOrdersTable where jsonb_path_exists(data, '$.customer') ;
# the total number of objects but filter on a condition
select * from theOrdersTable where jsonb_path_exists(data, '$.customer ? (@ == "John Doe")') ;
# combine filter with selection
select data->'items'->'product' from theOrdersTable where jsonb_path_exists(data, '$.customer ? (@ == "John Doe")') ;
[452/1868]
# select and limit results
select data->'items'->'product' from theOrdersTable limit 2 ;
# expand the outermost JSON object into a set of key-value pairs
select jsonb_each(data) from theOrdersTable ;
# combine filter and object selection (!!jsonb_each(x) x MUST MATCH TO AN OBJECT IN THE JSON NOT VALUE!!)
select jsonb_each(data->'items') from theOrdersTable where jsonb_path_exists(data, '$.customer ? (@ == "John Doe")') ;
# get the keys of each object
select jsonb_object_keys(data->'items') from theOrdersTable where jsonb_path_exists(data, '$.customer') ;
# get the typeof the item
select jsonb_typeof(data->'items') from theOrdersTable where jsonb_path_exists (data, '$.customer') ;
# arrays in the json data can also be accessed. e.g.:
SELECT count(*) FROM thCollectionOfPosts WHERE jsonb_path_exists(data, '$.path.to.theJSONArray[*] ? (@ == "someValueYouWantToMatchOn")'
)
# building complex nested queries
SELECT
jsonb_pretty(
jsonb_build_object(
'theAmountPurchased', json_agg(theOrdersTable.data->'items'->'qty'),
'theFavoriteProduct', json_agg(theOrdersTable.data->'items'->'product')
)
)
FROM theOrdersTable
# or with limiting from result of another query
select row_to_json(theOrdersTable) from ( select id, to_jsonb(data->'items'->'product') from theOrdersTable ) theOrdersTable limit 2 ;
select jsonb_pretty( row_to_json(theOrdersTable)::jsonb) from ( select jsonb_build_object('theData', data->'items') from theOrdersTabl
e limit 2 ) theOrdersTable ;
## CRUD OPERATIONS
## UPDATING
# update type 1 - add a new outer-level field and value
UPDATE theOrdersTable
SET data = data || '{"the_delivery_priority": ["Next Day"]}'
WHERE data->>'customer'::text IN ( 'John Doe', 'Lily Bush');
# update type 2 - set/update existing field (creating it if it doesn't already exist)
# in this example we create an order_ref field in the json from the table's own id field
UPDATE theOrdersTable
SET data = jsonb_set(
data,
'{order_ref}',
id::TEXT::JSONB,
true); # this boolean sets whether to create if it doesn't already exist which is true by default so can be omitted
# if for some reason you had the 'data' column as a text field type you would have to convert to json before updating e.g.
UPDATE "theOrdersTable"
SET "data"=jsonb_set("data"::jsonb, '{url}', '"images/0005.jpg"', true)
WHERE "details"::json->>'name'='Eiffel Tower';
# adding a value to a json array
UPDATE USER_CONFIGURATIONS
SET DATA =
jsonb_set(DATA, '{configurations}'::text[], DATA ->'configurations' || '{"name":"firstconf", "theme":"dark", "textsize":"large"}'::json
b,
true)
WHERE (DATA ->> 'userid') = 'user@example.com';
# or
UPDATE airports
SET airport =
JSONB_SET(airport, '{airport_keywords}',
(SELECT (airport -> 'airport_keywords') || TO_JSONB('BWI'::TEXT) FROM airports WHERE airport ->> 'ident' = 'KBWI'),
false)
WHERE airport ->> 'ident' = 'KBWI';
# change value of a field in an array
UPDATE USER_CONFIGURATIONS
SET DATA = JSONB_SET(
DATA,
'{configurations}'::text[],
(((DATA -> 'configurations')
-
(SELECT i
FROM generate_series(0, jsonb_array_length(data->'configurations') - 1) AS i
WHERE (data->'configurations'->i->>'name' = 'firstconf')))::jsonb
||
'{"name":"first", "theme":"light"}'::jsonb))
WHERE (DATA ->> 'userid') = 'user@example.com';
# change value of a field in a nested document
UPDATE airports
SET airport = JSONB_SET(
airport,
'{runways}',
(WITH heli_runs AS (
SELECT airport -> 'runways' AS heli_run
FROM airports
WHERE airport ->> 'ident' = 'MD25')
SELECT JSONB_SET(
JSONB_ARRAY_ELEMENTS(heli_run),
'{surface}',
TO_JSONB('asph'::text),
false)
FROM heli_runs),
false)
WHERE airport ->> 'ident' = 'MD25';
## DELETING
# delete type 1 - deleting an outermost field
UPDATE theOrdersTable set data = ( data - 'delivery_information' ) ;
# or remove an entire entry with the DELETE command
DELETE FROM theOrdersTable
WHERE (DATA->>'customer') = 'John Doe'::text
# delete type 2 - deleting inner fields and arrays
# remove the second element from an array
UPDATE USER_CONFIGURATIONS
SET DATA = DATA #- '{configurations, 1}'::text[]
WHERE (DATA ->> 'userid') = 'anotheruser@example.com';
# or remove array element with key name 'myconf2'
UPDATE USER_CONFIGURATIONS
SET DATA = DATA #-
('{configurations,' || (
SELECT i
FROM generate_series(0, jsonb_array_length(DATA->'configurations') - 1) AS i
WHERE (DATA->'configurations'->i->>'name' = 'myconf2')
) || '}')::text[]
WHERE (DATA ->> 'userid') = 'anotheruser@example.com';
# or
UPDATE airports
SET airport =
JSONB_SET(
airport,
'{airport_keywords}',
(WITH kjfk_keys AS (
SELECT JSONB_ARRAY_ELEMENTS(airport -> 'airport_keywords') keywords
FROM airports
WHERE airport ->> 'ident' = 'KJFK')
SELECT JSONB_AGG(keywords)
FROM kjfk_keys
WHERE keywords::TEXT != '"Idlewild"'))
WHERE airport ->> 'ident' = 'KJFK';
# or we can delete the entire array by updating it with an empty array
UPDATE USER_CONFIGURATIONS
SET DATA =
jsonb_set(DATA, '{configurations}'::text[], '[]'::jsonb) WHERE (DATA ->> 'userid') = 'example@example.com';
would be nice if sq could reference fields from json data that has been saved in a table under the
jsonb
orjson
type in postgres.