Closed jpmckinney closed 3 years ago
The style with INNER JOIN LITERAL
from https://github.com/MaayanLab/signature-commons-metadata-api/blob/master/src/migration/1568055725225-jsonb-deep-key-value.ts is about 10% faster:
CREATE OR REPLACE FUNCTION flatten (jsonb)
RETURNS TABLE (
path text,
object_property integer,
array_item integer)
LANGUAGE 'sql'
PARALLEL SAFE
AS $$
-- https://www.postgresql.org/docs/current/queries-with.html
WITH RECURSIVE t (
KEY,
value,
object_property,
array_item
) AS (
SELECT
j.key,
j.value,
1,
0
FROM
jsonb_each($1) AS j
UNION ALL (
SELECT
CASE tt.path_count
WHEN 1 THEN
concat(t.key, '/', tt.key)
ELSE
t.key
END,
tt.value,
tt.path_count,
tt.array_count
FROM
t
INNER JOIN LATERAL (
SELECT
ttt.key,
ttt.value,
1 path_count,
0 array_count
FROM
jsonb_each(t.value) AS ttt
WHERE
jsonb_typeof(t.value) = 'object'
UNION ALL
SELECT
'',
ttt.value,
0 path_count,
1 array_count
FROM
jsonb_array_elements(t.value) AS ttt
WHERE
jsonb_typeof(t.value) = 'array'
AND jsonb_typeof(t.value -> 0) = 'object'
) AS tt ON TRUE
)
)
SELECT
KEY AS path,
object_property,
array_item
FROM
t;
$$;
In summary:
The fhirbase/master-class is fastest, which is not that surprising, as it's also the simplest (no subquery, no join).
We can maybe reduce the number of iterations performed by WITH RECURSIVE
by calling jsonb_each
on values of jsonb_array_elements
like in https://github.com/elixir-oslo/trackfind/blob/7559e5f9f517b44b93d76caa762424e8c6811444/src/main/resources/schema.sql#L170-L212, if we had a way to also track the number of array entries.
Update: Tried:
SELECT
prev.key || '/' || next.key,
next.value,
1,
1
FROM
prev,
jsonb_array_elements(prev.value) entry,
jsonb_each(entry) next
WHERE
jsonb_typeof(prev.value) = 'array'
AND jsonb_typeof(entry) = 'object'
But it's not possible to get results like:
collection_id,release_type,path,object_property,array_count,distinct_releases
1,release,awards,100,301,100
1,release,awards/amendment,301,0,100
if we try to accumulate object_property and array_count in the same iteration. The above yields:
1,release,awards,100,0,100
1,release,awards/amendment,301,301,100
Some similar implementations:
https://github.com/fhirbase/master-class/blob/93692db5094e1ca56aba7c0adb671fa6104f4e3e/sql-result https://github.com/cajnoj/sql/blob/4d3b44ddb8009212abeba5e07da4a15f22348d03/PostgreSQL/JSON/traverse_jsonb.sql https://github.com/Anonim147/GoDAPP/blob/ce79df128850dc211681936ada3e1bcc0565f7e3/sql/recursive_with_array_solved.sql https://github.com/MaayanLab/signature-commons-metadata-api/blob/2f83022cb8d41f06caf8e14ab38d94393c8a37ec/src/migration/1568055725225-jsonb-deep-key-value.ts