Closed: sgoley closed this issue 2 months ago
An alternative might be to build something like this as a separate UDF:
-- extract all key value pairs as an array from a json dict
-- input: json string with a dictionary
-- returns: list of struct <key, value>
CREATE TEMP FUNCTION EXTRACT_KV_PAIRS(json_str STRING)
RETURNS ARRAY<STRUCT<key STRING, value STRING>>
LANGUAGE js AS """
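  try {
    const json_dict = JSON.parse(json_str);
    // Emit one {key, value} struct per top-level entry; each value is re-serialized as a JSON string.
    return Object.entries(json_dict).map(
      ([key, value]) => ({key: key, value: JSON.stringify(value)}));
  } catch (e) {
    // Convert the Error to a string so it fits the STRING-typed value field.
    return [{key: "error", value: String(e)}];
  }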
""";
Source: https://medium.com/google-cloud/extracting-json-key-value-pairs-in-bigquery-1bb9d0ec0b6d
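For anyone who wants to try the UDF body outside BigQuery first, here is a minimal sketch of the same logic as plain JavaScript. The function name `extractKvPairs` and the sample input are made up for illustration; only the parse / `Object.entries` / `JSON.stringify` steps come from the snippet above.

```javascript
// Local stand-in for the body of EXTRACT_KV_PAIRS (hypothetical name, not part of bqutils).
function extractKvPairs(jsonStr) {
  try {
    const dict = JSON.parse(jsonStr);
    // One {key, value} pair per top-level entry; values are serialized back to JSON text.
    return Object.entries(dict).map(([key, value]) => ({
      key,
      value: JSON.stringify(value),
    }));
  } catch (e) {
    // Mirror the UDF: report the failure as a single "error" pair.
    return [{ key: "error", value: String(e) }];
  }
}

const pairs = extractKvPairs('{"id": 7, "tags": ["a", "b"], "owner": {"name": "x"}}');
console.log(pairs);
// [ { key: 'id', value: '7' },
//   { key: 'tags', value: '["a","b"]' },
//   { key: 'owner', value: '{"name":"x"}' } ]

const failed = extractKvPairs("not json");
console.log(failed[0].key); // error
```

Note that because every value goes through `JSON.stringify`, plain string values keep their JSON quotes (for example `"abc"` comes back as `"\"abc\""`), so downstream SQL may need to unquote them.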
@plaflamme Just contributed the json_extract_key_value_pairs function in #408, which should help with this use case. Please let us know if there are any issues.
Issue:
`bqutils.fn.json_extract_values` returns unusable values (`[object Object]`) when children are objects or sub-collections.
Example:
![image](https://user-images.githubusercontent.com/10283176/221068485-861c9522-554f-427b-977c-9feac39be3b5.png)
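A guess at where `[object Object]` comes from, assuming the function hands nested JavaScript objects back as strings without serializing them (its source is not shown in this thread, so this is only an assumption). The sample object is made up:

```javascript
// A nested child like the ones in the screenshot (sample data, invented here).
const child = { city: "Paris", zip: "75001" };

// Implicit string coercion discards the contents of a plain object.
const coerced = String(child);

// Explicit serialization keeps them.
const serialized = JSON.stringify(child);

console.log(coerced);    // [object Object]
console.log(serialized); // {"city":"Paris","zip":"75001"}
```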
Sample query:
I used a single row here because the public dataset is 14GB and unpartitioned.
Desired Output:
`bqutils.fn.json_extract_values` returns a usable value (even just a JSON string) of the contents of those objects.
Thanks!
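To make the desired behavior concrete, here is a rough sketch in plain JavaScript. `extractValuesAsJson` is a hypothetical helper, not the bqutils implementation: scalars are returned as plain strings, while objects and arrays come back as JSON strings.

```javascript
// Hypothetical helper illustrating the requested output; not the bqutils code.
function extractValuesAsJson(jsonStr) {
  const parsed = JSON.parse(jsonStr);
  return Object.values(parsed).map((v) =>
    // Objects and arrays are serialized; scalars (and null) are stringified as-is.
    v !== null && typeof v === "object" ? JSON.stringify(v) : String(v)
  );
}

const values = extractValuesAsJson(
  '{"name": "Ann", "age": 30, "address": {"city": "Paris"}, "tags": [1, 2]}'
);
console.log(values);
// [ 'Ann', '30', '{"city":"Paris"}', '[1,2]' ]
```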