Open brijos opened 2 months ago
SELECT
json_extract(myblob, '$.name') AS name,
json_extract(myblob, '$.projects') AS projects
FROM dataset
It would help a lot to support a json_extract function for manipulating JSON string fields, as shown above.
@anasalkouz @YANG-DB @brijos
I have created a working prototype for the json_extract function.
It is currently implemented in the sql sub-project so that the JSON functions are available not only as PPL commands. In other words, the function can be used like any other built-in function in both SQL and PPL.
The proposed (and so far implemented) syntax is:
json_extract(<json>,<path>)
<json>
JSON as a string, either from a table cell or as a literal (mandatory).
<path>
A JSON Path or JSON Pointer expression (mandatory).
The function returns the result as a string (a scalar value or full JSON).
An error is thrown when:
No error is thrown when:
Examples:
#JSON Path
select json_extract('{\"name\":\"saly\"}', '$.name')
#JSON Pointer
select json_extract('{\"name\":\"saly\"}', '/name')
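To make the proposed semantics concrete, here is a minimal Python sketch of how a call such as json_extract(<json>, <path>) might resolve either kind of expression. It is not the prototype's code: the helper below is a hypothetical stand-in, it handles only a small JSON Path subset (dotted keys and [n] indexes), and its error behavior (plain Python exceptions) is an assumption, since the error conditions are not listed above.

```python
import json
import re


def json_extract(doc: str, path: str) -> str:
    """Hypothetical stand-in for the proposed json_extract(<json>, <path>)."""
    value = json.loads(doc)
    if path.startswith("$"):
        # JSON Path subset (assumption): dotted keys and [n] indexes only.
        tokens = re.findall(r"\.([^.\[\]]+)|\[(\d+)\]", path[1:])
        steps = [key if key else int(index) for key, index in tokens]
    else:
        # JSON Pointer (RFC 6901): "/"-separated tokens, "~1" is "/", "~0" is "~".
        steps = [t.replace("~1", "/").replace("~0", "~") for t in path.split("/")[1:]]
    for step in steps:
        if isinstance(value, list):
            value = value[int(step)]  # array element by position
        else:
            value = value[step]  # object member by key
    # String scalars come back as-is; everything else is returned as JSON text.
    return value if isinstance(value, str) else json.dumps(value)


print(json_extract('{"name":"saly"}', "$.name"))  # JSON Path
print(json_extract('{"name":"saly"}', "/name"))   # JSON Pointer
```

Both calls resolve to the same member, which mirrors the two SQL examples above.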
@anasalkouz @YANG-DB @rupal-bq Any comments on the proposal so far?
Can we perhaps specify a document ID to use as the json for the query? I've got a lot of json blobs that I'd love to search through instead of breaking them up before ingest.
Can you post an example of what this could look like?
Is your feature request related to a problem? Community members have asked for easier JSON parsing and analysis capabilities that let them not only search JSON logs and extract fields without writing complex parse expressions, but also perform computations on JSON array values, such as summing all values in an array whose number of elements is not known in advance.
What solution would you like? Allow users to extract and transform data from JSON-formatted events and fields. Users should be able to extract all values in an array by specifying a wildcard for the element position and then run an aggregation over them. Users should be able to extract 1) single or multiple top-level fields, 2) nested fields, and 3) keys in arrays, and perform operations on the values.
**Examples**
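As an illustration only (this is not the syntax proposed in the request), the wildcard-plus-aggregation idea can be sketched in Python: collect one field from every element of an array of unknown length, then sum the results. The helper name `extract_all`, the `hours` field, and the sample event are hypothetical; only the `projects` key echoes the query shown earlier in the thread.

```python
import json


def extract_all(doc: str, array_key: str, field: str) -> list:
    """Collect `field` from every element of the array stored at `array_key`.

    This mirrors what a wildcard path such as $.projects[*].hours would select.
    Elements that lack the field are skipped (an assumption of this sketch).
    """
    items = json.loads(doc)[array_key]
    return [item[field] for item in items if field in item]


# Hypothetical event; the number of array elements is not fixed in advance.
event = '{"projects": [{"name": "a", "hours": 3}, {"name": "b", "hours": 4.5}, {"name": "c"}]}'

hours = extract_all(event, "projects", "hours")
total = sum(hours)  # aggregation over all extracted values
print(hours, total)
```

The aggregation step is kept separate from the extraction step, so the same extracted list could feed other operations such as a minimum, maximum, or count.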
What alternatives have you considered? No other solutions are available in PPL.
Do you have any additional context? No