I have a use case that I'm not sure I can achieve with this library without forking it. I have JSON objects that look like this:
{
  "foo": "bar",
  "baz": {
    "qux": "fus"
  }
}
What I would like to do is something like this:
CREATE EXTERNAL TABLE xxx (
  foo STRING,
  qux STRING
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES (
  'mapping.qux' = 'baz.qux'
)
In other words, I want to expose the nested qux field (inside baz) as a top-level column.
I have previously achieved this with a second table: I declare the nested object with the usual colName struct<field:type> column type, and then select the nested properties into the final table. However, I would like to avoid this extra step, because the table will read a very large amount of data and the extra pass over it would make things much slower.
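For reference, the two-table workaround looks roughly like the sketch below. The table names (xxx_staging, xxx_flat) and the LOCATION path are placeholders I made up for this example, not the real ones:

```sql
-- Staging table: exposes the nested JSON object as a struct column.
-- The LOCATION path is a placeholder.
CREATE EXTERNAL TABLE xxx_staging (
  foo STRING,
  baz STRUCT<qux:STRING>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION '/path/to/json';

-- Final table with the flattened layout I actually want to query.
CREATE TABLE xxx_flat (
  foo STRING,
  qux STRING
);

-- The extra step I would like to avoid: copy every row,
-- pulling the nested field up to the top level.
INSERT OVERWRITE TABLE xxx_flat
SELECT foo, baz.qux AS qux
FROM xxx_staging;
```

The INSERT ... SELECT is the part that hurts, since it rewrites the entire dataset just to rename one nested field.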