mitre / sparklyr.nested

A sparklyr extension for nested data
Apache License 2.0

array_zip support #15

Open mattpollock opened 6 years ago

mattpollock commented 6 years ago

From https://jira.apache.org/jira/browse/SPARK-23931:

SELECT array_zip(ARRAY[1, 2], ARRAY['1b', null, '3b']); -- [ROW(1, '1b'), ROW(2, null), ROW(null, '3b')]

Shoot for something like:

df %>%
  array_zip(fld1, fld2, fld4, .outname)

This should retain all other fields, drop the three input fields, and replace them with a single zipped field named by .outname.
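
As a rough local illustration of the zip semantics in the SQL example above, the padding behavior can be sketched in base R. This is not sparklyr code and not the proposed implementation: `zip_arrays` is a hypothetical helper name, it works on plain R vectors rather than Spark columns, and NA stands in for SQL null.

```r
# Hypothetical helper (not part of sparklyr.nested): zip vectors position by
# position, padding shorter inputs with NA where the SQL example pads with null.
zip_arrays <- function(...) {
  arrays <- list(...)
  if (length(arrays) == 0) return(list())
  n <- max(lengths(arrays))  # result is as long as the longest input
  lapply(seq_len(n), function(i) {
    # One "row" per position: the i-th element of each input, or NA if missing
    lapply(arrays, function(a) if (i <= length(a)) a[[i]] else NA)
  })
}

# Same inputs as the SQL example
zipped <- zip_arrays(c(1, 2), c("1b", NA, "3b"))
# zipped has three elements: (1, "1b"), (2, NA), (NA, "3b")
```

The proposed `array_zip` verb would apply the same idea per row to array columns on the Spark side, then drop the input fields.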

mattpollock commented 6 years ago

Tracking with #14.

mattpollock commented 6 years ago

@javierluraschi do you plan to do anything along these lines (using Spark SQL updates in v2.4 to better handle nested data) in sparklyr?