brimdata / zed

A novel data lake based on super-structured data
https://zed.brimdata.io/
BSD 3-Clause "New" or "Revised" License

partial unions of unions in aggregations #3171

Open mccanne opened 2 years ago

mccanne commented 2 years ago

Aggregators that turn heterogeneously typed values into an aggregated union should handle a union of unions by merging the member types into a single union instead of creating sub-unions inside a union. This can happen when partial results are combined after each one encountered a different subset of the underlying types in its subsection of the scanned data.

philrz commented 2 years ago

Here's my attempt at a repro of this with Zed commit 2d52400.

$ cat data.zson
{num: 123 (int32) ((int32,float64))}
{num: 456.0 (float32) ((int32,float32))}
{num: 789.0 (float64) ((float32,float64))}

$ zq -version
Version: v1.2.0-29-g2d52400f

$ zq -Z 'union(num)' data.zson
{
    union: |[
        123 (int32) ((int32,float64)),
        456. (float32) ((int32,float32)),
        789. ((float32,float64))
    ]|
}

I think the correct output would be something like:

{
    union: |[
        123 (int32) ((int32,float32,float64)),
        456. (float32) ((int32,float32,float64)),
        789. ((int32,float32,float64))
    ]|
}
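
The merge described in this issue can be sketched roughly as follows. This is only an illustration in Python, not Zed's actual implementation: the `flatten_union` helper and the representation of a union type as a tuple of type names are assumptions made for the example. It takes the three partial union types from the repro above and flattens them into one union with no nested unions.

```python
# Hypothetical model (not Zed's real type system): a primitive type is a
# string such as "int32", and a union type is a tuple of member types.

def flatten_union(members):
    """Merge nested unions into one flat, de-duplicated tuple of types."""
    flat = []
    for m in members:
        if isinstance(m, tuple):
            # Nested union: splice its (recursively flattened) members in.
            for inner in flatten_union(m):
                if inner not in flat:
                    flat.append(inner)
        elif m not in flat:
            flat.append(m)
    return tuple(flat)

# Union types seen by three partial results, taken from data.zson above.
partials = [
    ("int32", "float64"),
    ("int32", "float32"),
    ("float32", "float64"),
]

merged = flatten_union(partials)
print(merged)
```

The sketch keeps member types in first-seen order, so the result contains `int32`, `float32`, and `float64` but not necessarily in the order shown in the expected output above. A real fix would presumably also need to re-tag each value against the merged union type, which this example does not attempt.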