sunchao / parquet-rs

Apache Parquet implementation in Rust
Apache License 2.0

Writing repeated group within a group? #208

Closed jcgomes90 closed 5 years ago

jcgomes90 commented 5 years ago

I have a schema as follows:

let message_type = "message schema
    {
        REQUIRED GROUP values (LIST) {
            REPEATED GROUP grp (LIST){
                REQUIRED BYTE_ARRAY a (utf8);
                REQUIRED BYTE_ARRAY b (utf8);
                REQUIRED BYTE_ARRAY c (utf8);
            }
        }
    }";

I am writing the columns as such:

let val = ByteArray::from("one");
let val2 = ByteArray::from("two");
let mut row_group_writer = writer.next_row_group().unwrap();
while let Some(mut col_writer) = row_group_writer.next_column().unwrap() {
    match col_writer {
        ColumnWriter::ByteArrayColumnWriter(ref mut typed_writer) => {
            // Arguments: values, definition levels, repetition levels.
            // The values are cloned because the same batch is written to every column.
            typed_writer
                .write_batch(&[val.clone(), val2.clone()], Some(&[1, 1]), Some(&[0, 0]))
                .unwrap();
        }
        _ => {}
    }
    row_group_writer.close_column(col_writer).unwrap();
}

That gives me Parquet output like:

values: .grp ..a = one ..b = one ..c = one

values: .grp ..a = two ..b = two ..c = two

Whereas I want something like:

values: .grp ..a = one ..b = one ..c = one .grp ..a = two ..b = two ..c = two

What is the right way to do this?
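
For reference, the difference between the two outputs above comes down to repetition levels. In Parquet's record-shredding model, a repetition level of 0 starts a new record, while a level equal to the depth of the repeated field (1 for grp in this schema) continues the list inside the current record. The following is a minimal sketch of that rule using only the standard library. The function name assemble_records is hypothetical and not part of parquet-rs, and it assumes a column whose maximum repetition level is 1.

```rust
/// Groups a flat column of values into records using Parquet-style
/// repetition levels, assuming a maximum repetition level of 1.
/// Level 0 starts a new record; level 1 appends to the current record.
/// (Hypothetical helper for illustration; not a parquet-rs API.)
fn assemble_records<T: Clone>(values: &[T], rep_levels: &[i16]) -> Vec<Vec<T>> {
    assert_eq!(
        values.len(),
        rep_levels.len(),
        "expected one repetition level per value"
    );
    let mut records: Vec<Vec<T>> = Vec::new();
    for (value, &level) in values.iter().zip(rep_levels) {
        if level == 0 || records.is_empty() {
            // A new top-level record (row) begins here.
            records.push(vec![value.clone()]);
        } else {
            // Same record, next element of the repeated group.
            records.last_mut().unwrap().push(value.clone());
        }
    }
    records
}
```

With levels [0, 0] the sketch yields two records holding one value each, which matches the two-row output. With levels [0, 1] it yields a single record holding both values, which is the desired shape. Under that reading, passing Some(&[0, 1]) as the third argument of write_batch for each column would be expected to produce one row with two grp entries, though this has not been verified against the writer here.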