sunchao / parquet-rs

Apache Parquet implementation in Rust
Apache License 2.0

Writing Optional Field Values #210

Closed jcgomes90 closed 5 years ago

jcgomes90 commented 5 years ago

Whenever I create a schema with an OPTIONAL field and write data to that field, the data doesn't seem to get written to the Parquet file. My schema is as follows:

let message_type = "message schema {
    REQUIRED INT64 a;
    REQUIRED INT64 b;
    REQUIRED GROUP listA (LIST) {
        REPEATED GROUP listB (LIST) {
            OPTIONAL INT64 c;
        }
    }
}";

I write the data as follows, setting rep = 1 when there is more than one value for c:

typed_writer.write_batch(&[val], Some(&[1]), Some(&[rep])).unwrap();

Any idea why this is the case? Am I not writing the optional fields the right way?

sunchao commented 5 years ago

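@jcgomes90 : I think the definition level should be 2, because both the REPEATED group listB and the OPTIONAL field c contribute to the definition level. With a definition level of 1, the listB entry is defined but c is treated as null, so no value is stored. If you only have one value, then the repetition level should be 0, since it is the start of a new list.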

Therefore, if you do:

typed_writer.write_batch(&[val], Some(&[2]), Some(&[0])).unwrap();

then it should write some valid data.
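
For reference, the level arithmetic described above can be sketched in plain Rust, independent of the parquet crate. This is only an illustration under stated assumptions: `Repetition`, `max_levels`, and `encode_record` are hypothetical names invented for this example and are not part of the parquet-rs API. The sketch assumes the schema from the question, where column c sits on the path listA (REQUIRED), listB (REPEATED), c (OPTIONAL).

```rust
/// Repetition type of one node on the path from the schema root to a leaf.
/// (Hypothetical helper type for this sketch, not the parquet-rs enum.)
#[derive(Clone, Copy, Debug, PartialEq)]
enum Repetition {
    Required,
    Optional,
    Repeated,
}

/// Returns (max definition level, max repetition level) for a leaf column,
/// given the repetition of every node on its path, root excluded.
fn max_levels(path: &[Repetition]) -> (i16, i16) {
    let mut def = 0;
    let mut rep = 0;
    for r in path {
        match r {
            // REQUIRED nodes add nothing to either level.
            Repetition::Required => {}
            // OPTIONAL nodes add one definition level.
            Repetition::Optional => def += 1,
            // REPEATED nodes add one definition and one repetition level.
            Repetition::Repeated => {
                def += 1;
                rep += 1;
            }
        }
    }
    (def, rep)
}

/// Flattens the listB entries of one record into
/// (non-null values, definition levels, repetition levels) for column c.
/// Assumes the schema from the question (max def = 2, max rep = 1).
fn encode_record(entries: &[Option<i64>]) -> (Vec<i64>, Vec<i16>, Vec<i16>) {
    let mut values = Vec::new();
    let mut def_levels = Vec::new();
    let mut rep_levels = Vec::new();

    if entries.is_empty() {
        // Empty list: one level slot, nothing defined below listA.
        def_levels.push(0);
        rep_levels.push(0);
        return (values, def_levels, rep_levels);
    }

    for (i, entry) in entries.iter().enumerate() {
        // The first entry starts a new record (0); later entries repeat listB (1).
        rep_levels.push(if i == 0 { 0 } else { 1 });
        match entry {
            // c is present: fully defined, so use the max definition level.
            Some(v) => {
                values.push(*v);
                def_levels.push(2);
            }
            // The listB entry exists but c is null.
            None => def_levels.push(1),
        }
    }
    (values, def_levels, rep_levels)
}
```

With this sketch, a record holding a single value yields definition level 2 and repetition level 0, matching the call suggested above. The three vectors it returns would then be passed as `write_batch(&values, Some(&def_levels), Some(&rep_levels))`, in the same argument order used in this thread.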