Reading the spec carefully, I see the following paragraph:
One important thing to remember to understand the examples is that not every level of the tree needs a new definition or repetition level. Only repeated fields increment the repetition level, only non-required fields increment the definition level. As those levels are very small bounded values they can be encoded efficiently using a few bits.
Required fields are always defined and do not need a definition level. Non repeated fields do not need a repetition level.
This means that any path to a leaf node in which every path element has optional: false can only have a definition level of zero (each higher definition level needs an optional: true somewhere in the path).
However, when I look at one of the materialize tests in parquetjs, I see:
Nothing in the path is optional, yet many of the dlevels are nonzero. If I change the dlevels to all zeros, the quantity data is not populated in the resulting records.
Is it possible that there is a discrepancy in the implementation?
I think I got it. A repeated field also increments the definition level, because it can be empty. So a repeated field is essentially always optional, in the sense that it can hold zero values.
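To make that concrete, here is a minimal sketch of how the levels could come out for a single repeated field whose values are themselves required. Both maxLevels and shredRepeated are hypothetical helpers written for this illustration, not parquetjs functions, and the one-field schema is an assumption rather than the schema used in the materialize test.

```javascript
// Hypothetical helper (not part of parquetjs): maximum repetition and
// definition levels for a path of schema fields.
function maxLevels(path) {
  let rLevelMax = 0;
  let dLevelMax = 0;
  for (const field of path) {
    if (field.repeated) {
      // A repeated field may hold zero values, so it counts toward both levels.
      rLevelMax += 1;
      dLevelMax += 1;
    } else if (field.optional) {
      dLevelMax += 1;
    }
  }
  return { rLevelMax, dLevelMax };
}

// Hypothetical shredder for one top-level repeated field with required values.
function shredRepeated(records, name) {
  const values = [];
  const dlevels = [];
  const rlevels = [];
  for (const record of records) {
    const items = record[name] || [];
    if (items.length === 0) {
      // Empty list: a level entry is still written, but no value.
      dlevels.push(0);
      rlevels.push(0);
      continue;
    }
    items.forEach((item, i) => {
      values.push(item);
      dlevels.push(1);                 // the repeated field is defined here
      rlevels.push(i === 0 ? 0 : 1);   // 0 starts a new record, 1 continues the list
    });
  }
  return { values, dlevels, rlevels };
}

// Assumed schema: a single repeated field, nothing marked optional.
const levels = maxLevels([{ name: 'quantity', repeated: true, optional: false }]);
console.log(levels); // { rLevelMax: 1, dLevelMax: 1 }

const shredded = shredRepeated(
  [{ quantity: [10, 20] }, { quantity: [] }, { quantity: [5] }],
  'quantity'
);
console.log(shredded.dlevels); // [ 1, 1, 0, 1 ]
console.log(shredded.rlevels); // [ 0, 1, 0, 0 ]
```

With this reading, setting every dlevel to zero tells the reader that every list is empty, which would match the quantity data disappearing from the materialized records.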