Open jhorstmann opened 4 months ago
The Java implementation appears to contradict the spec here:
public void writeBool(boolean b) throws TException {
  if (booleanField_ != null) {
    // we haven't written the field header yet
    writeFieldBeginInternal(booleanField_, b ? Types.BOOLEAN_TRUE : Types.BOOLEAN_FALSE);
    booleanField_ = null;
  } else {
    // we're not part of a field, so just write the value.
    writeByteDirect(b ? Types.BOOLEAN_TRUE : Types.BOOLEAN_FALSE);
  }
}
And here, called via getCompactType and writeCollectionBegin:
private static final byte[] ttypeToCompactType = new byte[18];
static {
  ttypeToCompactType[TType.STOP] = TType.STOP;
  ttypeToCompactType[TType.BOOL] = Types.BOOLEAN_TRUE;
  ...
}
The spec says
and
But it seems the ColumnIndex::null_pages field in alltypes_tiny_pages.parquet, written by parquet-mr version 1.12.0-SNAPSHOT (build 6901a2040848c6b37fa61f4b0a76246445f396db), encodes the element type as 1 and contains elements with value 2. We probably need to be lenient and support both, decoding element values as byte_value == 1.
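A minimal sketch of what such a lenient reader could look like. This is not the Thrift or parquet-mr code; the class and method names (LenientCompactBool, isBoolElementType, decodeBoolElement, decodeBoolList) are hypothetical. It assumes the usual compact-protocol list header (element type in the low nibble, size in the high nibble, with 0xF meaning a varint size follows) and accepts either 1 or 2 as the boolean element type.

```java
public final class LenientCompactBool {
    // Compact-protocol type ids, matching the Java writer quoted above.
    static final int BOOLEAN_TRUE = 1;
    static final int BOOLEAN_FALSE = 2;

    // Accept both ids as "list of bool": the Java writer emits 1.
    static boolean isBoolElementType(int elementType) {
        return elementType == BOOLEAN_TRUE || elementType == BOOLEAN_FALSE;
    }

    // Only 1 is true, so false decodes correctly whether it was
    // written as 0 or as 2 (BOOLEAN_FALSE, as the Java writer does).
    static boolean decodeBoolElement(byte value) {
        return value == 1;
    }

    // Decodes a compact-protocol list of booleans starting at buf[0].
    static boolean[] decodeBoolList(byte[] buf) {
        int pos = 0;
        int header = buf[pos++] & 0xFF;
        int elementType = header & 0x0F; // low nibble: element type
        int size = header >>> 4;         // high nibble: size if < 15
        if (size == 0x0F) {
            // Long form: the real size follows as an unsigned varint.
            size = 0;
            int shift = 0;
            int b;
            do {
                b = buf[pos++] & 0xFF;
                size |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
        }
        if (!isBoolElementType(elementType)) {
            throw new IllegalArgumentException(
                "not a bool list, element type " + elementType);
        }
        boolean[] out = new boolean[size];
        for (int i = 0; i < size; i++) {
            out[i] = decodeBoolElement(buf[pos++]);
        }
        return out;
    }
}
```

For example, decodeBoolList(new byte[] {0x31, 1, 2, 0}) reads a three-element list with element type 1 and yields true, false, false; the same payload with header 0x32 (element type 2) decodes identically.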