Unify the Parquet dictionary value decoders

The Parquet batch reader has many dictionary ID value decoders. Most of them allocate a new buffer on every readNext call, which hurts both reliability and performance. There is no need for a separate decoder per column type, with the extra memory allocations and copies that come with it. It would be nice to send a new PR that unifies the existing RLE dictionary decoders. After all, dictionary IDs can only be RLE/BP encoded, and their encoding does not depend on the data column type.

Ref: https://parquet.apache.org/docs/file-format/data-pages/encodings/ "Data page format: the bit width used to encode the entry ids stored as 1 byte (max bit width = 32), followed by the values encoded using RLE/Bit packed described above (with the given bit width)."
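To illustrate why a single decoder is enough, here is a minimal sketch of a type-agnostic dictionary ID decoder that follows the page layout quoted above and writes into a caller-provided buffer instead of allocating one per call. The class name `DictionaryIdDecoder` and its `readNext(int[], int, int)` signature are hypothetical and are not the existing Presto API; the sketch also skips bounds checks and reads bits one at a time for clarity rather than speed.

```java
import java.util.Arrays;

// Hypothetical sketch, not the Presto implementation: decodes RLE/bit-packed
// dictionary IDs from a dictionary-encoded data page into a caller-owned int[].
final class DictionaryIdDecoder {
    private final byte[] data;
    private final int bitWidth;
    private int pos;          // byte offset of the next run header
    private int remaining;    // values left in the current run
    private boolean packed;   // true if the current run is bit-packed
    private int rleValue;     // repeated value of the current RLE run
    private long bitPos;      // bit offset of the next bit-packed value

    DictionaryIdDecoder(byte[] page) {
        this.data = page;
        this.bitWidth = page[0] & 0xFF; // first byte of the page is the bit width
        if (bitWidth > 32) {
            throw new IllegalArgumentException("bit width must be <= 32: " + bitWidth);
        }
        this.pos = 1;
    }

    // Fills out[offset, offset + length) with dictionary IDs; allocates nothing.
    void readNext(int[] out, int offset, int length) {
        while (length > 0) {
            if (remaining == 0) {
                readRunHeader();
            }
            int n = Math.min(length, remaining);
            if (packed) {
                for (int i = 0; i < n; i++) {
                    out[offset + i] = readBits();
                }
            } else {
                Arrays.fill(out, offset, offset + n, rleValue);
            }
            offset += n;
            length -= n;
            remaining -= n;
        }
    }

    private void readRunHeader() {
        // The run header is a ULEB128 varint; its lowest bit selects the run type.
        int header = 0;
        int shift = 0;
        int b;
        do {
            b = data[pos++] & 0xFF;
            header |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);

        packed = (header & 1) == 1;
        if (packed) {
            int groups = header >>> 1;   // each group holds 8 values
            remaining = groups * 8;
            bitPos = (long) pos * 8;
            pos += groups * bitWidth;    // 8 values * bitWidth bits = bitWidth bytes
        } else {
            remaining = header >>> 1;
            int valueBytes = (bitWidth + 7) / 8;
            rleValue = 0;
            for (int i = 0; i < valueBytes; i++) {
                rleValue |= (data[pos++] & 0xFF) << (8 * i); // little-endian
            }
        }
    }

    private int readBits() {
        // Values are packed starting from the least significant bit of each byte.
        long value = 0;
        for (int i = 0; i < bitWidth; i++, bitPos++) {
            long bit = (data[(int) (bitPos >>> 3)] >>> (int) (bitPos & 7)) & 1;
            value |= bit << i;
        }
        return (int) value;
    }
}
```

A caller would construct the decoder once per page and reuse a single `int[]` across readNext calls. Turning the decoded IDs into BIGINT, VARCHAR, or any other column type then becomes a separate dictionary lookup step that does not need its own ID decoder.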

See https://github.com/prestodb/presto/pull/23584

prestodb/presto #23612