Currently, all STRING and SYMBOL values are stored on disk in UTF-16. This forces unnecessary re-encoding, since values sent from ILP, HTTP, and PGWire clients are all UTF-8.
We should switch from UTF-16 to UTF-8 in all files, with a migration for existing database files. Once mmapped, the in-memory layout for a STRING column should be as close to Apache Arrow as possible.
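
For illustration, here is a rough sketch of what an Arrow-like layout for a STRING column could look like: an offsets buffer of `n + 1` little-endian 32-bit integers plus one contiguous UTF-8 data buffer. The function names `encode_arrow_like` and `read_value` are hypothetical and not part of the codebase, and null handling (Arrow uses a separate validity bitmap) is left out. It is a sketch of the idea, not the proposed file format.

```python
import struct


def encode_arrow_like(values):
    """Pack strings into an offsets buffer and a contiguous UTF-8 data buffer.

    Hypothetical helper: mirrors the shape of Arrow's variable-size binary
    layout, without the validity bitmap.
    """
    offsets = [0]
    data = bytearray()
    for v in values:
        data += v.encode("utf-8")
        offsets.append(len(data))  # end of this value == start of the next
    # n + 1 little-endian int32 offsets
    offsets_buf = struct.pack(f"<{len(offsets)}i", *offsets)
    return offsets_buf, bytes(data)


def read_value(offsets_buf, data_buf, i):
    """Read value i by slicing the data buffer between two adjacent offsets."""
    start, end = struct.unpack_from("<2i", offsets_buf, 4 * i)
    return data_buf[start:end].decode("utf-8")


values = ["AAPL", "", "héllo"]
offsets_buf, data_buf = encode_arrow_like(values)

# Size of the same values if stored as UTF-16, for comparison.
utf16_size = sum(len(v.encode("utf-16-le")) for v in values)
utf8_size = len(data_buf)
```

With this shape, bytes received from a UTF-8 client can be appended to the data buffer without transcoding, and a value is located with two offset reads. For mostly-ASCII data the UTF-8 data buffer is also smaller than its UTF-16 equivalent.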
This is a preliminary step for Cold Storage (#5).