Is it possible to save custom metadata in the file when writing to a parquet file?
For example, with Dask, users can add custom metadata to the output files with this:
custom_metadata = {"custom_metadata": "my custom metadata"}
dataframe.to_parquet(path, custom_metadata=custom_metadata)
This code adds the custom metadata to the metadata of the saved parquet files, and the metadata can then be read back in with pyarrow.parquet.read_metadata.
Is it possible to do something similar with Koalas? So far, I have not been able to find a way. I also tried manually updating the metadata in the files after writing them with ks.DataFrame.to_parquet, but that causes a checksum mismatch when I try to read the files back into a DataFrame with koalas.read_parquet.