Expected Result

A new partitioned Parquet file should be created locally or in S3.
Actual Result
From Rust implementation:
```
DatasetError: Failed while saving data to data set
EagerPolarsDataset(file_format=parquet, filepath=/tmp/test.parquet,
load_args={}, protocol=file, save_args={'partition_by': ['dt1y']}).
'BytesIO' object cannot be converted to 'PyString'
```
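The Rust-side error suggests polars' partitioned write expects a real filesystem path string, while the dataset hands it an in-memory buffer opened via fsspec. A stdlib-only sketch of that mismatch (illustrative only, not Kedro's actual code path):

```python
import io
import os

# A BytesIO stands in for the fsspec-opened file object the dataset passes in.
buf = io.BytesIO()

# Partitioned writers need a directory path, not a file object; trying to
# treat a buffer as a path fails, in the same spirit as the
# 'BytesIO' object cannot be converted to 'PyString' error above.
try:
    os.fspath(buf)
except TypeError as exc:
    print(f"TypeError: {exc}")
```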
From Pyarrow:
```
DatasetError: Failed while saving data to data set
LazyPolarsDataset(filepath=/tmp/test.parquet, load_args={}, protocol=file,
save_args={'pyarrow_options': {'compression': zstd, 'partition_cols': ['dt1y'],
'write_statistics': True}, 'use_pyarrow': True}).
Argument 'filesystem' has incorrect type (expected pyarrow._fs.FileSystem, got
NoneType)
```
Your Environment
Kedro version used (`pip show kedro` or `kedro -V`): 0.19.3
Polars: 1.9.0 and 1.6.0
Python version used (`python -V`): 3.11
Operating system and version: macOS M1 using Docker Compose + Docker Desktop
Description

`filesystem` within catalog (see code below) does not work.

Context
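For reference, catalog entries of the kind implied by the `save_args` in the error messages above would look roughly like this (dataset names and the `dt1y` partition column are reconstructed from those messages; a sketch, not the exact configuration):

```yaml
# Hypothetical catalog entries matching the save_args shown in the errors
test_eager:
  type: polars.EagerPolarsDataset
  file_format: parquet
  filepath: /tmp/test.parquet
  save_args:
    partition_by: [dt1y]

test_lazy:
  type: polars.LazyPolarsDataset
  filepath: /tmp/test.parquet
  save_args:
    use_pyarrow: true
    pyarrow_options:
      compression: zstd
      partition_cols: [dt1y]
      write_statistics: true
```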