The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark and can be used from pure Python code.
Apache License 2.0
FutureWarning: 'ParquetDataset.partitions' attribute is deprecated as of pyarrow 5.0.0 and will be removed in a future version. #800
When I call make_reader, I keep getting the following warning in each epoch. Will this be fixed in the future?
Code
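from petastorm import make_reader
from petastorm.pytorch import DataLoader

# The enclosing function was not shown in the original snippet; the name is a placeholder.
def make_loader():
    reader = make_reader(
        dataset_url="file://train.parquet",
        shuffle_rows=False,
    )
    return DataLoader(reader, batch_size=128)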
Warning
/opt/conda/lib/python3.9/site-packages/petastorm/py_dict_reader_worker.py:267: FutureWarning: 'ParquetDataset.partitions' attribute is deprecated as of pyarrow 5.0.0 and will be removed in a future version. Specify 'use_legacy_dataset=False' while constructing the ParquetDataset, and then use the '.partitioning' attribute instead.
Here are my versions:
pyarrow: 13.0.0
petastorm: 0.12.1
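Until this is addressed in petastorm itself, one possible workaround is to filter out just this message with Python's standard `warnings` module before creating the reader. This is only a sketch: the helper name `silence_partitions_warning` is hypothetical and not part of petastorm or pyarrow, and it assumes the warning is raised in the same process that installs the filter (a reader backed by a process pool may need the filter installed in its workers as well).

```python
import warnings

# Regex matched against the start of the warning text quoted above.
PARTITIONS_WARNING = r"'ParquetDataset\.partitions' attribute is deprecated"


def silence_partitions_warning():
    """Ignore only the pyarrow 'ParquetDataset.partitions' FutureWarning.

    Hypothetical helper for illustration; other FutureWarnings stay visible.
    """
    warnings.filterwarnings(
        "ignore",
        message=PARTITIONS_WARNING,
        category=FutureWarning,
    )
```

Calling `silence_partitions_warning()` once, before `make_reader`, should hide the message without suppressing unrelated deprecation notices. It does not change how petastorm accesses the dataset, so the underlying deprecation remains.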