The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark, and can also be used from pure Python code.
Apache License 2.0
1.78k stars, 285 forks
Exposed pyarrow filters in the make_reader and make_batch_reader API #564
I just noticed this change. Thank you for supporting this. Would it be possible to support passing filters on all columns by setting use_legacy_dataset to False in pyarrow.parquet.ParquetDataset[1]?
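For readers unfamiliar with the filter format under discussion: pyarrow's `filters` argument takes predicates in disjunctive normal form, either a flat list of `(column, op, value)` tuples (combined with AND) or a list of such lists (combined with OR). The sketch below is a pure-Python illustration of those semantics only. It is not pyarrow's or Petastorm's implementation, and `matches`, `_OPS`, and the sample rows are hypothetical names made up for this example.

```python
import operator

# Comparison operators accepted in pyarrow-style filter tuples.
_OPS = {
    "=": operator.eq,
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "in": lambda value, choices: value in choices,
    "not in": lambda value, choices: value not in choices,
}


def matches(row, filters):
    """Return True if `row` (a dict) satisfies DNF-style `filters`.

    Hypothetical helper: mimics the filter semantics, not pyarrow internals.
    """
    # No filters means every row is kept.
    if not filters:
        return True
    # A flat list of tuples is a single AND group; normalize to a list of groups.
    if isinstance(filters[0], tuple):
        filters = [filters]
    # Outer list: OR across groups. Inner list: AND across predicates.
    return any(
        all(_OPS[op](row[col], val) for col, op, val in group)
        for group in filters
    )


# Made-up sample rows standing in for records of a Parquet dataset.
rows = [
    {"id": 1, "label": "cat", "score": 0.9},
    {"id": 2, "label": "dog", "score": 0.4},
    {"id": 3, "label": "cat", "score": 0.2},
]

# (label == "cat" AND score > 0.5) OR (id in {2})
dnf = [[("label", "=", "cat"), ("score", ">", 0.5)], [("id", "in", {2})]]
selected_ids = [r["id"] for r in rows if matches(r, dnf)]
print(selected_ids)  # [1, 2]
```

Going by the PR title, a list in this form is what would be passed as the filters argument of make_reader or make_batch_reader. Whether the predicates apply only to partition columns or to every column is the behavior the question above is asking about.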