uber / petastorm

The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark, and can be used from pure Python code.
Apache License 2.0
1.8k stars 284 forks

Fix incorrect counting of number of row-groups per piece. #477

Closed selitvin closed 4 years ago

selitvin commented 4 years ago

This bug manifests in either:

This issue occurred when:

The issue was introduced in petastorm 0.7.7.
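For readers unfamiliar with the term, here is a minimal sketch of why the per-piece row-group count matters. It is not petastorm's actual code: `expand_pieces` and the `row_groups_per_file` mapping are hypothetical names used only for illustration, and the sketch assumes that each file piece is expanded into one entry per row group.

```python
from typing import Dict, List, Tuple


def expand_pieces(row_groups_per_file: Dict[str, int]) -> List[Tuple[str, int]]:
    """Expand each file into one (path, row_group_index) pair per row group.

    `row_groups_per_file` is a hypothetical mapping from a Parquet file path
    to the number of row groups that file contains.
    """
    pieces: List[Tuple[str, int]] = []
    for path in sorted(row_groups_per_file):
        count = row_groups_per_file[path]
        if count < 0:
            raise ValueError(f"negative row-group count for {path}: {count}")
        # One entry per row group; a wrong count here either references
        # row groups that do not exist or silently skips real ones.
        pieces.extend((path, index) for index in range(count))
    return pieces


counts = {"part-0.parquet": 2, "part-1.parquet": 1}
pieces = expand_pieces(counts)
# The number of expanded pieces must equal the total number of row groups.
assert len(pieces) == sum(counts.values())
```

Under this assumption, an over-count produces entries for row groups that are not in the file, while an under-count drops data without raising an error.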

Resolves #447

WeichenXu123 commented 4 years ago

Good work!