uber / petastorm

The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark, and can be used from pure Python code.
Apache License 2.0

Support pyarrow 0.15 API #438

Closed by selitvin 5 years ago

xhochy commented 5 years ago

Note that we have downstream integration tests in Arrow to detect incompatibilities much earlier (e.g. see https://github.com/apache/arrow/blob/master/dev/tasks/tasks.yml#L1553-L1568). It might be useful for you to also add a job for petastorm there.