uber / petastorm

The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark, and can also be used from pure Python code.
Apache License 2.0

What is the difference between petastorm and horovod? #464

Closed dclong closed 4 years ago

dclong commented 4 years ago

Both products are from Uber and both claim to do distributed training. Just curious about the difference.

selitvin commented 4 years ago

Petastorm is a Parquet access library that can be used from TF, PyTorch, or pure Python to load data from Parquet stores directly into an ML framework.

Horovod is a library that enables distributed training of DL models: it coordinates model coefficient updates across the multiple nodes involved in training.
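To make the split concrete, here is a rough conceptual sketch in plain Python. It does not use the Petastorm or Horovod APIs; every function name in it (`shard_rows`, `local_gradient`, `allreduce_mean`, `train`) is hypothetical and only stands in for the role each library plays. The data-loading step is the part Petastorm would cover, and the gradient-averaging step is the kind of coordination Horovod provides.

```python
# Conceptual sketch only: plain Python stand-ins, not the Petastorm or Horovod APIs.
# All names below are hypothetical and chosen for illustration.

def shard_rows(rows, num_workers):
    """Data-loading role (the part Petastorm covers): give each worker its own rows."""
    return [rows[i::num_workers] for i in range(num_workers)]

def local_gradient(w, shard):
    """Gradient of mean squared error for the model y = w * x on one worker's shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def allreduce_mean(values):
    """Coordination role (the part Horovod covers): every worker gets the average."""
    mean = sum(values) / len(values)
    return [mean] * len(values)

def train(rows, num_workers=2, lr=0.05, steps=200):
    shards = shard_rows(rows, num_workers)
    weights = [0.0] * num_workers  # one model replica per simulated worker
    for _ in range(steps):
        # Each worker computes a gradient from its own shard only.
        grads = [local_gradient(w, s) for w, s in zip(weights, shards)]
        # Workers exchange gradients so that all replicas apply the same update.
        synced = allreduce_mean(grads)
        weights = [w - lr * g for w, g in zip(weights, synced)]
    return weights

rows = [(float(x), 3.0 * x) for x in range(1, 5)]  # toy data following y = 3x
final_weights = train(rows)
```

In this toy setup the replicas stay identical because they start from the same weight and apply the same averaged gradient at every step. In a real job the two libraries can be combined: one reads the Parquet data on each worker, the other keeps the model replicas in sync.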