apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
https://mxnet.apache.org
Apache License 2.0

train estimator on data stream #17124

Open liuzh47 opened 4 years ago

liuzh47 commented 4 years ago

Description

The fit() function of the Estimator in gluon.contrib.estimator only supports input from gluon.DataLoader. One limitation of gluon.DataLoader is that it accesses training data by index, so it is impractical to train the estimator on a continuous data stream.
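
To illustrate the limitation, here is a minimal sketch in plain Python (no MXNet required). The generator `sample_stream` is a hypothetical stand-in for a continuous data source, not part of any MXNet API. It shows that such a stream has neither a length nor index access, which is what an index-based loader relies on.

```python
import itertools


def sample_stream():
    """Hypothetical endless stream of (feature, label) pairs."""
    for step in itertools.count():
        yield float(step), step % 2


stream = sample_stream()

# A generator exposes neither __len__ nor __getitem__, so it cannot be
# sampled by index the way an indexable dataset can.
has_len = hasattr(stream, "__len__")
has_getitem = hasattr(stream, "__getitem__")

# It can still be consumed sequentially, one item at a time.
first_three = list(itertools.islice(stream, 3))
```

Here `has_len` and `has_getitem` are both `False`, while sequential iteration works as expected.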

It would be great if fit() accepted any iterable object rather than only gluon.DataLoader. That would mean modifying the following check:

if not isinstance(train_data, DataLoader):
    raise ValueError("Estimator only support input as Gluon DataLoader. Alternatively, you "
                     "can transform your DataIter or any NDArray into Gluon DataLoader. "
                     "Refer to gluon.data.dataloader")
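
One possible shape for a relaxed check is sketched below. This is an assumption about how the change could look, not the actual MXNet implementation: `validate_train_data` and `batch_stream` are hypothetical names, and the sketch uses plain Python lists in place of NDArray batches so that it runs without MXNet.

```python
def validate_train_data(train_data):
    """Hypothetical replacement for the isinstance(train_data, DataLoader) check.

    Accepts anything that iter() accepts (duck typing) instead of
    requiring one concrete class.
    """
    try:
        iter(train_data)
    except TypeError:
        raise ValueError(
            "Estimator expects train_data to be iterable, got "
            f"{type(train_data).__name__}"
        ) from None
    return train_data


def batch_stream(num_batches):
    """Hypothetical stream yielding (data, label) batches one at a time."""
    for i in range(num_batches):
        yield [i, i + 1], [i % 2, (i + 1) % 2]


# A generator passes the relaxed check and can be consumed batch by batch.
batches = list(validate_train_data(batch_stream(2)))
```

With this kind of check, a DataLoader would still pass because it is iterable, while a non-iterable argument such as an integer would still raise a ValueError.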

References

  1. https://mxnet.incubator.apache.org/api/python/docs/_modules/mxnet/gluon/contrib/estimator/estimator.html
  2. https://beta.mxnet.io/_modules/mxnet/gluon/data/dataloader.html#DataLoader

leezu commented 4 years ago

The check should be removed. Python uses duck typing. MXNet claims to be a Python-first project now. So let's follow Python's best practice.