chainer / chainermn

ChainerMN: Scalable distributed deep learning with Chainer
https://chainer.org
MIT License
207 stars 57 forks

Fix MultiNodeIterator for paired datasets #248

Closed levelfour closed 6 years ago

levelfour commented 6 years ago

The old MultiNodeIterator fails to handle paired datasets, i.e., datasets of the form

[(x1, y1), (x2, y2), ... , (xN, yN)]

In this implementation, the master process first checks whether the dataset is paired and, in the paired case, broadcasts twice: once for the sequence [x1, x2, ... , xN] and once for [y1, y2, ... , yN].
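A minimal sketch of the idea described above, not the code in this patch. The helper names (`is_paired`, `local_bcast`, `broadcast_batch`) are hypothetical, and `local_bcast` is only a single-process stand-in for a real broadcast from the master process.

```python
def is_paired(batch):
    """Return True if every element of a non-empty batch is a 2-tuple."""
    return len(batch) > 0 and all(
        isinstance(example, tuple) and len(example) == 2 for example in batch
    )


def local_bcast(obj):
    # Hypothetical stand-in for a real broadcast from the master process;
    # it only copies the list so the example runs in a single process.
    return list(obj)


def broadcast_batch(batch, bcast=local_bcast):
    """Broadcast a batch, sending xs and ys separately when it is paired."""
    if is_paired(batch):
        xs = [x for x, _ in batch]
        ys = [y for _, y in batch]
        xs = bcast(xs)  # first broadcast: [x1, ..., xN]
        ys = bcast(ys)  # second broadcast: [y1, ..., yN]
        return list(zip(xs, ys))
    return bcast(batch)  # unpaired case: a single broadcast
```

Passing a different `bcast` callable makes it easy to count or log the broadcasts in a test without starting multiple processes.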

keisukefukuda commented 6 years ago

Can one of the admins verify this patch?

kuenishi commented 6 years ago

This fix itself seems good, but I think this iterator must fail explicitly, with a clear error message, when given data types other than a list of arrays or a list of pairs. @iwiwi pointed out that another standard type is a list of dicts, as written here, but there could also be arbitrary types when a user-defined convert function and a user-defined Dataset type are in use. Checking inputs could be handled as a separate issue. @levelfour
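One way the requested check could look, as a rough sketch only. `classify_batch` and its `array_types` parameter are hypothetical names, and the standard-library `array.array` is used purely as a stand-in for whatever array type the iterator actually accepts.

```python
from array import array


def classify_batch(batch, array_types):
    """Return "arrays" or "pairs"; raise TypeError for anything else.

    `array_types` is a tuple of types that count as arrays (an assumption
    of this sketch, supplied by the caller).
    """
    if not isinstance(batch, list) or not batch:
        raise TypeError("batch must be a non-empty list")
    if all(isinstance(example, array_types) for example in batch):
        return "arrays"
    if all(isinstance(example, tuple) and len(example) == 2 for example in batch):
        return "pairs"
    # Anything else (dicts, user-defined types, mixed batches) fails loudly.
    found = sorted({type(example).__name__ for example in batch})
    raise TypeError(
        "unsupported batch element type(s): {}; expected a list of arrays "
        "or a list of pairs".format(", ".join(found))
    )


# Usage with the stand-in array type:
kind = classify_batch([array("f", [1.0, 2.0]), array("f", [3.0])], (array,))
```

A list of dicts such as `[{"x": 1}]` would raise `TypeError` here rather than being broadcast incorrectly.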

kuenishi commented 6 years ago

Extended work: https://github.com/chainer/chainermn/issues/252