NervanaSystems / neon

Intel® Nervana™ reference deep learning framework committed to best performance on all hardware
http://neon.nervanasys.com/docs/latest
Apache License 2.0

Predictions for number of samples < be.bsz #227

Open iaroslav-ai opened 8 years ago

iaroslav-ai commented 8 years ago

In order to obtain predictions of some neural network in neon, I use

model.get_outputs(ArrayIterator(X_predict))

where X_predict is a NumPy N x M matrix, with M the number of features and N the number of samples for which I want predictions. In my case it is common to want predictions for just a single input, both during training and testing of the neural net. However, it seems that N must always be at least the batch size specified in the backend; otherwise ArrayIterator raises an exception. Should I always write a separate Python module for predictions where bsz is set to 1? It would be nice to have a function that handles different batch sizes for prediction automatically.

hanlint commented 8 years ago

Layer buffers in neon are pre-allocated during model creation for a particular batch size. To run prediction on fewer samples than be.bsz, you can either regenerate the model with a new batch size, or pad your samples matrix along the batch dimension up to be.bsz rows and ignore the outputs for the padded entries. When prediction speed is not critical, the latter approach is easier.
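A minimal sketch of the padding approach, using NumPy only. The helper name pad_to_batch is hypothetical, and the neon calls are shown as comments since they depend on your backend and model setup:

```python
import numpy as np

def pad_to_batch(X, bsz):
    """Pad an N x M samples matrix with zero rows up to the backend
    batch size bsz; return the padded matrix and the original N."""
    n = X.shape[0]
    if n >= bsz:
        return X, n
    pad = np.zeros((bsz - n, X.shape[1]), dtype=X.dtype)
    return np.vstack([X, pad]), n

# Usage sketch (neon calls commented out; adjust to your model):
# X_padded, n = pad_to_batch(X_predict, be.bsz)
# outputs = model.get_outputs(ArrayIterator(X_padded))
# predictions = outputs[:n]  # drop the rows for the padded entries
```

The padded rows still cost compute, but for occasional single-sample predictions that overhead is usually acceptable.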

madhurgoel commented 8 years ago

Could you please show how to create a Model with a different batch size? As I understand it, batch_size is fixed by the backend. The docstring for set_batch_size(self, N) reads: "Set the actual minibatch size, so even though the buffers are allocated considering excessive padding, the processing for some layers may be shortened. Currently most of the neon layers don't use that to control the processing. The interface is here only for when someone wants to set that information and experiment."

1) Has this actually been implemented, so that the whole exercise is compute-efficient?
2) How would I copy weights from the old Model (larger batch size) to the new Model (batch size == 1)? For example, do you think the combination of get_description(get_weights=True, keep_states=False) and deserialize(load_states=False) would work?
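Whether or not that exact get_description/deserialize combination works, the underlying pattern is that weight arrays do not depend on batch size, so they can be extracted from one model and loaded into a freshly created one. A toy NumPy illustration of that pattern (ToyLinear, get_weights, and set_weights are hypothetical stand-ins, not neon APIs):

```python
import numpy as np

class ToyLinear:
    """Stand-in for a model whose buffers are sized for a fixed batch,
    but whose weights are independent of that batch size."""
    def __init__(self, in_dim, out_dim, bsz):
        self.bsz = bsz  # buffers would be allocated for this batch size
        self.W = np.random.randn(in_dim, out_dim)

    def get_weights(self):
        # analogous role to get_description(get_weights=True, keep_states=False)
        return {"W": self.W.copy()}

    def set_weights(self, weights):
        # analogous role to deserialize(load_states=False)
        self.W = weights["W"].copy()

    def predict(self, X):
        assert X.shape[0] == self.bsz  # batch size is baked in
        return X @ self.W

# Train-time model with a large batch size, prediction model with bsz == 1:
train_model = ToyLinear(4, 2, bsz=128)
pred_model = ToyLinear(4, 2, bsz=1)
pred_model.set_weights(train_model.get_weights())
```

After the copy, pred_model produces the same outputs as the trained model would, but accepts single-sample batches.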