lmjohns3 / downhill

Stochastic gradient routines for Theano
http://downhill.rtfd.org
MIT License
102 stars 24 forks

making the training data shared to speedup the training #5

Closed devdoer closed 9 years ago

devdoer commented 9 years ago

Hi,

Using theano's shared function so that the input dataset is stored on the GPU can roughly double the training speed. Can downhill support this feature?

lmjohns3 commented 9 years ago

I can't quite follow what you mean here. Can you provide more details?

devdoer commented 9 years ago

Hi,

As the theano documentation says, "When using the GPU, float32 tensor shared variables are stored on the GPU by default to eliminate transfer time for GPU ops using those variables." So if we store the training data on the GPU and fetch each mini-batch from it by index, performance is better.

Like this:

 import numpy as np
 import theano
 import theano.tensor as T

 # Load the raw data: the first column is the label, the rest are features.
 instance = np.fromfile('DIM54_part-00003_0.ctr_bin', dtype=np.float32)
 instance = instance.reshape((-1, 54))
 feat_mat = instance[:, 1:]
 label_vec = instance[:, 0]

 # Store the whole dataset in shared variables so it lives on the GPU.
 shared_D0 = theano.shared(feat_mat, name='x', borrow=True)
 shared_D1 = theano.shared(label_vec.astype(np.float32), name='y', borrow=True)

 # x, y, w, b, gw, gb, prediction, and xent are the usual symbolic model
 # variables, defined elsewhere.
 index = T.lscalar('index')
 train = theano.function(
             inputs=[index],
             outputs=[prediction, xent],
             updates={w: w - 0.01 * gw, b: b - 0.01 * gb},
             givens={x: shared_D0[index * batch_size:(index + 1) * batch_size],
                     y: shared_D1[index * batch_size:(index + 1) * batch_size]},
             name='train')

 # Each step passes only a batch index; the data itself never leaves the GPU.
 training_steps = instance.shape[0] // batch_size
 for i in range(training_steps):
     pred, err = train(i)
lmjohns3 commented 9 years ago

Ah, thanks, I understand!

The short answer is, if this isn't already supported, then it should be.

One thing that might work (but I haven't tried it myself) is to use a callable when providing your training data. Something like this should get the point across:

import numpy as np
import theano
import downhill

class Batches(object):
    def __init__(self):
        self.length = 1000
        # The whole dataset lives in a shared variable (on the GPU, if it
        # is float32 and a GPU is configured).
        self.data = theano.shared(np.random.randn(self.length, 5).astype('f'))
        self.index = 0
        self.batch_size = 64

    def __call__(self):
        if self.index >= self.length:
            self.index = 0  # wrap around and start a new pass
        try:
            return self.data[self.index:self.index + self.batch_size]
        finally:
            self.index += self.batch_size

# loss is your symbolic loss expression, defined elsewhere.
downhill.minimize(loss, train=Batches())

If something like this does work, it could be nice to integrate this more explicitly in downhill so that if you pass a theano.shared array as training data, then downhill would basically construct a callable like this internally and handle the batching for you.
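A rough sketch of that auto-wrapping idea, with a plain numpy array standing in for the shared variable so the batching logic is easy to follow (the helper name `as_batches` is hypothetical, not part of downhill's API; with a real shared variable, each slice would be a symbolic subtensor instead of a numeric array):

```python
import numpy as np

def as_batches(data, batch_size=64):
    """Wrap an array-like dataset in a callable that yields successive
    mini-batches, wrapping around at the end of each pass.
    (Hypothetical helper -- downhill's real internals may differ.)"""
    state = {'index': 0}
    length = len(data)

    def next_batch():
        if state['index'] >= length:
            state['index'] = 0  # start a new pass over the data
        batch = data[state['index']:state['index'] + batch_size]
        state['index'] += batch_size
        return batch

    return next_batch

# A numpy array stands in here for a theano.shared value.
batches = as_batches(np.arange(10).reshape(10, 1), batch_size=4)
print(len(batches()))  # 4
print(len(batches()))  # 4
print(len(batches()))  # 2 (remainder of the first pass)
print(len(batches()))  # 4 (wrapped around)
```

The point is that the trainer only ever sees a zero-argument callable; whether the slices come from host memory or from a GPU-resident shared variable is the callable's business.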

devdoer commented 9 years ago

Thanks. I hope this gets supported.

lmjohns3 commented 9 years ago

Just added some code to support shared variables as Dataset inputs. This will be included in the next release, probably in the next few weeks.

ttrouill commented 8 years ago

I just tried this on downhill 0.3.1 by modifying the matrix factorization example, replacing the line train=[y] with train=[theano.shared(y)].

And had the following error:

Traceback (most recent call last):
  File "original.py", line 32, in <module>
    monitor_gradients=True)
  File "/home/ttrouill/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/downhill/__init__.py", line 91, in minimize
    ).minimize(train, valid, **kwargs)
  File "/home/ttrouill/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/downhill/base.py", line 443, in minimize
    for monitors in self.iterate(*args, **kwargs):
  File "/home/ttrouill/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/downhill/base.py", line 409, in iterate
    validation = self.evaluate(valid)
  File "/home/ttrouill/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/downhill/base.py", line 249, in evaluate
    values = [self.f_eval(*x) for x in dataset]
  File "/home/ttrouill/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/theano/compile/function_module.py", line 513, in __call__
    allow_downcast=s.allow_downcast)
  File "/home/ttrouill/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/theano/tensor/type.py", line 78, in filter
    'Expected an array-like object, but found a Variable: '
TypeError: ('Bad input argument to theano function with name "evaluation" at index 0(0-based)', 'Expected an array-like object, but found a Variable: maybe you are trying to call a function on a (possibly shared) variable instead of a numeric array?')
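The traceback suggests that evaluate() hands the shared Variable itself to the compiled evaluation function, which expects a numeric array. One possible fix inside downhill (a sketch only, not confirmed against its internals: `FakeShared` is a stand-in so this runs without Theano installed, and real code would check for theano's shared-variable type rather than a get_value attribute) would be to unwrap shared variables back into numeric arrays before calling f_eval:

```python
import numpy as np

class FakeShared(object):
    """Minimal stand-in for theano.shared, so this sketch runs without
    Theano (hypothetical -- real code would use theano.shared)."""
    def __init__(self, value):
        self._value = value

    def get_value(self, borrow=False):
        return self._value

def unwrap(batch):
    """What the Dataset could do before calling the compiled function:
    turn a shared variable back into a numeric array, and pass plain
    arrays through unchanged."""
    if hasattr(batch, 'get_value'):
        return batch.get_value(borrow=True)
    return batch

y = FakeShared(np.ones((3, 2), dtype='float32'))
print(unwrap(y).shape)  # (3, 2) -- numeric, safe to feed to f_eval
print(unwrap(np.zeros(2)).shape)  # plain arrays pass through unchanged
```

Of course, pulling the value back to the host gives up the speed benefit for evaluation; the cleaner path is probably for the training loop to use symbolic slices via givens, as discussed above.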