lmjohns3 / downhill

Stochastic gradient routines for Theano
MIT License

Making the training data shared to speed up training #5

Closed devdoer closed 9 years ago

devdoer commented 9 years ago


Using Theano's shared function to make sure that the input dataset is stored on the graphics device will double the training speed. Can downhill support this feature?

lmjohns3 commented 9 years ago

I can't quite follow what you mean here. Can you provide more details?

devdoer commented 9 years ago


As the Theano documentation puts it, "When using the GPU, float32 tensor shared variables are stored on the GPU by default to eliminate transfer time for GPU ops using those variables." So if we store the training data on the GPU and fetch each batch from the GPU by index, performance is better.

Like this:

import numpy as np
import theano
import theano.tensor as T

# Load the raw data: column 0 is the label, the other 53 columns are features.
instance = np.fromfile('DIM54_part-00003_0.ctr_bin', dtype=np.float32)
instance = instance.reshape((-1, 54))
feat_mat = instance[:, 1:]
label_vec = instance[:, 0]

# Keep the whole dataset in shared variables so it can live on the GPU.
shared_D0 = theano.shared(feat_mat.astype(theano.config.floatX), name='x', borrow=True)
shared_D1 = theano.shared(label_vec.astype(theano.config.floatX), name='y', borrow=True)

# x, y, w, b, gw, gb, prediction, xent and batch_size are defined elsewhere.
index = T.lscalar('index')  # minibatch index, the only input to train
train = theano.function(
    inputs=[index],
    outputs=[prediction, xent],
    updates={w: w - 0.01 * gw, b: b - 0.01 * gb},
    givens={x: shared_D0[index * batch_size:(index + 1) * batch_size],
            y: shared_D1[index * batch_size:(index + 1) * batch_size]},
    name='train')

# Only the batch index is passed in; the data itself stays on the device.
training_steps = instance.shape[0] // batch_size
for i in range(training_steps):
    pred, err = train(i)

lmjohns3 commented 9 years ago

Ah, thanks, I understand!

The short answer is, if this isn't already supported, then it should be.

One thing that might work (but I haven't tried it myself) is to use a callable when providing your training data. Something like this should get the point across:

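import numpy as np
import theano
import downhill

class Batches(object):
    def __init__(self):
        self.length = 1000
        # Keep the full dataset in a shared variable (on the GPU when available).
        self.data = theano.shared(np.random.randn(self.length, 5).astype('f'))
        self.index = 0
        self.batch_size = 64

    def __call__(self):
        # Start over once the whole dataset has been consumed.
        if self.index >= self.length:
            self.index = 0
        # Slicing a shared variable gives a symbolic batch, not a NumPy array.
        batch = self.data[self.index:self.index + self.batch_size]
        self.index += self.batch_size
        return batch

# `loss` is the Theano expression to minimize, defined elsewhere.
downhill.minimize(loss, train=Batches())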

If something like this does work, it could be nice to integrate this more explicitly in downhill so that if you pass a theano.shared array as training data, then downhill would basically construct a callable like this internally and handle the batching for you.
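To make the idea of constructing such a callable internally a bit more concrete, here is a minimal sketch in plain Python, so it runs without Theano. The name `make_batch_callable` is hypothetical and not part of downhill, and the way downhill actually handles shared variables may well differ. It works on any sliceable sequence with a length; a real implementation would need to deal with shared variables specifically.

```python
def make_batch_callable(data, batch_size=64):
    """Return a zero-argument callable that cycles through `data` in batches.

    Hypothetical helper for illustration only; not part of downhill's API.
    `data` is any sliceable sequence with a length (list, NumPy array, ...).
    """
    if batch_size <= 0:
        raise ValueError('batch_size must be positive')
    index = 0  # start offset of the next batch

    def next_batch():
        nonlocal index
        # Wrap around once the previous batch reached the end of the data.
        if index >= len(data):
            index = 0
        start = index
        index = start + batch_size
        # The last batch of a pass may be shorter than batch_size.
        return data[start:start + batch_size]

    return next_batch


# Usage: ten items in batches of four, then the cycle starts over.
batches = make_batch_callable(list(range(10)), batch_size=4)
first, second, third, fourth = batches(), batches(), batches(), batches()
```

Each call returns the next slice, and the fourth call wraps back to the start of the data. Keeping the offset inside the closure means the caller only needs to hand over the data and a batch size.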

devdoer commented 9 years ago

Thanks. I hope this will be supported.

lmjohns3 commented 9 years ago

Just added some code to support shared variables as Dataset inputs. This will be included in the next release, probably in the next few weeks.

ttrouill commented 8 years ago

I just tried this on downhill 0.3.1 by modifying the matrix factorization example, replacing the line train=[y], with train=[theano.shared(y)],

And had the following error:

Traceback (most recent call last):
  File "original.py", line 32, in <module>
    monitor_gradients=True)
  File "/home/ttrouill/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/downhill/__init__.py", line 91, in minimize
    ).minimize(train, valid, **kwargs)
  File "/home/ttrouill/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/downhill/base.py", line 443, in minimize
    for monitors in self.iterate(*args, **kwargs):
  File "/home/ttrouill/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/downhill/base.py", line 409, in iterate
    validation = self.evaluate(valid)
  File "/home/ttrouill/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/downhill/base.py", line 249, in evaluate
    values = [self.f_eval(*x) for x in dataset]
  File "/home/ttrouill/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/theano/compile/function_module.py", line 513, in __call__
    allow_downcast=s.allow_downcast)
  File "/home/ttrouill/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/theano/tensor/type.py", line 78, in filter
    'Expected an array-like object, but found a Variable: '
TypeError: ('Bad input argument to theano function with name "evaluation" at index 0(0-based)', 'Expected an array-like object, but found a Variable: maybe you are trying to call a function on a (possibly shared) variable instead of a numeric array?')