Closed devdoer closed 9 years ago
I can't quite follow what you mean here. Can you provide more details?
Hi:
As the Theano documentation suggests, "When using the GPU, float32 tensor shared variables are stored on the GPU by default to eliminate transfer time for GPU ops using those variables." So if we store the training data on the GPU and fetch each batch from the GPU by index, performance is better.
Like this:
import numpy as np
import theano

# Load the raw training data and split features from labels.
instance = np.fromfile('DIM54_part-00003_0.ctr_bin', dtype=np.float32)
instance = instance.reshape((-1, 54))
feat_mat = instance[:, 1:]
label_vec = instance[:, 0]

# Create shared variables (float32 shared variables live on the GPU)
# and point them at the training data without copying.
tmp1 = np.zeros((10, 10), dtype=theano.config.floatX)
tmp2 = np.zeros((10,), dtype=theano.config.floatX)
shared_D0 = theano.shared(tmp1, name='x', borrow=True)
shared_D1 = theano.shared(tmp2, name='y', borrow=True)
shared_D0.set_value(feat_mat, borrow=True)
shared_D1.set_value(label_vec.astype(np.float32), borrow=True)
train = theano.function(
    inputs=[index],
    outputs=[prediction, xent],
    updates={w: w - 0.01 * gw, b: b - 0.01 * gb},
    givens={x: shared_D0[index * batch_size:(index + 1) * batch_size],
            y: shared_D1[index * batch_size:(index + 1) * batch_size]},
    name="train")

training_steps = instance.shape[0] // batch_size
for i in range(training_steps):
    # Pass only the batch index; the givens slice the shared data on the GPU.
    pred, err = train(i)
Ah, thanks, I understand!
The short answer is, if this isn't already supported, then it should be.
One thing that might work (but I haven't tried it myself) is to use a callable when providing your training data. Something like this should get the point across:
import numpy as np
import theano
import downhill
class Batches(object):
    def __init__(self):
        self.length = 1000
        self.data = theano.shared(np.random.randn(self.length, 5).astype('f'))
        self.index = 0
        self.batch_size = 64

    def __call__(self):
        # Wrap around to the start once all the data has been consumed.
        if self.index >= self.length:
            self.index = 0
        try:
            return self.data[self.index:self.index + self.batch_size]
        finally:
            self.index += self.batch_size

downhill.minimize(loss, train=Batches())
If something like this does work, it could be nice to integrate it more explicitly in downhill, so that if you pass a theano.shared array as training data, downhill would construct a callable like this internally and handle the batching for you.
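The internal wrapping described above might look something like the sketch below. The helper name `as_batches` is hypothetical, and a plain NumPy array stands in for the shared variable's value, since the slicing and wrap-around logic are the same either way:

```python
import numpy as np


def as_batches(data, batch_size=64):
    """Return a callable that yields consecutive mini-batches of `data`,
    wrapping around to the start once the data is exhausted.

    Hypothetical helper -- not part of the downhill API."""
    index = 0

    def next_batch():
        nonlocal index
        if index >= len(data):
            index = 0
        batch = data[index:index + batch_size]
        index += batch_size
        return batch

    return next_batch


# Ten rows with a batch size of four yields batches of 4, 4, and 2 rows,
# then wraps back to the first four rows.
batch = as_batches(np.arange(10).reshape(10, 1), batch_size=4)
```

With a theano.shared array, the helper would slice the shared variable symbolically instead, keeping the data on the GPU as in the Batches class above.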
Thanks. I hope this will be supported.
Just added some code to support shared variables as Dataset inputs. This will be included in the next release, probably in the next few weeks.
I just tried this on downhill 0.3.1 by modifying the matrix factorization example, replacing the line train=[y] with train=[theano.shared(y)],
and got the following error:
Traceback (most recent call last):
File "original.py", line 32, in
Hi:
Using theano.shared to keep the input dataset stored on the graphics device will double the training speed. Can downhill support this feature?