joschu / cgt

Computation Graph Toolkit

Add support for theano's tensor.inc_subtensor command #22

Closed avostryakov closed 8 years ago

avostryakov commented 9 years ago

In theano there is a great command: http://deeplearning.net/software/theano/library/tensor/basic.html#theano.tensor.inc_subtensor

I can use it the following way:

embedding_grads = theano.grad(cost, embedding_output)
updates[embedding.W] = T.inc_subtensor(
    embedding.W[T.reshape(input_var, (N_BATCH * MAX_LENGTH,))],
    -LEARNING_RATE * T.reshape(embedding_grads, (N_BATCH * MAX_LENGTH, 300)))

This lets me train only the word embedding vectors that occur in the current mini-batch.
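For readers less familiar with Theano, here is a rough NumPy sketch of what this kind of update computes. The sizes and the names `W`, `token_ids`, and `grads` are hypothetical, chosen only for illustration. `np.add.at` is used so that a word appearing several times in the batch receives all of its gradient rows; that this matches Theano's handling of repeated indices is an assumption of the sketch.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
VOCAB, DIM = 10, 3
N_BATCH, MAX_LENGTH = 2, 4
LEARNING_RATE = 0.1

W = np.zeros((VOCAB, DIM))                    # embedding matrix
token_ids = np.array([[1, 2, 2, 5],
                      [5, 7, 1, 0]])          # shape (N_BATCH, MAX_LENGTH)
grads = np.ones((N_BATCH, MAX_LENGTH, DIM))   # stand-in for d(cost)/d(embedding output)

# Flatten so there is one index and one gradient row per token.
flat_ids = token_ids.reshape(N_BATCH * MAX_LENGTH)
flat_grads = grads.reshape(N_BATCH * MAX_LENGTH, DIM)

# Copy, then add the scaled gradient rows; np.add.at accumulates repeated indices.
new_W = W.copy()
np.add.at(new_W, flat_ids, -LEARNING_RATE * flat_grads)

# Rows of W that occur in this batch; all other rows are left unchanged.
touched = np.unique(flat_ids)
```

Only the rows listed in `touched` differ between `W` and `new_W`, which is the point of updating through a subtensor rather than the full matrix.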

joschu commented 8 years ago

Hi Magic, I've implemented inc_subtensor in 7d30be8. The syntax is a little different from Python's: you have to write inc_subtensor(x, slices, y), which implements x[slices] += y. See cgt/tests/test_inc_subtensor.py for an example of how to use inc_subtensor with three different types of indexing.

Your example also required a type of indexing that previously wasn't implemented, in which you need to index using an integer array along one dimension. I implemented this type of indexing in a43259d.

Let me know if these changes do what you need.

f0k commented 8 years ago

Cool, but what does your inc_subtensor return then? The docstring is a bit sparse... Theano's inc_subtensor(x[slice], y) returns an expression for x with x[slice] incremented by y. That's required so you can use it in an update dictionary as in @avostryakov's example. inc_subtensor(x, slice, y) would be fine with respect to the syntax (I even find it easier to understand), but it would still need to return something that represents the full x with a slice of it changed. Is that what it does?

joschu commented 8 years ago

Indeed, it returns a tensor variable where the slice has been incremented. I improved the docstring.
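The returned value can be pictured with a small NumPy stand-in that follows the three NumPy lines quoted in the docstring (copy, increment, return). This is only a sketch of the described semantics, not CGT code: the name `inc_subtensor_reference` is hypothetical, and it says nothing about which `slis` formats CGT itself accepts.

```python
import numpy as np

def inc_subtensor_reference(x, slis, y):
    """NumPy-only stand-in: copy x, increment the selected region, return the copy."""
    out = x.copy()      # leave the input untouched
    out[slis] += y      # increment the selected region
    return out

x = np.zeros((4, 3))

# Basic slice: increment the first two rows by 1.
a = inc_subtensor_reference(x, slice(0, 2), 1.0)

# Integer array along the first dimension: increment rows 0 and 3.
b = inc_subtensor_reference(x, np.array([0, 3]), np.array([[1.0, 2.0, 3.0]]))
```

In both calls `x` itself stays all zeros; the result is a full-size array with only the selected part changed, which is what an update dictionary needs.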


def inc_subtensor(x, slis, y):
    """
    Returns the array that is obtained by incrementing x[slis] by y
    This function corresponds to the following numpy code:
        out = x.copy()
        out[slis] += y
        return out
    Note that due to an in-place optimization, the copy operation is
    usually not performed.

    See subtensor docstring for a list of appropriate formats for `slis`
    Only formats 2-4 are allowed for inc_subtensor
    """

f0k commented 8 years ago

:+1:

joschu commented 8 years ago

I'll close this issue, since I think it's resolved.

avostryakov commented 8 years ago

Sorry for the late response. I was busy, and installing cgt is not trivial right now :) But I installed it and checked the new inc_subtensor. It works as expected. Thank you!

While doing so, I discovered that not all of the matrix indexing I need is supported. I'll create new issues.

myexceptions commented 8 years ago

idx = theano.tensor.ivector()
word_embedding = ....  # a float matrix, theano shared variable
subset = word_embedding[idx]

g = T.grad(cost, subset)
updates[word_embedding] = T.inc_subtensor(x, g * lr)


I have a question: does 'idx' need to be a vector? What if idx is a matrix? For example, when training with mini-batches where every training case has m words and every batch has n cases, 'idx' should be an n * m matrix.

And I got an error when I used 'idx' as a matrix:

ValueError: array is not broadcastable to correct shape
Apply node that caused the error: AdvancedIncSubtensor1{no_inplace,inc}(<TensorType(float64, matrix)>, Reshape{2}.0, Reshape{1}.0)
Toposort index: 18
Inputs types: [TensorType(float64, matrix), TensorType(float64, matrix), TensorType(int32, vector)]
Inputs shapes: [(40000, 50), (2, 100), (4,)]
Inputs strides: [(400, 8), (800, 8), (4,)]
Inputs values: ['not shown', 'not shown', array([1, 2, 0, 3], dtype=int32)]
Outputs clients: [['output']]

Backtrace when the node is created(use Theano flag traceback.limit=N to make it longer):
  File "/usr/local/anaconda2/lib/python2.7/site-packages/traitlets/config/application.py", line 596, in launch_instance
    app.start()
  File "/usr/local/anaconda2/lib/python2.7/site-packages/IPython/terminal/ipapp.py", line 345, in start
    self.shell.mainloop()
  File "/usr/local/anaconda2/lib/python2.7/site-packages/IPython/terminal/interactiveshell.py", line 548, in mainloop
    self.interact(display_banner=display_banner)
  File "/usr/local/anaconda2/lib/python2.7/site-packages/IPython/terminal/interactiveshell.py", line 672, in interact
    self.run_cell(source_raw, store_history=True)
  File "/usr/local/anaconda2/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2723, in run_cell
    interactivity=interactivity, compiler=compiler, result=result)
  File "/usr/local/anaconda2/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2825, in run_ast_nodes
    if self.run_code(code, result):
  File "/usr/local/anaconda2/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2885, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 1, in
    upda = T.inc_subtensor(x, gg)

f0k commented 8 years ago

@myexceptions: Sorry, you're in the wrong place here... this is the Issue tracker of CGT, a possible alternative library to Theano. Try on theano-users. And include your definition of x when doing so, otherwise it's not clear what you were doing.
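For readers who land here with the same question: the traceback reports input shapes (40000, 50), (2, 100) and (4,), that is, four flattened indices but an increment with only two rows. One plausible reading is that the gradient was reshaped to the wrong shape, although without the definition of x this cannot be confirmed. The NumPy sketch below only illustrates the shape bookkeeping under assumed sizes (n = 2 cases, m = 2 words, 50-dimensional vectors); it is not Theano code, and the variable names are reused from the question purely for illustration.

```python
import numpy as np

n, m, dim = 2, 2, 50                      # assumed: 2 cases with 2 words each
lr = 0.01
word_embedding = np.zeros((40000, dim))
idx = np.array([[1, 2],
                [0, 3]])                  # index matrix, shape (n, m)

subset = word_embedding[idx]              # shape (n, m, dim)
g = np.ones_like(subset)                  # stand-in gradient w.r.t. subset

# Option 1: flatten to one index and one gradient row per word.
flat_idx = idx.reshape(n * m)             # shape (4,)
flat_g = g.reshape(n * m, dim)            # shape (4, 50); a (2, 100) array would not line up
updated_flat = word_embedding.copy()
np.add.at(updated_flat, flat_idx, -lr * flat_g)

# Option 2: in NumPy the index matrix can also be used directly.
updated_direct = word_embedding.copy()
np.add.at(updated_direct, idx, -lr * g)
```

Both options give the same result in NumPy; whether a given Theano version accepts the matrix index directly is a question for theano-users, as suggested above.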