joschu / cgt

Computation Graph Toolkit

Looping over tensor / using symbolic variable. #23

Open elanmart opened 8 years ago

elanmart commented 8 years ago

Hello, if I'm understanding the code correctly, there is currently no way to write loops with a symbolic number of iterations (or to loop over the leading dimension of a tensor, like Theano's ScanOp).

Are there plans to add such functionality to CGT? If not, what would be the recommended way of processing variable-length sequences? Is there a Switch operator, so that one could write:

output = init_output()
for t in range(max_num_steps):
    output = cgt.switch(X.shape[0] > t, make_step(X[t], output), output)

But wouldn't this create a lot of overhead if the difference between the shortest and longest sequences in the dataset is large?

Hope it makes sense to post this question here instead of the mailing list.

joschu commented 8 years ago

Right, this functionality currently isn't implemented. It'll certainly be possible to implement a Scan-like Op (in fact, I've implemented something similar before). But I think it'll take some thought to figure out what's the right way to do it here. Theano's scan doesn't have the friendliest syntax, and the Scan code is very intricate, suggesting that it might not be exactly the right abstraction.

Unfortunately, the switch method you suggested won't currently work--if you write a Switch Op, both of its operands will be evaluated.

I think the right solution to this problem is to have some mechanism for branching/looping that gets built into the "execution graph" -- the final data structure used to perform the computation.

The other temporary solution, which would work, is to make careful use of "masks", which zero out some components of the data or recurrent state and allow you to train on variable-length inputs in batch form. AFAIK most high-performance RNN code uses this sort of scheme.
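For concreteness, here is a minimal sketch of the masking idea in plain NumPy (nothing here is cgt API; step is a toy recurrent update and all names are illustrative):

import numpy as np

def step(x_t, h):
    # Toy recurrent update; a real model would use learned weights.
    return np.tanh(x_t + h)

T, B, D = 5, 3, 4                              # max length, batch size, feature dim
lengths = np.array([3, 5, 2])                  # true length of each sequence
X = np.random.randn(T, B, D)                   # sequences padded to length T
mask = (np.arange(T)[:, None] < lengths[None, :]).astype(X.dtype)   # (T, B)

h = np.zeros((B, D))                           # recurrent state for the whole batch
for t in range(T):
    h_new = step(X[t], h)
    m = mask[t][:, None]                       # (B, 1), broadcasts over features
    h = m * h_new + (1 - m) * h                # state is frozen once a sequence ends

The mask multiplies the update by 1 inside each sequence and by 0 past its end, so the whole batch can be processed with a single fixed-length loop.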

Edit: using switch would work (though there's overhead); you'd just have to clip the index so you don't get an out-of-bounds error. The following hacky solution almost works (one just needs to implement an elementwise minimum):

output = (X.shape[0] > t) * make_step(X[cgt.minimum(t, X.shape[0] - 1)], output) + (X.shape[0] <= t) * output
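For reference, here is how that masked update might slot into the loop from the question (a sketch only; init_output, make_step, X, and max_num_steps are the placeholders from the question, and cgt.minimum stands in for the not-yet-implemented elementwise minimum op):

output = init_output()
for t in range(max_num_steps):
    idx = cgt.minimum(t, X.shape[0] - 1)                 # clip so X[idx] stays in bounds
    step_out = make_step(X[idx], output)
    output = (X.shape[0] > t) * step_out + (X.shape[0] <= t) * output   # freeze output past the end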
elanmart commented 8 years ago

Thanks. Is it possible to implement a lazy IfElse op in cgt?

joschu commented 8 years ago

Not yet, because of how the graph execution works. I'm going to open up an issue for further discussion on this topic -- @hojonathanho and I have discussed it a bit.