tensorflow / fold

Deep learning with dynamic computation graphs in TensorFlow
Apache License 2.0

Convolution over sequence #15

Open MadcowD opened 7 years ago

MadcowD commented 7 years ago

Hi,

I'm trying to convolve a filter over a sequence block type and I'm not sure what the canonical approach is.

Thanks

fabioasdias commented 7 years ago

I didn't really understand your problem. Do you want to do a convolution on each element of the sequence separately OR on the sequence itself as a whole? What have you tried? (code snippets are always helpful)

MadcowD commented 7 years ago

Apologies: I am trying to do the following. Given a variable length sequence of vectors [v_1, \dots, v_n] where n varies in the batch and v_i is d-dimensional, I would like to treat the sequence itself as an image with dimensions n x d.

Then I would like to apply a convolutional layer to this image. I'm implementing the TCNN model of http://www.aclweb.org/anthology/P14-1062
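To make the setup concrete, here is a NumPy-only sketch of the idea being described: treat a variable-length sequence of d-vectors as an n x d matrix and slide a filter of height h over it ("valid" convolution). The function name `conv_over_sequence` is illustrative, not a Fold or TensorFlow API; the real model would use a TF convolution op.

```python
import numpy as np

def conv_over_sequence(seq, filt):
    """Slide an (h, d) filter over an (n, d) sequence matrix.

    Returns n - h + 1 activations (valid convolution), one per window
    of h consecutive vectors. NumPy sketch of the idea only.
    """
    n, d = seq.shape
    h, d2 = filt.shape
    assert d == d2, "filter width must match vector dimension"
    return np.array([np.sum(seq[i:i + h] * filt) for i in range(n - h + 1)])

seq = np.arange(12, dtype=float).reshape(4, 3)   # n=4 vectors, d=3
filt = np.ones((2, 3))                           # filter spanning 2 rows
out = conv_over_sequence(seq, filt)
print(out)  # [15. 33. 51.]
```

Because n varies per example, the output length n - h + 1 varies too, which is exactly the issue raised below.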

fabioasdias commented 7 years ago

Just for clarification, let's assume that you can do that (I'm not sure how yet; it's beyond my current abilities. Maybe concatenating the sequence to generate a td.Tensor((n,d)) and then convolving, as I did in issue #14).

Then, for each example, the number of outputs of the convolution would be different (because they depend on the size (n,d) of the input image). How would you feed this into the next layer, in your model?
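The point about the output count depending on the input size can be stated with simple arithmetic: for a "valid" convolution with a filter spanning h rows, an input of n rows yields n - h + 1 output positions, so a varying n gives a varying number of outputs. A tiny sketch (the helper name is illustrative, not a library function):

```python
def num_conv_outputs(n, h):
    """Number of positions for a height-h filter over n rows ('valid' mode)."""
    return n - h + 1

# Three examples with different sequence lengths, same height-3 filter:
print([num_conv_outputs(n, 3) for n in (5, 8, 12)])  # [3, 6, 10]
```

This mismatch is why models like the one in the cited paper typically follow the convolution with a pooling step that reduces to a fixed size.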

mherreshoff commented 7 years ago

That solution won't work for William, because he wants n to vary by input. The set of operations and type-shapes supported by loom (the dynamic graph implementation) is fixed at graph-construction time, so a shape that depends on n won't work. To put it another way, you wouldn't have an n to pass in when building the fold blocks.

However, there is a workaround. Let's say you wanted to do a 3x3 convolution on your n by d matrix, but this matrix is represented by a fold block that emits a sequence of d-vectors. You could take your sequence, and pass it to td.NGrams(3) in order to get all consecutive triples of rows. Then you stack the triples into 3xd matrices and run your convolution. I'm guessing it would look something like this:

def _convolve_three(x, y, z):
    return convolution_goes_here(tf.stack([x, y, z], axis=2))

sequence_block_goes_here >> td.NGrams(3) >> td.Map(td.Function(_convolve_three))

Warning: this workaround will create extra copies of your data, which could be a problem if your implementation is memory-bound.
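The NGrams-then-stack idea can be sketched outside Fold with plain Python and NumPy, to show what td.NGrams(3) followed by the stacking function does to the data. The helpers `ngrams` and `convolve_three` below are stand-ins (the real convolution is a TF op), and the sum used as the "convolution" is purely for illustration:

```python
import numpy as np

def ngrams(seq, k):
    """All consecutive k-tuples of a sequence, analogous to td.NGrams(k)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def convolve_three(triple):
    """Stand-in for the convolution: stack three d-vectors into a
    (3, d) matrix and reduce it to a scalar (a sum, for illustration)."""
    return float(np.stack(triple).sum())

seq = [np.array([1.0, 2.0]), np.array([3.0, 4.0]),
       np.array([5.0, 6.0]), np.array([7.0, 8.0])]
out = [convolve_three(t) for t in ngrams(seq, 3)]
print(out)  # [21.0, 33.0]
```

Note how each input vector appears in up to k of the n-gram windows; that duplication is the source of the extra memory cost mentioned above.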
