
Is it possible to store the output at every node of a tree-structured neural network? #49

Open yizhongw opened 7 years ago

yizhongw commented 7 years ago

Hi, I'm trying to implement an attention mechanism (https://arxiv.org/abs/1701.01811) on tree-structured neural networks such as TreeLSTM or TreeGRU. Since we want to attend to the most informative nodes in the tree, we first need to store the output (usually the hidden state) at each node. I know that in TensorFlow, the dynamic_rnn function can return all the per-step outputs in addition to the final state. Is it possible to achieve this in TensorFlow Fold? Thanks!
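For concreteness, this is the sequence behaviour I mean in plain TensorFlow 1.x (shapes here are just an example):

```python
import tensorflow as tf

# Toy shapes: [batch, time, input_dim] with variable-length sequences.
cell = tf.contrib.rnn.GRUCell(num_units=128)
inputs = tf.placeholder(tf.float32, [None, None, 300])
lengths = tf.placeholder(tf.int32, [None])

# `outputs` is [batch, time, 128]: one hidden state per step, which is
# what an attention layer needs. `state` is only the final hidden state.
outputs, state = tf.nn.dynamic_rnn(
    cell, inputs, sequence_length=lengths, dtype=tf.float32)
```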

delesley commented 7 years ago

You can use a metric to collect information from every node. You can't use the metric as an input to a block, but you can access it as an output tensor; the number of rows will correspond to the number of nodes in the tree(s).
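Roughly like this (an untested sketch; `record_state` is just an illustrative wrapper, while td.Metric, td.Composition, and td.Compiler.metric_tensors are actual Fold API):

```python
import tensorflow_fold as td

def record_state():
  """Forwards a node's hidden state unchanged while logging it to a metric."""
  c = td.Composition()
  with c.scope():
    td.Metric('node_hiddens').reads(c.input)  # one row per tree node
    c.output.reads(c.input)
  return c

# Compose it with your per-node cell, e.g. node = tree_cell >> record_state().
# After building the model:
#   compiler = td.Compiler.create(root_block)
#   hiddens = compiler.metric_tensors['node_hiddens']
# `hiddens` is a [total_nodes_in_batch, state_size] tensor you can attend over.
```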

Alternatively, you could use Loom directly rather than the Blocks library, at which point you have full control.


mbosnjak commented 7 years ago

But a metric concatenates the output vectors from all nodes without recording which node, or which tree in the batch, each value came from. Sure, a metric has the option of taking a PyObjectType() object as a second argument, but that cannot be used at the top to form an aggregate loss per tree and then over the batch, or am I wrong here? I assume we could modify the instances of a batch on the fly to include a tree ID, propagate it through each node, collect it with a metric, and then apply tf.dynamic_partition to the metric's output tensor to achieve such an aggregation, but that requires a fixed batch size, no? (tf.dynamic_partition returns a fixed-length Python list of tensors.) Is there a simpler (or more elegant) way this could be done?
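i.e., something like this (hypothetical sketch; `node_states` would be the metric's output tensor, `tree_ids` the propagated IDs, and `attend_and_score` a placeholder for the per-tree attention/loss computation):

```python
import tensorflow as tf

BATCH_SIZE = 32  # must be a static Python int: tf.dynamic_partition returns
                 # a plain Python list with exactly this many tensors.

# node_states: [total_nodes, state_size], tree_ids: [total_nodes] int32.
per_tree = tf.dynamic_partition(node_states, tree_ids, BATCH_SIZE)
losses = [attend_and_score(states) for states in per_tree]  # one per tree
batch_loss = tf.add_n(losses)
```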

delesley commented 7 years ago

Metrics are really intended for things like loss or perplexity, where you can sum all the outputs over a batch, without worrying about which node in the tree the value is coming from.
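For example (a sketch, assuming every node feeds a scalar loss into a metric named 'all_loss', with `compiler` built as above):

```python
# Sum the per-node losses over the whole batch; for this reduction it
# doesn't matter which node or tree each row came from.
loss = tf.reduce_sum(compiler.metric_tensors['all_loss'])
```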

If you really need to collect intermediate outputs on a node-by-node basis, then you may want to drop down to the Loom library, which lets you traverse the tree yourself and have fine-grained control over what happens at each node.
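Schematically, the Loom pattern looks like this (untested sketch; `ConcatFC` stands in for your TreeLSTM/GRU cell, and `node.is_leaf`, `node.embedding`, and `my_tree` are placeholders for your own data structures):

```python
import tensorflow as tf
from tensorflow_fold.public import loom

STATE = loom.TypeShape('float32', (128,))

class ConcatFC(loom.LoomOp):
  """Toy binary combiner: concatenate two child states, apply an FC layer."""
  def __init__(self):
    super(ConcatFC, self).__init__([STATE, STATE], [STATE])
  def instantiate_batch(self, inputs):
    return [tf.contrib.layers.fully_connected(tf.concat(inputs, 1), 128)]

the_loom = loom.Loom(named_ops={'combine': ConcatFC()})
states = the_loom.output_tensor(STATE)  # everything passed to add_output

weaver = the_loom.make_weaver()

def build(node):
  """Post-order traversal; registers every node's state as an output."""
  if node.is_leaf:
    state = weaver(node.embedding)  # a float32 vector of shape (128,)
  else:
    state = weaver.combine(build(node.left), build(node.right))
  weaver.add_output(state)  # rows come back in traversal order
  return state

build(my_tree)
# sess.run(states, weaver.build_feed_dict()) -> [num_nodes, 128] array.
```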


fred2008 commented 6 years ago

Could this be solved by concatenating the representations of the two subtree nodes recursively?