tensorflow / fold

Deep learning with dynamic computation graphs in TensorFlow
Apache License 2.0

How to print intermediate outputs in recursion in TF Fold? #95

Closed hockeybro12 closed 6 years ago

hockeybro12 commented 6 years ago

Hello,

I am trying to do some recursive computation using TF Fold, and there is an error which I want to debug.

Here is some code:

expr_decl_1 = td.ForwardDeclaration(td.PyObjectType(), td.TensorType([128]))
expr_decl_2 = td.ForwardDeclaration(td.PyObjectType(), td.TensorType([50]))

case_module1 = td.Record([('time_idx', td.Scalar('int32')),
                          ('batch_idx', td.Scalar('int32'))])
case_module1 = case_module1 >> td.Function(Module1)
case_module2 = td.Record([('input_0', expr_decl_2()),
                          ('time_idx', td.Scalar('int32')),
                          ('batch_idx', td.Scalar('int32'))])
case_module2 = case_module2 >> td.Function(Module2)
case_module3 = td.Record([('input_0', expr_decl_1()),
                          ('time_idx', td.Scalar('int32')),
                          ('batch_idx', td.Scalar('int32'))])
case_module3 = case_module3 >> td.Function(Module3)
case_output = td.Record([('input_0', expr_decl_1()),
                         ('time_idx', td.Scalar('int32')),
                         ('batch_idx', td.Scalar('int32'))])
case_output = case_output >> td.Function(OutputModule)

recursion_case_1 = td.OneOf(td.GetItem('module'), {'_module1': case_module1})
expr_decl_2().resolve_to(recursion_case_1)
recursion_case_2 = td.OneOf(td.GetItem('module'), {'_module2': case_module2, '_module3': case_module3})
expr_decl_1().resolve_to(recursion_case_2)

output_scores = td.OneOf(td.GetItem('module'), {
                '_Output': case_output,
                INVALID_EXPR: dummy_scores})
self.compiler = td.Compiler.create(output_scores)
self.output = self.compiler.output_tensors[0]

I was wondering how I could print the intermediate outputs, which come from Module1, Module2, Module3, and OutputModule. That way I can see which module (a series of NN layers) is causing the network to predict 0.

Here is an explanation of what this code is doing: Basically, what I have is 3 separate modules (referred to as Module1, Module2, Module3). The first one takes no input, and outputs a size [batch_size, 50] tensor. This is given as input to module 2, which produces a size [batch_size, 128] tensor. That is given as input to module3. These modules are to be assembled using a dictionary that is passed in.
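
For reference, here is a minimal sketch of the contract the modules have to satisfy (the bodies below are hypothetical stand-ins, not my real code): each Record above hands its fields to the wrapped td.Function as separate batched tensors, and the function must return a tensor whose per-example type matches the forward declaration its case resolves:

import tensorflow as tf

def Module1(time_idx, batch_idx):
    # Stand-in body: the real module is a stack of NN layers.
    return tf.zeros([tf.shape(time_idx)[0], 50])      # [batch_size, 50] -> expr_decl_2

def Module2(input_0, time_idx, batch_idx):
    # input_0 is Module1's output, shape [batch_size, 50].
    return tf.zeros([tf.shape(input_0)[0], 128])      # [batch_size, 128] -> expr_decl_1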

Does anyone have an idea how to print these values?

Edit: For more context, I am trying to build a Neural Modular Network. Most of the code is similar to that described here: https://github.com/ronghanghu/n2nmn/blob/master/models_clevr/nmn3_model.py. However, the key difference in my case is that I am trying to use different shapes of inputs.

delesley commented 6 years ago

It's not a problem with the recursion; it's a problem with OneOf. OneOf will conditionally switch between several different cases at runtime, much like a C switch statement. For this to be type safe, each of those cases must have the same type. You haven't shown the code for Module2 and Module3, but it looks like the former returns a [batch_size, 128] tensor, while the latter returns a Tuple.
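
A toy illustration of the constraint (hypothetical blocks, not your code): both cases below map the same input dict to TensorType(()), so the OneOf type checks; swap either one for a block with a different output type and you get exactly this kind of TypeError.

import tensorflow as tf
import tensorflow_fold as td

# Both branches consume the whole input dict and emit a float32 scalar.
double = td.GetItem('value') >> td.Scalar() >> td.Function(lambda x: x * 2.0)
negate = td.GetItem('value') >> td.Scalar() >> td.Function(tf.negative)
switch = td.OneOf(td.GetItem('op'), {'double': double, 'negate': negate})

# For quick testing (needs a default TF session):
#   switch.eval({'op': 'double', 'value': 3.0})  => 6.0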

On Wed, Apr 18, 2018 at 10:08 AM, Nikhil Mehta notifications@github.com wrote:

This gives an error when compiling: TypeError: Type mismatch between output type TensorType((128,), 'float32') and expected output type TupleType(TensorType((50,), 'float32'), TensorType((), 'int32'), TensorType((), 'int32')). It is raised at the line: expr_decl_1().resolve_to(recursion_case_2)

I'm not sure why there is a problem with resolving this; I must not understand recursion in TF Fold properly. Does anyone have an idea how to fix this, so that I can do what I described above without getting this error?


hockeybro12 commented 6 years ago

Thanks for the response.

The issue was that I had forgotten to include this line: case_module2 = case_module2 >> td.Function(Module2)

My modules all return tensors (not tuples); it's just that Module1 returns a [batch_size, 50] tensor, which is a different size from what Module2 returns. This is why I had to use expr_decl_2 for Module1's output and expr_decl_1 for Module2's. The code posted above runs, as I think it should (let me know if you disagree).

However, I am now wondering how I can print the outputs of each Module. In normal TF, I can just run the tensor in my session and get the output that way, but I'm not sure which Tensor I should print in this case, because TF Fold decides which modules to run, and how to run them, based on my input.

delesley commented 6 years ago

You can put a tf.Print into a td.Function; TensorFlow will then print it when it is evaluated.
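
A minimal sketch of what that could look like, assuming Module2's existing layers produce a [batch_size, 128] tensor (build_module2_layers is a placeholder for whatever your module already does):

import tensorflow as tf

def Module2(input_0, time_idx, batch_idx):
    out = build_module2_layers(input_0)  # placeholder for the real layers
    # tf.Print is an identity op with a logging side effect: it returns out
    # unchanged, but writes the listed tensors to stderr each time the Fold
    # scheduler evaluates this block.
    return tf.Print(out, [batch_idx, out], message='Module2 output: ', summarize=20)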


hockeybro12 commented 6 years ago

Thanks, I was able to solve that.

I have another related problem now. Let's say I want to feed some textual input to each of the modules and process it inside each module with an LSTM. I can run the LSTM like so inside a module:

rnn_inputs = tf.nn.embedding_lookup(self.embeddings, text_input)
outputs, rnn_state = tf.nn.dynamic_rnn(cell=self.lstm, inputs=rnn_inputs,
                                       sequence_length=text_input_lengths,
                                       dtype=tf.float32, time_major=False)

As you can see, in this case I need two inputs: text_input and text_input_lengths. text_input_lengths is a tensor of size [batch_size].

I try and create this input for each module using a method similar to what is described in my original post above:

case_module2 = td.Record([('input_0', expr_decl_2()),
                          ('time_idx', td.Scalar('int32')),
                          ('batch_idx', td.Scalar('int32')),
                          ('text_input', td.Vector(maxtextlength, 'int32')),
                          ('text_input_lengths', td.Vector(0, 'int32'))])
case_module2 = case_module2 >> td.Function(Module2)

The problem is that TensorFlow now complains that the RNN expected sequence_length to have shape [batch_size], whereas the input Fold builds for text_input_lengths actually has shape [batch_size, 0]. It doesn't let me put None or -1 as the shape either. Fold adds the batch_size dimension itself, but the RNN will only work if the lengths tensor has shape [batch_size]. Is there another input format I can use to get this to work?

Is there any way I can avoid putting in this extra dimension in the vector input so that I can have the right size in my module?
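
One idea I am considering, by analogy with time_idx and batch_idx (which already arrive in the module as [batch_size] tensors): declare each length as a td.Scalar('int32') instead of a zero-length td.Vector, since Fold batches scalars into a single rank-1 tensor. A sketch, assuming one integer length per example:

case_module2 = td.Record([('input_0', expr_decl_2()),
                          ('time_idx', td.Scalar('int32')),
                          ('batch_idx', td.Scalar('int32')),
                          ('text_input', td.Vector(maxtextlength, 'int32')),
                          ('text_input_lengths', td.Scalar('int32'))])
case_module2 = case_module2 >> td.Function(Module2)
# text_input_lengths should now reach Module2 as a [batch_size] int32 tensor,
# the shape dynamic_rnn's sequence_length expects.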