mila-iqia / blocks

A Theano framework for building and training neural networks

Convenient way to monitor many variables with the same names #1015

Open Beronx86 opened 8 years ago

Beronx86 commented 8 years ago

My program raises a NaN error after several iterations. I want to monitor the automatically added auxiliary variables (W_norm and b_norm), so I use TrainingDataMonitoring(fg.auxiliary_variables, after_batch=True). But the program raises ValueError: variables should have different names! Duplicates: W_norm, b_norm. My model is a multi-layer neural network, so there are duplicate auxiliary variable names. What should I do to monitor these variables? Also, how can I monitor the gradients?

dwf commented 8 years ago

That's unfortunate. If you have unique brick names, something like ['{}:{}'.format(get_brick(v).name, v.name) for v in fg.auxiliary_variables] using get_brick from blocks.filter should work. But this is something we should probably provide a utility for.
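The renaming idea in this comment can be sketched without Theano or Blocks installed. The helper below is a minimal illustration, not part of Blocks: `qualified_names` and the `brick_of` parameter are hypothetical names, and the `SimpleNamespace` objects are stand-ins for real bricks and variables. With Blocks, one would presumably pass `get_brick` as `brick_of`. The explicit duplicate check makes the failure mode visible when brick names are not unique.

```python
from collections import Counter
from types import SimpleNamespace


def qualified_names(variables, brick_of):
    """Return 'brick:variable' names, failing loudly if any still collide."""
    names = ['{}:{}'.format(brick_of(v).name, v.name) for v in variables]
    duplicates = sorted(n for n, count in Counter(names).items() if count > 1)
    if duplicates:
        raise ValueError('names still collide: ' + ', '.join(duplicates))
    return names


# Stand-ins for two bricks and their auxiliary variables (hypothetical data).
linear_0 = SimpleNamespace(name='linear_0')
linear_1 = SimpleNamespace(name='linear_1')
aux = [
    SimpleNamespace(name='W_norm', brick=linear_0),
    SimpleNamespace(name='b_norm', brick=linear_0),
    SimpleNamespace(name='W_norm', brick=linear_1),
    SimpleNamespace(name='b_norm', brick=linear_1),
]

names = qualified_names(aux, brick_of=lambda v: v.brick)
print(names)
```

If two bricks share a name, the helper raises instead of silently producing duplicate names, which is the case where a path-based prefix is needed instead.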

Beronx86 commented 8 years ago

@dwf Hi, thanks for your response. The brick names are not unique. But inspired by your suggestion, I modified the Selector code to select auxiliary variables and rename each variable to path + variable_name. Now I can monitor the auxiliary variables. How can I monitor the gradients?

Beronx86 commented 8 years ago

I found that GradientDescent.total_gradient_norm and GradientDescent.total_step_norm can be used to monitor the gradient.
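To illustrate why a single total norm helps when hunting NaNs, here is a small pure-Python sketch. It assumes the total gradient norm is the L2 norm taken jointly over every entry of every parameter's gradient; that is an assumption about Blocks, not something confirmed in this thread. The function name `total_norm` and the flat-list representation of gradients are hypothetical.

```python
import math


def total_norm(gradients):
    """L2 norm over every entry of every gradient (each given as a flat list)."""
    return math.sqrt(sum(g * g for grad in gradients for g in grad))


# Two hypothetical parameter gradients, flattened to lists of floats.
healthy = total_norm([[3.0, 0.0], [4.0]])
print(healthy)  # 5.0

# A single NaN entry anywhere makes the total norm NaN as well.
broken = total_norm([[float('nan')], [1.0]])
print(math.isnan(broken))  # True
```

Under that assumption, monitoring this one scalar after every batch shows the iteration at which the gradients first blow up or become NaN, without needing a uniquely named entry per parameter.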

dwf commented 8 years ago

@rizar Seems like we should keep this issue open or start a new one about the fact that monitoring auxiliary variables that have the same names from different bricks is unnecessarily complicated.

rizar commented 8 years ago

I don't think there exists a complete solution to this problem. One remedy I can think of is to have a function hierarchical_rename(top_bricks, variables) that would change the names of the variables to contain the paths from the top bricks. This could be a method of Selector.

I will rename and reopen the issue.
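The proposed hierarchical_rename(top_bricks, variables) could look roughly like the sketch below. It is not the Blocks implementation: the `Brick` and `Variable` classes are hypothetical stand-ins, the `/parent/child.variable` path format is an assumption, and the real function would have to find each variable's owning brick through Blocks itself.

```python
class Brick:
    """Hypothetical stand-in for a brick: a name plus child bricks."""
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


class Variable:
    """Hypothetical stand-in for a variable tagged with its owning brick."""
    def __init__(self, name, brick):
        self.name = name
        self.brick = brick


def hierarchical_rename(top_bricks, variables):
    """Rename each variable in place to '/path/from/top/brick.variable_name'."""
    paths = {}  # id(brick) -> path string from its top brick

    def walk(brick, prefix):
        path = prefix + '/' + brick.name
        paths.setdefault(id(brick), path)
        for child in brick.children:
            walk(child, path)

    for top in top_bricks:
        walk(top, '')
    for variable in variables:
        path = paths.get(id(variable.brick))
        if path is not None:  # leave variables outside the hierarchy untouched
            variable.name = path + '.' + variable.name
    return variables


# Two sub-bricks share the name 'linear' but live under different parents.
enc_linear, dec_linear = Brick('linear'), Brick('linear')
tops = [Brick('encoder', [enc_linear]), Brick('decoder', [dec_linear])]
aux = [Variable('W_norm', enc_linear), Variable('W_norm', dec_linear)]

renamed = [v.name for v in hierarchical_rename(tops, aux)]
print(renamed)  # ['/encoder/linear.W_norm', '/decoder/linear.W_norm']
```

This resolves clashes between bricks with the same name under different parents. Sibling bricks that share both a name and a parent would still collide, which is in line with the remark that no complete solution exists.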