distillpub / post--building-blocks

The Building Blocks of Interpretability
https://distill.pub/2018/building-blocks
Creative Commons Attribution 4.0 International

Error while using a different model than inception_v1 in your notebook, unable to evaluate gradients (inception_v4) #32

Closed osoffer closed 4 years ago

osoffer commented 4 years ago

Hi,

I'm trying to generalize your code in the Channel Attribution - Building Blocks of Interpretability notebook so that it works with all state-of-the-art CNN models.

I followed @ludwigschubert's guide, suggested here https://github.com/tensorflow/lucid/issues/34, for converting prepared models to the modelzoo format, and have successfully converted inception_v4 and nasnet_large so far. I got the prepared models here: https://github.com/tensorflow/models/tree/master/research/slim

I've used the modelzoo files to create visualizations of the channels in each network, and created spritemaps for each layer: https://github.com/osoffer/Diamond-Cutter/tree/master/models/inception_v4/spritemaps

When I try to load the modelzoo file for inception_v4 and use it in Channel Attribution - Building Blocks, an error occurs in the channel_attr_simple method. It's not clear which layer I should choose to replace "softmax2_pre_activation", which your example sets for inception_v1 (line: logit = T("softmax2_pre_activation")[0])

I've tried all the inception_v4 layers that come after the last convolutional layer; none of them worked. When I try to calculate gradients, I get an error. (I can add a link to a Google Colab notebook that reproduces my error.)

For layer InceptionV4/Logits/AvgPool_1a/AvgPool, the error is:

InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: Computed output size would be negative: -2 [input_size: 5, effective_filter_size: 8, stride: 1]
[[node import/InceptionV4/Logits/AvgPool_1a/AvgPool (defined at /tensorflow-1.15.2/python2.7/tensorflow_core/python/framework/ops.py:1748) ]]
[[gradients/AddN_5/_3]]
(1) Invalid argument: Computed output size would be negative: -2 [input_size: 5, effective_filter_size: 8, stride: 1]
[[node import/InceptionV4/Logits/AvgPool_1a/AvgPool (defined at /tensorflow-1.15.2/python2.7/tensorflow_core/python/framework/ops.py:1748) ]]
0 successful operations.
0 derived errors ignored.

For layer InceptionV4/Logits/Predictions, the error is:

(0) Invalid argument: Computed output size would be negative: -2 [input_size: 5, effective_filter_size: 8, stride: 1]
[[node import/InceptionV4/Logits/AvgPool_1a/AvgPool (defined at /tensorflow-1.15.2/python2.7/tensorflow_core/python/framework/ops.py:1748) ]]
[[gradients/AddN_5/_3]]
(1) Invalid argument: Computed output size would be negative: -2 [input_size: 5, effective_filter_size: 8, stride: 1]
[[node import/InceptionV4/Logits/AvgPool_1a/AvgPool (defined at /tensorflow-1.15.2/python2.7/tensorflow_core/python/framework/ops.py:1748) ]]
0 successful operations.
0 derived errors ignored.

The error occurs at the line: grad = t_grad.eval()

I use this modelzoo file, for inception_v4: https://drive.google.com/uc?id=15CmJ4UbUm8MXp8h0uHwbEe0n8rPrxARp

full error trace:

InvalidArgumentError Traceback (most recent call last)

in ()
      3 model_layer = {"inception_v4" : "InceptionV4/InceptionV4/Mixed_7b/concat"}
      4 class1 = "Labrador retriever"
----> 5 channel_attr_wrapper(img_s, model_layer, class1, n_show=10)

8 frames

in channel_attr_wrapper(img_s, model_layer, class1, n_show)
      7 last_layer = models[model_name]["last_layer"]
      8 channel_attr_simple(selected_model, model_name, model_layer[model_name],
----> 9 last_layer, img_s, n_show, class1)

in channel_attr_simple(selected_model, model_name, layer_name, last_layer, img_s, n_show, class1)
      1 def channel_attr_simple(selected_model, model_name, layer_name, last_layer, img_s, n_show, class1):
      2 # calc model activations
----> 3 channel_attr = channel_attr_simple_org_core(img_s, layer_name, last_layer, class1, selected_model)
      4 channel_attr = channel_attr / len(img_s)
      5

in channel_attr_simple_org_core(img_s, layer, last_layer, class1, selected_model)
     45 # print(type(t_grad))
     46 #print("t_grad shape " + str(t_grad.shape))
---> 47 grad = t_grad.eval()
     48 print("grad")
     49 print(grad)

/tensorflow-1.15.2/python2.7/tensorflow_core/python/framework/ops.pyc in eval(self, feed_dict, session)
    796
    797 """
--> 798 return _eval_using_default_session(self, feed_dict, self.graph, session)
    799
    800 def experimental_ref(self):

/tensorflow-1.15.2/python2.7/tensorflow_core/python/framework/ops.pyc in _eval_using_default_session(tensors, feed_dict, graph, session)
   5405 "the tensor's graph is different from the session's "
   5406 "graph.")
-> 5407 return session.run(tensors, feed_dict)
   5408
   5409

/tensorflow-1.15.2/python2.7/tensorflow_core/python/client/session.pyc in run(self, fetches, feed_dict, options, run_metadata)
    954 try:
    955 result = self._run(None, fetches, feed_dict, options_ptr,
--> 956 run_metadata_ptr)
    957 if run_metadata:
    958 proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

/tensorflow-1.15.2/python2.7/tensorflow_core/python/client/session.pyc in _run(self, handle, fetches, feed_dict, options, run_metadata)
   1178 if final_fetches or final_targets or (handle and feed_dict_tensor):
   1179 results = self._do_run(handle, final_targets, final_fetches,
-> 1180 feed_dict_tensor, options, run_metadata)
   1181 else:
   1182 results = []

/tensorflow-1.15.2/python2.7/tensorflow_core/python/client/session.pyc in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
   1357 if handle is None:
   1358 return self._do_call(_run_fn, feeds, fetches, targets, options,
-> 1359 run_metadata)
   1360 else:
   1361 return self._do_call(_prun_fn, handle, feeds, fetches)

/tensorflow-1.15.2/python2.7/tensorflow_core/python/client/session.pyc in _do_call(self, fn, *args)
   1382 '\nsession_config.graph_options.rewrite_options.'
   1383 'disable_meta_optimizer = True')
-> 1384 raise type(e)(node_def, op, message)
   1385
   1386 def _extend_graph(self):

InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: Computed output size would be negative: -2 [input_size: 5, effective_filter_size: 8, stride: 1]
[[node import/InceptionV4/Logits/AvgPool_1a/AvgPool (defined at /tensorflow-1.15.2/python2.7/tensorflow_core/python/framework/ops.py:1748) ]]
[[gradients/AddN_5/_3]]
(1) Invalid argument: Computed output size would be negative: -2 [input_size: 5, effective_filter_size: 8, stride: 1]
[[node import/InceptionV4/Logits/AvgPool_1a/AvgPool (defined at /tensorflow-1.15.2/python2.7/tensorflow_core/python/framework/ops.py:1748) ]]
0 successful operations.
0 derived errors ignored.

Original stack trace for u'import/InceptionV4/Logits/AvgPool_1a/AvgPool':
  File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/local/lib/python2.7/dist-packages/ipykernel_launcher.py", line 16, in
    app.launch_new_instance()
  File "/usr/local/lib/python2.7/dist-packages/traitlets/config/application.py", line 658, in launch_instance
    app.start()
  File "/usr/local/lib/python2.7/dist-packages/ipykernel/kernelapp.py", line 499, in start
    self.io_loop.start()
  File "/usr/local/lib/python2.7/dist-packages/tornado/ioloop.py", line 888, in start
    handler_func(fd_obj, events)
  File "/usr/local/lib/python2.7/dist-packages/tornado/stack_context.py", line 277, in null_wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/zmq/eventloop/zmqstream.py", line 456, in _handle_events
    self._handle_recv()
  File "/usr/local/lib/python2.7/dist-packages/zmq/eventloop/zmqstream.py", line 486, in _handle_recv
    self._run_callback(callback, msg)
  File "/usr/local/lib/python2.7/dist-packages/zmq/eventloop/zmqstream.py", line 438, in _run_callback
    callback(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/tornado/stack_context.py", line 277, in null_wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/ipykernel/kernelbase.py", line 283, in dispatcher
    return self.dispatch_shell(stream, msg)
  File "/usr/local/lib/python2.7/dist-packages/ipykernel/kernelbase.py", line 233, in dispatch_shell
    handler(stream, idents, msg)
  File "/usr/local/lib/python2.7/dist-packages/ipykernel/kernelbase.py", line 399, in execute_request
    user_expressions, allow_stdin)
  File "/usr/local/lib/python2.7/dist-packages/ipykernel/ipkernel.py", line 208, in do_execute
    res = shell.run_cell(code, store_history=store_history, silent=silent)
  File "/usr/local/lib/python2.7/dist-packages/ipykernel/zmqshell.py", line 537, in run_cell
    return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/IPython/core/interactiveshell.py", line 2718, in run_cell
    interactivity=interactivity, compiler=compiler, result=result)
  File "/usr/local/lib/python2.7/dist-packages/IPython/core/interactiveshell.py", line 2828, in run_ast_nodes
    if self.run_code(code, result):
  File "/usr/local/lib/python2.7/dist-packages/IPython/core/interactiveshell.py", line 2882, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 5, in
    channel_attr_wrapper(img_s, model_layer, class1, n_show=10)
  File "", line 9, in channel_attr_wrapper
    last_layer, img_s, n_show, class1)
  File "", line 3, in channel_attr_simple
    channel_attr = channel_attr_simple_org_core(img_s, layer_name, last_layer, class1, selected_model)
  File "", line 7, in channel_attr_simple_org_core
    T = render.import_model(selected_model, t_input, t_input)
  File "/usr/local/lib/python2.7/dist-packages/lucid/optvis/render.py", line 234, in import_model
    model.import_graph(t_image, scope="import", forget_xy_shape=True)
  File "/usr/local/lib/python2.7/dist-packages/lucid/modelzoo/vision_base.py", line 62, in import_graph
    self.graph_def, {self.input_name: t_prep_input}, name=scope)
  File "/tensorflow-1.15.2/python2.7/tensorflow_core/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/tensorflow-1.15.2/python2.7/tensorflow_core/python/framework/importer.py", line 405, in import_graph_def
    producer_op_list=producer_op_list)
  File "/tensorflow-1.15.2/python2.7/tensorflow_core/python/framework/importer.py", line 517, in _import_graph_def_internal
    _ProcessNewOps(graph)
  File "/tensorflow-1.15.2/python2.7/tensorflow_core/python/framework/importer.py", line 243, in _ProcessNewOps
    for new_op in graph._add_new_tf_operations(compute_devices=False): # pylint: disable=protected-access
  File "/tensorflow-1.15.2/python2.7/tensorflow_core/python/framework/ops.py", line 3561, in _add_new_tf_operations
    for c_op in c_api_util.new_tf_operations(self)
  File "/tensorflow-1.15.2/python2.7/tensorflow_core/python/framework/ops.py", line 3451, in _create_op_from_tf_operation
    ret = Operation(c_op, self)
  File "/tensorflow-1.15.2/python2.7/tensorflow_core/python/framework/ops.py", line 1748, in __init__
    self._traceback = tf_stack.extract_stack()

My goal is to have all state-of-the-art models for CNN available for use in your Channel Attribution tool, and adding more features for the task of choosing a model for transfer learning. I would really appreciate your help. Thanks!

osoffer commented 4 years ago

The problem was the size of the input; I didn't know that it varies between networks.
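
For anyone hitting the same error, the numbers in the message can be reproduced with a little size arithmetic. This is a minimal sketch, not code from the notebook: it assumes the images were fed at 224x224 (the inception_v1 size; the thread does not state the size actually used) and that the slim inception_v4 graph shrinks the feature map only through the unpadded (VALID) layers listed below, ending in an 8x8 average pool built for 299x299 inputs. The helper names are hypothetical.

```python
def valid_out(size, kernel, stride=1):
    """Output length along one axis of an unpadded (VALID) conv or pool."""
    return (size - kernel) // stride + 1


# (kernel, stride) of the VALID layers assumed to change the spatial size
# in slim's inception_v4; SAME-padded stride-1 layers leave it unchanged.
SIZE_CHANGING_LAYERS = [
    (3, 2),  # Conv2d_1a_3x3
    (3, 1),  # Conv2d_2a_3x3
    (3, 2),  # Mixed_3a
    (3, 1),  # Mixed_4a
    (3, 2),  # Mixed_5a
    (3, 2),  # Mixed_6a (Reduction-A)
    (3, 2),  # Mixed_7a (Reduction-B)
]

AVG_POOL_KERNEL = 8  # "effective_filter_size: 8" in the error message


def feature_map_size(input_size):
    """Spatial size of the feature map that reaches Logits/AvgPool_1a."""
    size = input_size
    for kernel, stride in SIZE_CHANGING_LAYERS:
        size = valid_out(size, kernel, stride)
    return size


for input_size in (224, 299):
    fmap = feature_map_size(input_size)
    pooled = valid_out(fmap, AVG_POOL_KERNEL)
    print("input %d -> feature map %d -> after 8x8 avg pool %d"
          % (input_size, fmap, pooled))
```

Under these assumptions a 224x224 input reaches the pool as a 5x5 map, and an 8x8 window over 5 positions gives the negative output size reported (input_size: 5, effective_filter_size: 8), while a 299x299 input reaches it as 8x8 and pools down to 1x1.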