Open tisma opened 3 years ago
Hi @tisma,
Thanks for your report and especially for the code to reproduce the issue. We are looking into this.
Hi @wwwind, do you think you can take a look at this issue?
Hi @tisma, the function get_clusterable_weights in your implementation does not return what the clustering algorithm expects. It should be:
def get_clusterable_weights(self):
    return [('kernel', self.kernel)]
We have a tutorial for ClusterableLayer here.
Oh, I see the problem in my implementation of get_clusterable_weights:
def get_clusterable_weights(self):
    clusterable_weights = []
    for weight in self.trainable_weights:
        clusterable_weights.append((weight.name, weight.read_value()))
    return clusterable_weights
The first problem is that weight.read_value() returns a <class 'tensorflow.python.framework.ops.EagerTensor'> instead of a <class 'tensorflow.python.ops.resource_variable_ops.ResourceVariable'>, so it should be replaced by just weight when the tuple is created.
The second, trickier problem is that weight.name is not simply the name of the variable: it also contains the layer name as a prefix and :0 as a suffix (e.g. my_custom_layer/kernel:0). So I either have to specify the names manually or, if I want a more generic way of adding clusterable weights, do something like this
weight.name[weight.name.find("/") + 1 : weight.name.find(":")]
to get just the variable name.
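The slicing above can be wrapped in a small helper. This is only a minimal sketch: short_weight_name is a hypothetical name, not part of TensorFlow or tfmot, and it keeps only the last path component, which differs from the find("/") version when a variable name carries more than one scope prefix.

```python
def short_weight_name(full_name: str) -> str:
    """Strip scope prefixes and the ':<index>' suffix from a variable name.

    Hypothetical helper for illustration; not part of TensorFlow or tfmot.
    """
    # Drop the output-index suffix, e.g. ':0'.
    without_suffix = full_name.split(":", 1)[0]
    # Keep only the last path component, e.g. 'kernel'.
    return without_suffix.rsplit("/", 1)[-1]


print(short_weight_name("my_custom_layer/kernel:0"))  # prints: kernel
```

The returned string is only useful as a clusterable weight name if the layer really stores the variable under an attribute of that name.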
@wwwind This was just a simple example of the model. What if I have a more complex model composed of several layers? How can I add the weights that belong to nested layers to the clusterable_weights list?
This is the output of model.summary() and all the weights that are part of the stereo_net layer:
# layers[9] -> StereoNet
for weight in model.layers[9].weights:
    print(weight.name)
Model: "model"
________________________________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
========================================================================================================================
Left (InputLayer) [(None, None, None, 3)] 0
________________________________________________________________________________________________________________________
Right (InputLayer) [(None, None, None, 3)] 0
________________________________________________________________________________________________________________________
cam_fx (InputLayer) [(None, 1)] 0
________________________________________________________________________________________________________________________
cam_baseline (InputLayer) [(None, 1)] 0
________________________________________________________________________________________________________________________
cam_proj_l (InputLayer) [(None, 3, 4)] 0
________________________________________________________________________________________________________________________
cam_proj_r (InputLayer) [(None, 3, 4)] 0
________________________________________________________________________________________________________________________
targets (InputLayer) [(None, None, None)] 0
________________________________________________________________________________________________________________________
ious (InputLayer) [(None, None, None)] 0
________________________________________________________________________________________________________________________
labels_map (InputLayer) [(None, None, None)] 0
________________________________________________________________________________________________________________________
stereo_net (StereoNet) [(None, None, None), (2, 233472, 6), (2, 4085752 Left[0][0]
Right[0][0]
cam_fx[0][0]
cam_baseline[0][0]
cam_proj_l[0][0]
cam_proj_r[0][0]
targets[0][0]
ious[0][0]
labels_map[0][0]
________________________________________________________________________________________________________________________
depth (PassThrough) (None, None, None) 0 stereo_net[0][0]
________________________________________________________________________________________________________________________
bbox_cls (PassThrough) (2, 233472, 6) 0 stereo_net[0][1]
________________________________________________________________________________________________________________________
bbox_reg (PassThrough) (2, 10, 3, 6) 0 stereo_net[0][2]
________________________________________________________________________________________________________________________
bbox_centerness (PassThrough) (2, 233472, 6) 0 stereo_net[0][3]
========================================================================================================================
Total params: 4,085,752
Trainable params: 4,085,480
Non-trainable params: 272
________________________________________________________________________________________________________________________
conv2d/kernel:0
batch_normalization/gamma:0
batch_normalization/beta:0
conv2d_1/kernel:0
batch_normalization_1/gamma:0
batch_normalization_1/beta:0
conv2d_2/kernel:0
batch_normalization_2/gamma:0
batch_normalization_2/beta:0
conv2d_4/kernel:0
batch_normalization_4/gamma:0
batch_normalization_4/beta:0
conv2d_5/kernel:0
batch_normalization_5/gamma:0
batch_normalization_5/beta:0
conv2d_3/kernel:0
batch_normalization_3/gamma:0
batch_normalization_3/beta:0
conv2d_6/kernel:0
batch_normalization_6/gamma:0
batch_normalization_6/beta:0
conv2d_7/kernel:0
batch_normalization_7/gamma:0
batch_normalization_7/beta:0
conv2d_8/kernel:0
batch_normalization_8/gamma:0
batch_normalization_8/beta:0
conv2d_9/kernel:0
batch_normalization_9/gamma:0
batch_normalization_9/beta:0
conv2d_11/kernel:0
group_normalization_1/gamma:0
group_normalization_1/beta:0
conv2d_12/kernel:0
group_normalization_2/gamma:0
group_normalization_2/beta:0
conv2d_10/kernel:0
group_normalization/gamma:0
group_normalization/beta:0
conv2d_13/kernel:0
group_normalization_3/gamma:0
group_normalization_3/beta:0
conv2d_14/kernel:0
group_normalization_4/gamma:0
group_normalization_4/beta:0
conv2d_15/kernel:0
group_normalization_5/gamma:0
group_normalization_5/beta:0
conv2d_16/kernel:0
group_normalization_6/gamma:0
group_normalization_6/beta:0
conv2d_17/kernel:0
group_normalization_7/gamma:0
group_normalization_7/beta:0
conv2d_18/kernel:0
group_normalization_8/gamma:0
group_normalization_8/beta:0
conv2d_20/kernel:0
group_normalization_10/gamma:0
group_normalization_10/beta:0
conv2d_21/kernel:0
group_normalization_11/gamma:0
group_normalization_11/beta:0
conv2d_19/kernel:0
group_normalization_9/gamma:0
group_normalization_9/beta:0
conv2d_22/kernel:0
group_normalization_12/gamma:0
group_normalization_12/beta:0
conv2d_23/kernel:0
group_normalization_13/gamma:0
group_normalization_13/beta:0
conv2d_24/kernel:0
group_normalization_14/gamma:0
group_normalization_14/beta:0
conv2d_25/kernel:0
group_normalization_15/gamma:0
group_normalization_15/beta:0
conv2d_26/kernel:0
group_normalization_16/gamma:0
group_normalization_16/beta:0
conv2d_27/kernel:0
group_normalization_17/gamma:0
group_normalization_17/beta:0
conv2d_28/kernel:0
group_normalization_18/gamma:0
group_normalization_18/beta:0
conv2d_29/kernel:0
group_normalization_19/gamma:0
group_normalization_19/beta:0
conv2d_30/kernel:0
group_normalization_20/gamma:0
group_normalization_20/beta:0
conv2d_31/kernel:0
group_normalization_21/gamma:0
group_normalization_21/beta:0
conv2d_33/kernel:0
group_normalization_23/gamma:0
group_normalization_23/beta:0
conv2d_34/kernel:0
group_normalization_24/gamma:0
group_normalization_24/beta:0
conv2d_32/kernel:0
group_normalization_22/gamma:0
group_normalization_22/beta:0
conv2d_35/kernel:0
group_normalization_25/gamma:0
group_normalization_25/beta:0
conv2d_36/kernel:0
group_normalization_26/gamma:0
group_normalization_26/beta:0
conv2d_37/kernel:0
group_normalization_27/gamma:0
group_normalization_27/beta:0
conv2d_38/kernel:0
group_normalization_28/gamma:0
group_normalization_28/beta:0
conv2d_39/kernel:0
group_normalization_29/gamma:0
group_normalization_29/beta:0
conv2d_40/kernel:0
group_normalization_30/gamma:0
group_normalization_30/beta:0
conv2d_41/kernel:0
group_normalization_31/gamma:0
group_normalization_31/beta:0
conv2d_42/kernel:0
group_normalization_32/gamma:0
group_normalization_32/beta:0
conv2d_43/kernel:0
group_normalization_33/gamma:0
group_normalization_33/beta:0
conv2d_44/kernel:0
conv3d/kernel:0
group_normalization_34/gamma:0
group_normalization_34/beta:0
conv3d_1/kernel:0
group_normalization_35/gamma:0
group_normalization_35/beta:0
conv3d_2/kernel:0
group_normalization_36/gamma:0
group_normalization_36/beta:0
conv3d_3/kernel:0
group_normalization_37/gamma:0
group_normalization_37/beta:0
conv3d_transpose/kernel:0
group_normalization_38/gamma:0
group_normalization_38/beta:0
conv3d_transpose_1/kernel:0
group_normalization_39/gamma:0
group_normalization_39/beta:0
conv3d_4/kernel:0
group_normalization_40/gamma:0
group_normalization_40/beta:0
conv3d_5/kernel:0
batch_normalization/moving_mean:0
batch_normalization/moving_variance:0
batch_normalization_1/moving_mean:0
batch_normalization_1/moving_variance:0
batch_normalization_2/moving_mean:0
batch_normalization_2/moving_variance:0
batch_normalization_4/moving_mean:0
batch_normalization_4/moving_variance:0
batch_normalization_5/moving_mean:0
batch_normalization_5/moving_variance:0
batch_normalization_3/moving_mean:0
batch_normalization_3/moving_variance:0
batch_normalization_6/moving_mean:0
batch_normalization_6/moving_variance:0
batch_normalization_7/moving_mean:0
batch_normalization_7/moving_variance:0
batch_normalization_8/moving_mean:0
batch_normalization_8/moving_variance:0
batch_normalization_9/moving_mean:0
batch_normalization_9/moving_variance:0
print(dir(model.layers[9]))
['CV_X_MAX', 'CV_X_MIN', 'CV_Y_MAX', 'CV_Y_MIN', 'CV_Z_MAX', 'CV_Z_MIN', 'GRID_SIZE', 'RPN3D_INPUT_DIM', 'VOXEL_X_SIZE', 'VOXEL_Y_SIZE', 'VOXEL_Z_SIZE', 'X_MAX', 'X_MIN', 'Y_MAX', 'Y_MIN', 'Z_MAX', 'Z_MIN', '_TF_MODULE_IGNORED_PROPERTIES', '__abstractmethods__', '__call__', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_abc_cache', '_abc_negative_cache', '_abc_negative_cache_version', '_abc_registry', '_activity_regularizer', '_add_trackable', '_add_variable_with_custom_getter', '_auto_track_sub_layers', '_autocast', '_autographed_call', '_build_input_shape', '_call_accepts_kwargs', '_call_arg_was_passed', '_call_fn_arg_defaults', '_call_fn_arg_positions', '_call_fn_args', '_call_full_argspec', '_callable_losses', '_cast_single_input', '_checkpoint_dependencies', '_clear_losses', '_compute_dtype', '_compute_dtype_object', '_dedup_weights', '_default_training_arg', '_deferred_dependencies', '_dtype', '_dtype_policy', '_dynamic', '_eager_losses', '_expects_mask_arg', '_expects_training_arg', '_flatten', '_flatten_layers', '_functional_construction_call', '_gather_children_attribute', '_gather_saveables_for_checkpoint', '_get_call_arg_value', '_get_existing_metric', '_get_input_masks', '_get_node_attribute_at_index', '_get_save_spec', '_get_trainable_state', '_handle_activity_regularization', '_handle_deferred_dependencies', '_handle_weight_regularization', '_inbound_nodes', '_inbound_nodes_value', '_infer_output_signature', '_init_call_fn_args', '_init_set_name', '_initial_weights', '_input_spec', '_instrument_layer_creation', '_instrumented_keras_api', '_instrumented_keras_layer_class', '_instrumented_keras_model_class', '_is_layer', 
'_keras_api_names', '_keras_api_names_v1', '_keras_tensor_symbolic_call', '_layers', '_list_extra_dependencies_for_serialization', '_list_functions_for_serialization', '_lookup_dependency', '_losses', '_map_resources', '_maybe_build', '_maybe_cast_inputs', '_maybe_create_attribute', '_maybe_initialize_trackable', '_metrics', '_metrics_lock', '_must_restore_from_config', '_name', '_name_based_attribute_restore', '_name_based_restores', '_name_scope', '_no_dependency', '_non_trainable_weights', '_obj_reference_counts', '_obj_reference_counts_dict', '_object_identifier', '_outbound_nodes', '_outbound_nodes_value', '_preload_simple_restoration', '_preserve_input_structure_in_config', '_restore_from_checkpoint_position', '_saved_model_inputs_spec', '_self_name_based_restores', '_self_saveable_object_factories', '_self_setattr_tracking', '_self_unconditional_checkpoint_dependencies', '_self_unconditional_deferred_dependencies', '_self_unconditional_dependency_names', '_self_update_uid', '_set_call_arg_value', '_set_connectivity_metadata', '_set_dtype_policy', '_set_mask_keras_history_checked', '_set_mask_metadata', '_set_save_spec', '_set_trainable_state', '_set_training_mode', '_setattr_tracking', '_should_cast_single_input', '_single_restoration_from_checkpoint_position', '_split_out_first_arg', '_stateful', '_supports_masking', '_symbolic_call', '_tf_api_names', '_tf_api_names_v1', '_thread_local', '_track_trackable', '_trackable_saved_model_saver', '_tracking_metadata', '_trainable', '_trainable_weights', '_unconditional_checkpoint_dependencies', '_unconditional_dependency_names', '_update_uid', '_updates', 'activity_regularizer', 'add_loss', 'add_metric', 'add_update', 'add_variable', 'add_weight', 'anchor_angles', 'apply', 'box_corner_parameters', 'build', 'built', 'call', 'cat_disp', 'cat_img_feature', 'cat_right_img_feature', 'centerness4class', 'cfg', 'class4angles', 'classif1', 'compute_dtype', 'compute_mask', 'compute_output_shape', 'compute_output_signature', 
'coord_rect', 'count_params', 'dispregression', 'downsample_disp', 'dres0', 'dtype', 'dtype_policy', 'dynamic', 'feature_extraction', 'fix_centerness_bug', 'from_config', 'get_clusterable_algorithm', 'get_clusterable_weights', 'get_config', 'get_input_at', 'get_input_mask_at', 'get_input_shape_at', 'get_losses_for', 'get_output_at', 'get_output_mask_at', 'get_output_shape_at', 'get_updates_for', 'get_weights', 'hg_cv', 'hg_firstconv', 'hg_rpn_conv', 'hg_rpn_conv3d', 'img_feature_attentionbydisp', 'inbound_nodes', 'input', 'input_mask', 'input_shape', 'input_spec', 'losses', 'maxdisp', 'metrics', 'name', 'name_scope', 'non_trainable_variables', 'non_trainable_weights', 'num_3dconvs', 'num_angles', 'num_classes', 'num_convs', 'outbound_nodes', 'output', 'output_mask', 'output_shape', 'rpn3d_conv_kernel', 'set_weights', 'stateful', 'submodules', 'supports_masking', 'trainable', 'trainable_variables', 'trainable_weights', 'updates', 'upsample0', 'valid_classes', 'variable_dtype', 'variables', 'voxel_attentionbydisp', 'weights', 'with_name_scope']
Hi @tisma, to tell the clustering algorithm what should be clustered in your layer, you need to look at the attributes that hold weights. This is advanced usage, so it may not be very convenient, but I would set a breakpoint to see where the weights are stored. For example, the MHA layer has 4 types of weights. To pass them for clustering, I would redefine my function like this:
def get_clusterable_weights_mha():
    return [('_query_dense.kernel', layer._query_dense.kernel),
            ('_key_dense.kernel', layer._key_dense.kernel),
            ('_value_dense.kernel', layer._value_dense.kernel),
            ('_output_dense.kernel', layer._output_dense.kernel)]
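The dotted names in the example above follow the attribute path from the outer layer down to each variable. Here is a minimal sketch of how such a path can be resolved and how the list could be built for several sublayers. It uses plain Python stand-ins instead of real Keras layers; resolve_attr, clusterable_kernels, and the stand-in objects are hypothetical. Whether a given tfmot version accepts dotted names for a custom layer should be checked against its cluster_wrapper.py.

```python
from functools import reduce
from types import SimpleNamespace


def resolve_attr(obj, dotted_path):
    """Follow a dotted attribute path such as '_query_dense.kernel'."""
    return reduce(getattr, dotted_path.split("."), obj)


def clusterable_kernels(layer, sublayer_names):
    """Build (name, weight) pairs for the kernels of the named sublayers."""
    pairs = []
    for sub in sublayer_names:
        path = f"{sub}.kernel"
        pairs.append((path, resolve_attr(layer, path)))
    return pairs


# Stand-in for a layer with nested sublayers (illustration only).
layer = SimpleNamespace(
    _query_dense=SimpleNamespace(kernel="Wq"),
    _key_dense=SimpleNamespace(kernel="Wk"),
)

pairs = clusterable_kernels(layer, ["_query_dense", "_key_dense"])
print(pairs)
```

For a layer such as stereo_net, the sublayer names would be the attributes under which the nested convolution layers are stored, which is what the breakpoint inspection is meant to reveal.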
Describe the bug
Problem with clustering the weights of a custom layer. When a layer implements ClusterableLayer, it should override get_clusterable_weights, but a later call of get_weight_from_layer causes an AttributeError.
System information
TensorFlow version (installed from source or binary): Installed with pip, tensorflow-gpu 2.6.0
TensorFlow Model Optimization version (installed from source or binary): Installed with pip, tensorflow-model-optimization 0.6.0
Python version: Python 3.6.9
Describe the expected behavior
Describe the current behavior
Code to reproduce the issue
The output is:
But if I run this snippet:
It will find the weight:
My assumption is that the implementation of get_weight_from_layer(self, weight_name) (https://github.com/tensorflow/model-optimization/blob/18e87d262e536c9a742aef700880e71b47a7f768/tensorflow_model_optimization/python/core/clustering/keras/cluster_wrapper.py#L144-L145) is incorrect.
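If the wrapper looks weights up by attribute name, the failure would follow directly from the names returned by get_clusterable_weights. The sketch below assumes a getattr-based lookup, which is an assumption about the linked lines and not a quote of them, and uses a plain Python stand-in for the layer:

```python
class FakeLayer:
    """Stand-in for a custom layer that stores its weight as `self.kernel`."""

    def __init__(self):
        self.kernel = [0.1, 0.2, 0.3]


def get_weight_from_layer(layer, weight_name):
    # Assumed lookup strategy: treat the name as an attribute of the layer.
    return getattr(layer, weight_name)


layer = FakeLayer()

# The plain attribute name resolves.
found = get_weight_from_layer(layer, "kernel")

# The full variable name is not an attribute, so the lookup fails.
try:
    get_weight_from_layer(layer, "my_custom_layer/kernel:0")
    failed = False
except AttributeError:
    failed = True

print(found, failed)
```

Under that assumption, the lookup itself behaves as designed and the error comes from passing weight.name instead of the attribute name.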