mlfpm / deepof

DeepLabCut-based data analysis package including pose estimation and representation-learning-mediated behavior recognition
MIT License

deep_unsupervised_embedding CensNetConv ValueError #50

Open micha-blip opened 9 hours ago

micha-blip commented 9 hours ago

Hi, I am getting an error while trying to run deep_unsupervised_embedding on my project. I suspect a mismatch between the expected and received input shapes for the CensNetConv layer, but I don't know how to fix it. I am using a custom labeling scheme; could this be the issue? The full error message is below. Thank you in advance!

`---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[211], line 1
----> 1 trained_model = my_project.deep_unsupervised_embedding(
      2     preprocessed_object=graph_preprocessed_coords,  # Change to preprocessed_coords to use non-graph embeddings
      3     adjacency_matrix=adj_matrix,
      4     embedding_model="VaDE",  # Can also be set to 'VQVAE' and 'Contrastive'
      5     epochs=10,
      6     encoder_type="recurrent",  # Can also be set to 'TCN' and 'transformer'
      7     n_components=10,
      8     latent_dim=6,
      9     batch_size=1024,
     10     verbose=True,  # Set to True to follow the training loop
     11     interaction_regularization=0.0,
     12     pretrained=False,  # Set to False to train a new model!
     13 )

File ~\anaconda3\envs\deepof39\lib\site-packages\deepof\data.py:1956, in Coordinates.deep_unsupervised_embedding(self, preprocessed_object, adjacency_matrix, embedding_model, encoder_type, batch_size, latent_dim, epochs, log_history, log_hparams, n_components, kmeans_loss, temperature, contrastive_similarity_function, contrastive_loss_function, beta, tau, output_path, pretrained, save_checkpoints, save_weights, input_type, run, kl_annealing_mode, kl_warmup, reg_cat_clusters, recluster, interaction_regularization, **kwargs)
   1939 pretrained = os.path.join(
   1940     pretrained_path,
   1941     (
   (...)
   1952     ),
   1953 )
   1955 try:
-> 1956     trained_models = deepof.model_utils.embedding_model_fitting(
   1957         preprocessed_object=preprocessed_object,
   1958         adjacency_matrix=adjacency_matrix,
   1959         embedding_model=embedding_model,
   1960         encoder_type=encoder_type,
   1961         batch_size=batch_size,
   1962         latent_dim=latent_dim,
   1963         epochs=epochs,
   1964         log_history=log_history,
   1965         log_hparams=log_hparams,
   1966         n_components=n_components,
   1967         kmeans_loss=kmeans_loss,
   1968         temperature=temperature,
   1969         contrastive_similarity_function=contrastive_similarity_function,
   1970         contrastive_loss_function=contrastive_loss_function,
   1971         beta=beta,
   1972         tau=tau,
   1973         output_path=os.path.join(
   1974             self._project_path,
   1975             self._project_name,
   1976             output_path,
   1977             "Trained_models",
   1978         ),
   1979         pretrained=pretrained,
   1980         save_checkpoints=save_checkpoints,
   1981         save_weights=save_weights,
   1982         input_type=input_type,
   1983         run=run,
   1984         kl_annealing_mode=kl_annealing_mode,
   1985         kl_warmup=kl_warmup,
   1986         reg_cat_clusters=reg_cat_clusters,
   1987         recluster=recluster,
   1988         interaction_regularization=interaction_regularization,
   1989         **kwargs,
   1990     )
   1991 except IndexError:
   1992     raise ValueError(
   1993         "No pretrained model found for the given parameters. Please train a model first."
   1994     )

File ~\anaconda3\envs\deepof39\lib\site-packages\deepof\model_utils.py:1304, in embedding_model_fitting(preprocessed_object, adjacency_matrix, embedding_model, encoder_type, batch_size, latent_dim, epochs, log_history, log_hparams, n_components, output_path, kmeans_loss, pretrained, save_checkpoints, save_weights, input_type, kl_annealing_mode, kl_warmup, reg_cat_clusters, recluster, temperature, contrastive_similarity_function, contrastive_loss_function, beta, tau, interaction_regularization, run, **kwargs)
   1299     ae_full_model.optimizer = tf.keras.optimizers.Nadam(
   1300         learning_rate=1e-4, clipvalue=0.75
   1301     )
   1303 elif embedding_model == "VaDE":
-> 1304     ae_full_model = deepof.models.VaDE(
   1305         input_shape=X_train.shape,
   1306         edge_feature_shape=a_train.shape,
   1307         adjacency_matrix=adjacency_matrix,
   1308         batch_size=batch_size,
   1309         latent_dim=latent_dim,
   1310         use_gnn=len(preprocessed_object) == 6,
   1311         kl_annealing_mode=kl_annealing_mode,
   1312         kl_warmup_epochs=kl_warmup,
   1313         montecarlo_kl=100,
   1314         n_components=n_components,
   1315         reg_cat_clusters=reg_cat_clusters,
   1316         encoder_type=encoder_type,
   1317         interaction_regularization=interaction_regularization,
   1318     )
   1320 elif embedding_model == "Contrastive":
   1321     ae_full_model = deepof.models.Contrastive(
   1322         input_shape=X_train.shape,
   1323         edge_feature_shape=a_train.shape,
   (...)
   1333         tau=tau,
   1334     )

File ~\anaconda3\envs\deepof39\lib\site-packages\deepof\models.py:1571, in VaDE.__init__(self, input_shape, edge_feature_shape, adjacency_matrix, latent_dim, use_gnn, n_components, batch_size, kl_annealing_mode, kl_warmup_epochs, montecarlo_kl, kmeans_loss, reg_cat_clusters, reg_cluster_variance, encoder_type, interaction_regularization, **kwargs)
   1568 self.interaction_regularization = interaction_regularization
   1570 # Define VaDE model
-> 1571 self.encoder, self.decoder, self.grouper, self.vade = get_vade(
   1572     input_shape=self.seq_shape,
   1573     edge_feature_shape=self.edge_feature_shape,
   1574     adjacency_matrix=self.adjacency_matrix,
   1575     n_components=self.n_components,
   1576     latent_dim=self.latent_dim,
   1577     use_gnn=use_gnn,
   1578     batch_size=self.batch_size,
   1579     kl_warmup=self.kl_warmup,
   1580     kl_annealing_mode=self.kl_annealing_mode,
   1581     mc_kl=self.mc_kl,
   1582     kmeans_loss=self.kmeans,
   1583     reg_cluster_variance=self.reg_cluster_variance,
   1584     encoder_type=self.encoder_type,
   1585     interaction_regularization=self.interaction_regularization,
   1586 )
   1588 # Propagate the optimizer to all relevant sub-models, to enable metric annealing
   1589 self.vade.optimizer = self.optimizer

File ~\anaconda3\envs\deepof39\lib\site-packages\deepof\models.py:1365, in get_vade(input_shape, edge_feature_shape, adjacency_matrix, latent_dim, use_gnn, n_components, batch_size, kl_warmup, kl_annealing_mode, mc_kl, kmeans_loss, reg_cluster_variance, encoder_type, interaction_regularization)
   1339 """Build a Gaussian mixture variational autoencoder (VaDE) model, adapted to the DeepOF setting.
   1340
   1341 Args:
   (...)
   1362
   1363 """
   1364 if encoder_type == "recurrent":
-> 1365     encoder = get_recurrent_encoder(
   1366         input_shape=input_shape[1:],
   1367         adjacency_matrix=adjacency_matrix,
   1368         edge_feature_shape=edge_feature_shape[1:],
   1369         latent_dim=latent_dim,
   1370         use_gnn=use_gnn,
   1371         interaction_regularization=interaction_regularization,
   1372     )
   1373     decoder = get_recurrent_decoder(
   1374         input_shape=input_shape[1:], latent_dim=latent_dim
   1375     )
   1377 elif encoder_type == "TCN":

File ~\anaconda3\envs\deepof39\lib\site-packages\deepof\utils.py:58, in _suppress_warning.<locals>.somedec_outer.<locals>.somedec_inner(*args, **kwargs)
     56 for k in range(0, len(warn_messages)):
     57     warnings.filterwarnings("ignore", message=warn_messages[k])
---> 58 response = fn(*args, **kwargs)
     59 return response

File ~\anaconda3\envs\deepof39\lib\site-packages\deepof\models.py:138, in get_recurrent_encoder(input_shape, edge_feature_shape, adjacency_matrix, latent_dim, use_gnn, gru_unroll, bidirectional_merge, interaction_regularization)
    133 laplacian, edge_laplacian, incidence = spatial_block.preprocess(
    134     adjacency_matrix
    135 )
    137 # Get and concatenate node and edge embeddings
--> 138 x_nodes, x_edges = spatial_block(
    139     [encoder, (laplacian, edge_laplacian, incidence), a_encoder], mask=None
    140 )
    142 x_nodes = tf.reshape(
    143     x_nodes,
    144     [-1, adjacency_matrix.shape[-1] * latent_dim],
    145 )
    147 x_edges = tf.reshape(
    148     x_edges,
    149     [-1, edge_feature_shape[-1] * latent_dim],
    150 )

File ~\anaconda3\envs\deepof39\lib\site-packages\keras\src\utils\traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
     67     filtered_tb = _process_traceback_frames(e.__traceback__)
     68     # To get the full stack trace, call:
     69     # tf.debugging.disable_traceback_filtering()
---> 70     raise e.with_traceback(filtered_tb) from None
     71 finally:
     72     del filtered_tb

File ~\AppData\Local\Temp\__autograph_generated_filef7qpp_l5.py:14, in outer_factory.<locals>.inner_factory.<locals>.tf___inner_check_dtypes(inputs, **kwargs)
     12 try:
     13     do_return = True
---> 14     retval_ = ag__.converted_call(ag__.ld(call), (ag__.ld(inputs),), dict(**ag__.ld(kwargs)), fscope)
     15 except:
     16     do_return = False

File ~\AppData\Local\Temp\__autograph_generated_file0aae59tb.py:10, in outer_factory.<locals>.inner_factory.<locals>.tf__call(self, inputs, mask)
      8 do_return = False
      9 retval_ = ag__.UndefinedReturnValue()
---> 10 node_features = ag__.converted_call(ag__.ld(self)._propagate_nodes, (ag__.ld(inputs),), dict(mask=ag__.ld(mask)), fscope)
     11 edge_features = ag__.converted_call(ag__.ld(self)._propagate_edges, (ag__.ld(inputs),), dict(mask=ag__.ld(mask)), fscope)
     12 try:

File ~\AppData\Local\Temp\__autograph_generated_filexq0be7az.py:21, in outer_factory.<locals>.inner_factory.<locals>.tf___propagate_nodes(self, inputs, mask)
     19 weighted_edge_features = ag__.converted_call(ag__.ld(tf).squeeze, (ag__.ld(weighted_edge_features),), dict(axis=[-1]), fscope)
     20 weighted_edge_features = ag__.converted_call(ag__.ld(tf).linalg.diag, (ag__.ld(weighted_edge_features),), None, fscope)
---> 21 weighted_edge_features = ag__.converted_call(ag__.ld(ops).modal_dot, (ag__.ld(incidence), ag__.ld(weighted_edge_features)), None, fscope)
     22 weighted_edge_features = ag__.converted_call(ag__.ld(ops).modal_dot, (ag__.ld(weighted_edge_features), ag__.ld(incidence)), dict(transpose_b=True), fscope)
     23 node_adjacency = ag__.ld(weighted_edge_features) * ag__.ld(laplacian)

File ~\AppData\Local\Temp\__autograph_generated_fileophrjtc0.py:164, in outer_factory.<locals>.inner_factory.<locals>.tf__modal_dot(a, b, transpose_a, transpose_b)
    162 a_shape = ag__.Undefined('a_shape')
    163 output = ag__.Undefined('output')
--> 164 ag__.if_stmt(ag__.ld(a_ndim) == ag__.ld(b_ndim), if_body_4, else_body_4, get_state_4, set_state_4, ('do_return', 'retval_'), 2)
    165 return fscope.ret(retval_, do_return)

File ~\AppData\Local\Temp\__autograph_generated_fileophrjtc0.py:159, in outer_factory.<locals>.inner_factory.<locals>.tf__modal_dot.<locals>.else_body_4()
    157 a_shape = ag__.Undefined('a_shape')
    158 output = ag__.Undefined('output')
--> 159 ag__.if_stmt(ag__.ld(a_ndim) == 2, if_body_3, else_body_3, get_state_3, set_state_3, ('do_return', 'retval_'), 2)

File ~\AppData\Local\Temp\__autograph_generated_fileophrjtc0.py:114, in outer_factory.<locals>.inner_factory.<locals>.tf__modal_dot.<locals>.else_body_4.<locals>.if_body_3()
    112 try:
    113     do_return = True
--> 114     retval_ = ag__.converted_call(ag__.ld(mixed_mode_dot), (ag__.ld(a), ag__.ld(b)), None, fscope)
    115 except:
    116     do_return = False

File ~\AppData\Local\Temp\__autograph_generated_filezmoi1tmd.py:22, in outer_factory.<locals>.inner_factory.<locals>.tf__mixed_mode_dot(a, b)
     20 b_t = ag__.converted_call(ag__.ld(ops).transpose, (ag__.ld(b), (1, 2, 0)), None, fscope)
     21 b_t = ag__.converted_call(ag__.ld(ops).reshape, (ag__.ld(b_t), ag__.converted_call(ag__.ld(tf).stack, ((ag__.ld(b_shp)[1], -1),), None, fscope)), None, fscope)
---> 22 output = ag__.converted_call(ag__.ld(dot), (ag__.ld(a), ag__.ld(b_t)), None, fscope)
     23 output = ag__.converted_call(ag__.ld(ops).reshape, (ag__.ld(output), ag__.converted_call(ag__.ld(tf).stack, ((ag__.ld(a_shp)[0], ag__.ld(b_shp)[2], -1),), None, fscope)), None, fscope)
     24 output = ag__.converted_call(ag__.ld(ops).transpose, (ag__.ld(output), (2, 0, 1)), None, fscope)

File ~\AppData\Local\Temp\__autograph_generated_fileqvi9eqq8.py:180, in outer_factory.<locals>.inner_factory.<locals>.tf__dot(a, b)
    178     pass
    179 out = ag__.Undefined('out')
--> 180 ag__.if_stmt(ag__.not_(do_return), if_body_7, else_body_7, get_state_7, set_state_7, ('do_return', 'retval_', 'a', 'b'), 2)
    181 return fscope.ret(retval_, do_return)

File ~\AppData\Local\Temp\__autograph_generated_fileqvi9eqq8.py:174, in outer_factory.<locals>.inner_factory.<locals>.tf__dot.<locals>.if_body_7()
    172     raise
    173 out = ag__.Undefined('out')
--> 174 ag__.if_stmt(ag__.or_(lambda : ag__.ld(a_is_sparse), lambda : ag__.ld(b_is_sparse)), if_body_6, else_body_6, get_state_6, set_state_6, ('do_return', 'retval_'), 2)

File ~\AppData\Local\Temp\__autograph_generated_fileqvi9eqq8.py:169, in outer_factory.<locals>.inner_factory.<locals>.tf__dot.<locals>.if_body_7.<locals>.else_body_6()
    167 try:
    168     do_return = True
--> 169     retval_ = ag__.converted_call(ag__.ld(tf).matmul, (ag__.ld(a), ag__.ld(b)), None, fscope)
    170 except:
    171     do_return = False

ValueError: Exception encountered when calling layer "cens_net_conv_20" (type CensNetConv).

in user code:

File "C:\Users\n2800\anaconda3\envs\deepof39\lib\site-packages\spektral\layers\convolutional\conv.py", line 228, in _inner_check_dtypes  *
    return call(inputs, **kwargs)
File "C:\Users\n2800\anaconda3\envs\deepof39\lib\site-packages\spektral\layers\convolutional\censnet_conv.py", line 226, in call  *
    node_features = self._propagate_nodes(inputs, mask=mask)
File "C:\Users\n2800\anaconda3\envs\deepof39\lib\site-packages\spektral\layers\convolutional\censnet_conv.py", line 189, in _propagate_nodes  *
    weighted_edge_features = ops.modal_dot(incidence, weighted_edge_features)
File "C:\Users\n2800\anaconda3\envs\deepof39\lib\site-packages\spektral\layers\ops\matmul.py", line 134, in modal_dot  *
    return mixed_mode_dot(a, b)
File "C:\Users\n2800\anaconda3\envs\deepof39\lib\site-packages\spektral\layers\ops\matmul.py", line 75, in mixed_mode_dot  *
    output = dot(a, b_t)
File "C:\Users\n2800\anaconda3\envs\deepof39\lib\site-packages\spektral\layers\ops\matmul.py", line 58, in dot  *
    return tf.matmul(a, b)

ValueError: Dimensions must be equal, but are 24 and 20 for '{{node cens_net_conv_20/MatMul_1}} = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false](cens_net_conv_20/Shape/528037, cens_net_conv_20/Reshape)' with input shapes: [22,24], [20,?].

Call arguments received by layer "cens_net_conv_20" (type CensNetConv):
  • inputs=['tf.Tensor(shape=(None, 22, 12), dtype=float32)', ('tf.Tensor(shape=(22, 22), dtype=float32)', 'tf.Tensor(shape=(24, 24), dtype=float32)', 'tf.Tensor(shape=(22, 24), dtype=float32)'), 'tf.Tensor(shape=(None, 20, 12), dtype=float32)']
  • mask=None`

NoCreativeIdeaForGoodUserName commented 6 hours ago

Yes, this appears to be a shape mismatch. You are using essentially the same configuration as in the tutorial, and you correctly disabled the pretrained option (since you are training a new model with a different labeling scheme). In other words, the call arguments themselves look correct.
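For reference, the shapes in the error message can be read as follows, assuming the layer uses the usual CensNet input layout (node features, then the Laplacian, edge Laplacian and incidence matrix, then edge features): the incidence matrix is 22 × 24, which suggests 22 nodes and 24 edges, while the edge-feature tensor has only 20 entries along its edge axis. Below is a minimal sketch of a sanity check along those lines, using only NumPy. `count_undirected_edges` and `check_graph_shapes` are hypothetical helpers, not part of DeepOF or Spektral, and the toy graph is made up purely to reproduce the 22/24/20 numbers. It also assumes a symmetric adjacency matrix without self-loops.

```python
import numpy as np


def count_undirected_edges(adjacency):
    """Count edges in a symmetric adjacency matrix, ignoring self-loops."""
    adjacency = np.asarray(adjacency)
    # Each undirected edge appears twice in a symmetric matrix,
    # so only the strict upper triangle is counted.
    return int(np.count_nonzero(np.triu(adjacency, k=1)))


def check_graph_shapes(adjacency, node_features, edge_features):
    """Return a list of shape problems (empty if everything is consistent).

    Assumes features are laid out as (batch, n_nodes_or_edges, n_features).
    """
    problems = []
    n_nodes = np.asarray(adjacency).shape[-1]
    n_edges = count_undirected_edges(adjacency)
    if node_features.shape[-2] != n_nodes:
        problems.append(
            f"adjacency has {n_nodes} nodes, node features have {node_features.shape[-2]}"
        )
    if edge_features.shape[-2] != n_edges:
        problems.append(
            f"adjacency has {n_edges} edges, edge features have {edge_features.shape[-2]}"
        )
    return problems


# Toy graph: a ring of 22 nodes (22 edges) plus two extra connections = 24 edges.
n = 22
adj = np.zeros((n, n))
for i in range(n):
    j = (i + 1) % n
    adj[i, j] = adj[j, i] = 1
for i, j in [(0, 2), (0, 3)]:
    adj[i, j] = adj[j, i] = 1

# Dummy feature tensors with the same node/edge counts as in the error message.
problems = check_graph_shapes(adj, np.zeros((1, 22, 12)), np.zeros((1, 20, 12)))
for p in problems:
    print(p)
```

If a check like this reports a mismatch on real data, the adjacency matrix and the preprocessed edge features were probably built from different sets of connections, which would be consistent with the 24 vs. 20 in the error.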

I assume you already ran the training with the tutorial examples and it worked? Depending on how "custom" your labeling scheme is, it could be part of the issue. Did you plot the adjacency matrix, and did it look as you expected? I would need to know a bit more about your dataset to be able to help.
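Besides plotting, it can help to list the connections of the adjacency matrix by body-part name and compare them against the intended labeling scheme. A small sketch of that idea follows; `list_edges` is a hypothetical helper, and the labels and matrix are invented for illustration rather than taken from the project above.

```python
import numpy as np


def list_edges(adjacency, labels):
    """Return (label_a, label_b) pairs for every edge in the upper triangle."""
    adjacency = np.asarray(adjacency)
    if adjacency.shape != (len(labels), len(labels)):
        raise ValueError("adjacency must be square and match the number of labels")
    # Only the strict upper triangle is scanned, so each undirected
    # edge is reported once and self-loops are skipped.
    rows, cols = np.nonzero(np.triu(adjacency, k=1))
    return [(labels[r], labels[c]) for r, c in zip(rows, cols)]


# Illustrative four-point chain; replace with the real labels and matrix.
labels = ["nose", "spine", "tail_base", "tail_tip"]
adj = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])

edges = list_edges(adj, labels)
for a, b in edges:
    print(f"{a} -- {b}")
print(f"{len(edges)} edges in total")
```

A missing or unexpected pair in such a listing would point to where the custom scheme and the graph definition disagree.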