tech-srl / code2vec

TensorFlow code for the neural network presented in the paper: "code2vec: Learning Distributed Representations of Code"
https://code2vec.org
MIT License

Matrix size-incompatible error when using the sample model #174

Open Ytz-Ichi opened 1 year ago

Ytz-Ichi commented 1 year ago

Hello.

I got the following error when I tried to evaluate the sample model in order to compare it with my own dataset. I have searched but could not find any information that might lead to a solution. I would appreciate a reply as soon as possible, but any advice is welcome!

2023-01-16 14:50:02,379 INFO     
2023-01-16 14:50:02,379 INFO     
2023-01-16 14:50:02,379 INFO     ---------------------------------------------------------------------
2023-01-16 14:50:02,379 INFO     ---------------------------------------------------------------------
2023-01-16 14:50:02,379 INFO     ---------------------- Creating code2vec model ----------------------
2023-01-16 14:50:02,379 INFO     ---------------------------------------------------------------------
2023-01-16 14:50:02,379 INFO     ---------------------------------------------------------------------
2023-01-16 14:50:02,379 INFO     Checking number of examples ...
2023-01-16 14:50:02,380 INFO         Number of test examples: 368445
2023-01-16 14:50:02,380 INFO     ---------------------------------------------------------------------
2023-01-16 14:50:02,380 INFO     ----------------- Configuration - Hyper Parameters ------------------
2023-01-16 14:50:02,380 INFO     CODE_VECTOR_SIZE                          192
2023-01-16 14:50:02,380 INFO     CSV_BUFFER_SIZE                           104857600
2023-01-16 14:50:02,380 INFO     DEFAULT_EMBEDDINGS_SIZE                   64
2023-01-16 14:50:02,380 INFO     DL_FRAMEWORK                              tensorflow
2023-01-16 14:50:02,380 INFO     DROPOUT_KEEP_RATE                         0.75
2023-01-16 14:50:02,380 INFO     EXPORT_CODE_VECTORS                       False
2023-01-16 14:50:02,380 INFO     LOGS_PATH                                 None
2023-01-16 14:50:02,380 INFO     MAX_CONTEXTS                              200
2023-01-16 14:50:02,380 INFO     MAX_PATH_VOCAB_SIZE                       911417
2023-01-16 14:50:02,380 INFO     MAX_TARGET_VOCAB_SIZE                     261245
2023-01-16 14:50:02,380 INFO     MAX_TOKEN_VOCAB_SIZE                      1301136
2023-01-16 14:50:02,380 INFO     MAX_TO_KEEP                               10
2023-01-16 14:50:02,380 INFO     MODEL_LOAD_PATH                           models/java14_model/saved_model_iter8.release
2023-01-16 14:50:02,380 INFO     MODEL_SAVE_PATH                           None
2023-01-16 14:50:02,380 INFO     NUM_BATCHES_TO_LOG_PROGRESS               100
2023-01-16 14:50:02,380 INFO     NUM_TEST_EXAMPLES                         368445
2023-01-16 14:50:02,380 INFO     NUM_TRAIN_BATCHES_TO_EVALUATE             1800
2023-01-16 14:50:02,380 INFO     NUM_TRAIN_EPOCHS                          20
2023-01-16 14:50:02,380 INFO     NUM_TRAIN_EXAMPLES                        0
2023-01-16 14:50:02,380 INFO     PATH_EMBEDDINGS_SIZE                      64
2023-01-16 14:50:02,380 INFO     PREDICT                                   False
2023-01-16 14:50:02,380 INFO     READER_NUM_PARALLEL_BATCHES               6
2023-01-16 14:50:02,380 INFO     RELEASE                                   False
2023-01-16 14:50:02,380 INFO     SAVE_EVERY_EPOCHS                         1
2023-01-16 14:50:02,381 INFO     SAVE_T2V                                  None
2023-01-16 14:50:02,381 INFO     SAVE_W2V                                  None
2023-01-16 14:50:02,381 INFO     SEPARATE_OOV_AND_PAD                      False
2023-01-16 14:50:02,381 INFO     SHUFFLE_BUFFER_SIZE                       10000
2023-01-16 14:50:02,381 INFO     TARGET_EMBEDDINGS_SIZE                    192
2023-01-16 14:50:02,381 INFO     TEST_BATCH_SIZE                           512
2023-01-16 14:50:02,381 INFO     TEST_DATA_PATH                            data/java14m/java14m.test.c2v
2023-01-16 14:50:02,381 INFO     TOKEN_EMBEDDINGS_SIZE                     64
2023-01-16 14:50:02,381 INFO     TOP_K_WORDS_CONSIDERED_DURING_PREDICTION  10
2023-01-16 14:50:02,381 INFO     TRAIN_BATCH_SIZE                          512
2023-01-16 14:50:02,381 INFO     TRAIN_DATA_PATH_PREFIX                    None
2023-01-16 14:50:02,381 INFO     USE_TENSORBOARD                           False
2023-01-16 14:50:02,381 INFO     VERBOSE_MODE                              1
2023-01-16 14:50:02,381 INFO     _Config__logger                           <Logger code2vec (INFO)>
2023-01-16 14:50:02,381 INFO     context_vector_size                       192
2023-01-16 14:50:02,381 INFO     entire_model_load_path                    models/java14_model/saved_model_iter8.release__entire-model
2023-01-16 14:50:02,381 INFO     entire_model_save_path                    None
2023-01-16 14:50:02,381 INFO     is_loading                                True
2023-01-16 14:50:02,381 INFO     is_saving                                 False
2023-01-16 14:50:02,381 INFO     is_testing                                True
2023-01-16 14:50:02,381 INFO     is_training                               False
2023-01-16 14:50:02,381 INFO     model_load_dir                            models/java14_model
2023-01-16 14:50:02,381 INFO     model_weights_load_path                   models/java14_model/saved_model_iter8.release__only-weights
2023-01-16 14:50:02,381 INFO     model_weights_save_path                   None
2023-01-16 14:50:02,381 INFO     test_steps                                720
2023-01-16 14:50:02,381 INFO     train_data_path                           None
2023-01-16 14:50:02,381 INFO     train_steps_per_epoch                     0
2023-01-16 14:50:02,381 INFO     word_freq_dict_path                       None
2023-01-16 14:50:02,381 INFO     ---------------------------------------------------------------------
2023-01-16 14:50:02,381 INFO     Loading model vocabularies from: `models/java14_model/dictionaries.bin` ... 
2023-01-16 14:50:03,444 INFO     Done loading model vocabularies.
2023-01-16 14:50:03,913 INFO     Done creating code2vec model
2023-01-16 14:50:10.149944: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:357] MLIR V1 optimization pass is not enabled
2023-01-16 14:50:20,845 INFO     Initalized variables
2023-01-16 14:50:20,845 INFO     Loading model weights from: models/java14_model/saved_model_iter8.release
2023-01-16 14:50:21,740 INFO     Done loading model weights
2023-01-16 14:50:22,194 INFO     Starting evaluation
Traceback (most recent call last):
  File "/home/myPC/.local/lib/python3.10/site-packages/tensorflow/python/client/session.py", line 1378, in _do_call
    return fn(*args)
  File "/home/myPC/.local/lib/python3.10/site-packages/tensorflow/python/client/session.py", line 1361, in _run_fn
    return self._call_tf_sessionrun(options, feed_dict, fetch_list,
  File "/home/myPC/.local/lib/python3.10/site-packages/tensorflow/python/client/session.py", line 1454, in _call_tf_sessionrun
    return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) INVALID_ARGUMENT: Matrix size-incompatible: In[0]: [204800,192], In[1]: [384,384]
     [[{{node model/MatMul}}]]
     [[TopKV2/_25]]
  (1) INVALID_ARGUMENT: Matrix size-incompatible: In[0]: [204800,192], In[1]: [384,384]
     [[{{node model/MatMul}}]]
0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/myPC/ytz/code2vec/code2vec.py", line 31, in <module>
    eval_results = model.evaluate()
  File "/home/myPC/ytz/code2vec/tensorflow_model.py", line 158, in evaluate
    top_words, top_scores, original_names, code_vectors = self.sess.run(
  File "/home/myPC/.local/lib/python3.10/site-packages/tensorflow/python/client/session.py", line 968, in run
    result = self._run(None, fetches, feed_dict, options_ptr,
  File "/home/myPC/.local/lib/python3.10/site-packages/tensorflow/python/client/session.py", line 1191, in _run
    results = self._do_run(handle, final_targets, final_fetches,
  File "/home/myPC/.local/lib/python3.10/site-packages/tensorflow/python/client/session.py", line 1371, in _do_run
    return self._do_call(_run_fn, feeds, fetches, targets, options,
  File "/home/myPC/.local/lib/python3.10/site-packages/tensorflow/python/client/session.py", line 1397, in _do_call
    raise type(e)(node_def, op, message)  # pylint: disable=no-value-for-parameter
tensorflow.python.framework.errors_impl.InvalidArgumentError: Graph execution error:

Detected at node 'model/MatMul' defined at (most recent call last):
    File "/home/myPC/ytz/code2vec/code2vec.py", line 31, in <module>
      eval_results = model.evaluate()
    File "/home/myPC/ytz/code2vec/tensorflow_model.py", line 125, in evaluate
      self.eval_code_vectors = self._build_tf_test_graph(input_tensors)
    File "/home/myPC/ytz/code2vec/tensorflow_model.py", line 292, in _build_tf_test_graph
      code_vectors, attention_weights = self._calculate_weighted_contexts(
    File "/home/myPC/ytz/code2vec/tensorflow_model.py", line 252, in _calculate_weighted_contexts
      flat_embed = tf.tanh(tf.matmul(flat_embed, transform_param))  # (batch * max_contexts, dim * 3)
Node: 'model/MatMul'
Detected at node 'model/MatMul' defined at (most recent call last):
    File "/home/myPC/ytz/code2vec/code2vec.py", line 31, in <module>
      eval_results = model.evaluate()
    File "/home/myPC/ytz/code2vec/tensorflow_model.py", line 125, in evaluate
      self.eval_code_vectors = self._build_tf_test_graph(input_tensors)
    File "/home/myPC/ytz/code2vec/tensorflow_model.py", line 292, in _build_tf_test_graph
      code_vectors, attention_weights = self._calculate_weighted_contexts(
    File "/home/myPC/ytz/code2vec/tensorflow_model.py", line 252, in _calculate_weighted_contexts
      flat_embed = tf.tanh(tf.matmul(flat_embed, transform_param))  # (batch * max_contexts, dim * 3)
Node: 'model/MatMul'
2 root error(s) found.
  (0) INVALID_ARGUMENT: Matrix size-incompatible: In[0]: [204800,192], In[1]: [384,384]
     [[{{node model/MatMul}}]]
     [[TopKV2/_25]]
  (1) INVALID_ARGUMENT: Matrix size-incompatible: In[0]: [204800,192], In[1]: [384,384]
     [[{{node model/MatMul}}]]
0 successful operations.
0 derived errors ignored.

Original stack trace for 'model/MatMul':
  File "/home/myPC/ytz/code2vec/code2vec.py", line 31, in <module>
    eval_results = model.evaluate()
  File "/home/myPC/ytz/code2vec/tensorflow_model.py", line 125, in evaluate
    self.eval_code_vectors = self._build_tf_test_graph(input_tensors)
  File "/home/myPC/ytz/code2vec/tensorflow_model.py", line 292, in _build_tf_test_graph
    code_vectors, attention_weights = self._calculate_weighted_contexts(
  File "/home/myPC/ytz/code2vec/tensorflow_model.py", line 252, in _calculate_weighted_contexts
    flat_embed = tf.tanh(tf.matmul(flat_embed, transform_param))  # (batch * max_contexts, dim * 3)
  File "/home/myPC/.local/lib/python3.10/site-packages/tensorflow/python/util/traceback_utils.py", line 150, in error_handler
    return fn(*args, **kwargs)
  File "/home/myPC/.local/lib/python3.10/site-packages/tensorflow/python/util/dispatch.py", line 1176, in op_dispatch_handler
    return dispatch_target(*args, **kwargs)
  File "/home/myPC/.local/lib/python3.10/site-packages/tensorflow/python/ops/math_ops.py", line 3714, in matmul
    return gen_math_ops.mat_mul(
  File "/home/myPC/.local/lib/python3.10/site-packages/tensorflow/python/ops/gen_math_ops.py", line 6035, in mat_mul
    _, _, _op, _outputs = _op_def_library._apply_op_helper(
  File "/home/myPC/.local/lib/python3.10/site-packages/tensorflow/python/framework/op_def_library.py", line 795, in _apply_op_helper
    op = g._create_op_internal(op_type_name, inputs, dtypes=None,
  File "/home/myPC/.local/lib/python3.10/site-packages/tensorflow/python/framework/ops.py", line 3798, in _create_op_internal
    ret = Operation(

urialon commented 1 year ago

Hi @Ytz-Ichi, thank you for your interest in our work!

Can you please detail exactly what you ran? How did you process your own dataset? Does it work with our datasets?

Best, Uri

Ytz-Ichi commented 1 year ago

I cloned the code2vec repository again and everything worked. I don't know what caused the error, but it is solved now, so that's good :)
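
Although the cause was never identified in this thread, the shapes in the error message allow a quick arithmetic check. The sketch below is only one reading of the log, not a confirmed diagnosis. It assumes that the loaded weights expect a context width of 384 (the `[384,384]` operand, i.e. 3 x 128), while the configuration dump shows `context_vector_size` of 192 (3 x 64), so the context tensor would have been flattened with the wrong width. The helper `rows_after_flatten` is a hypothetical name introduced for illustration; the numeric values are copied from the log above.

```python
# Values copied from the configuration dump and the error message in this issue.
TEST_BATCH_SIZE = 512
MAX_CONTEXTS = 200
CONFIG_CONTEXT_VECTOR_SIZE = 192   # context_vector_size in the log (3 * 64)
ERROR_LHS_SHAPE = (204800, 192)    # In[0] of model/MatMul
ERROR_RHS_SHAPE = (384, 384)       # In[1] of model/MatMul


def rows_after_flatten(batch, contexts, actual_width, assumed_width):
    """Rows obtained when a (batch, contexts, actual_width) tensor
    is reshaped to (-1, assumed_width). Hypothetical helper."""
    total = batch * contexts * actual_width
    if total % assumed_width != 0:
        raise ValueError("element count is not divisible by the assumed width")
    return total // assumed_width


# Assumption (not confirmed in the thread): each context vector really has
# the width that the loaded transform matrix expects.
checkpoint_width = ERROR_RHS_SHAPE[0]

# If config and weights agreed, the left operand would have this many rows.
expected_rows = rows_after_flatten(
    TEST_BATCH_SIZE, MAX_CONTEXTS,
    CONFIG_CONTEXT_VECTOR_SIZE, CONFIG_CONTEXT_VECTOR_SIZE)

# Under the assumption, flattening with the configured width gives this instead.
observed_rows = rows_after_flatten(
    TEST_BATCH_SIZE, MAX_CONTEXTS,
    checkpoint_width, CONFIG_CONTEXT_VECTOR_SIZE)

print("rows if config and weights agree:", expected_rows)
print("rows under the mismatch assumption:", observed_rows)
print("matches In[0] of the error:", observed_rows == ERROR_LHS_SHAPE[0])
print("implied per-component embedding size:", checkpoint_width // 3)
```

If this reading is right, the embedding sizes in the local configuration (64) differed from those the sample model was trained with, and a fresh clone would have restored matching values. The thread does not confirm this, so treat it as a starting point for comparing `DEFAULT_EMBEDDINGS_SIZE` in a local checkout against an unmodified one.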