sjakati98 / CodeSearchNet

Datasets, tools, and benchmarks for representation learning of code.
https://arxiv.org/abs/1909.09436
MIT License

Add ElmoEncoder and ElmoModel #2

Closed sjakati98 closed 4 years ago

sjakati98 commented 4 years ago

Added the model. I'm not sure about the batch sizes.

seangtkelley commented 4 years ago
Tokenizing and building vocabulary for code snippets and queries.  This step may take several hours.
2019-11-28 00:36:44.687525: W tensorflow/core/graph/graph_constructor.cc:1265] Importing a graph with a lower producer version 26 into an existing graph with producer version 27. Shape inference will have run different parts of the graph with different producer versions.
Starting training run elmo-2019-11-28-00-32-43 of model ElmoModel with following hypers:
{'code_token_vocab_size': 10000, 'code_token_vocab_count_threshold': 10, 'code_token_embedding_size': 128, 'code_use_subtokens': False, 'code_mark_subtoken_end': True, 'code_max_num_tokens': 200, 'code_use_bpe': True, 'code_pct_bpe': 0.5, 'code_embedding_type': 'elmo', 'code_elmo_pool_mode': 'weighted_mean', 'query_token_vocab_size': 10000, 'query_token_vocab_count_threshold': 10, 'query_token_embedding_size': 128, 'query_use_subtokens': False, 'query_mark_subtoken_end': False, 'query_max_num_tokens': 30, 'query_use_bpe': True, 'query_pct_bpe': 0.5, 'query_embedding_type': 'elmo', 'query_elmo_pool_mode': 'weighted_mean', 'batch_size': 1000, 'optimizer': 'Adam', 'seed': 0, 'dropout_keep_rate': 0.9, 'learning_rate': 0.01, 'learning_rate_code_scale_factor': 1.0, 'learning_rate_query_scale_factor': 1.0, 'learning_rate_decay': 0.98, 'momentum': 0.85, 'gradient_clip': 1, 'loss': 'softmax', 'margin': 1, 'max_epochs': 2, 'patience': 5, 'fraction_using_func_name': 0.1, 'min_len_func_name_for_query': 12, 'query_random_token_frequency': 0.0}
Loading training and validation data.
Begin Training.
Training on 30000 go, 30000 java, 30000 javascript, 30000 php, 30000 python, 30000 ruby samples.
Validating on 8253 javascript, 26015 php, 2209 ruby, 23107 python, 15328 java, 14242 go samples.
==== Epoch 0 ====
Processed 0 samples. Loss so far: 0.0000.  MRR so far: 0.0000
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1334, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: len(seq_lens) != input.dims(0), (185 vs. 0)
         [[{{node code_encoder/python/elmo_encoder/module_apply_tokens/bilm/ReverseSequence}} = ReverseSequence[T=DT_FLOAT, Tlen=DT_INT32, batch_dim=0, seq_dim=1, _device="/job:localhost/replica:0/task:0/device:GPU:0"](code_encoder/python/elmo_encoder/module_apply_tokens/bilm/Reshape_1, _arg_code_encoder/python/elmo_encoder/tokens_lengths_0_18/_1397)]]
         [[{{node mul_92/_2173}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_25634_mul_92", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 191, in <module>
    run_and_debug(lambda: run(args), args['--debug'])
  File "/usr/local/lib/python3.6/dist-packages/dpu_utils/utils/debughelper.py", line 21, in run_and_debug
    func()
  File "train.py", line 191, in <lambda>
    run_and_debug(lambda: run(args), args['--debug'])
  File "train.py", line 177, in run
    parallelize=not(arguments['--sequential']))
  File "train.py", line 89, in run_train
    model_path = model.train(train_data, valid_data, azure_info_path, quiet=quiet, resume=resume)
  File "/home/dev/src/models/model.py", line 787, in train
    quiet=quiet)
  File "/home/dev/src/models/model.py", line 725, in __run_epoch_in_batches
    op_results = self.__sess.run(ops_to_run, feed_dict=batch_data_dict)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1328, in _do_run
    run_metadata)
wandb: Waiting for W&B process to finish, PID 41
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: len(seq_lens) != input.dims(0), (185 vs. 0)
         [[node code_encoder/python/elmo_encoder/module_apply_tokens/bilm/ReverseSequence (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_hub/native_module.py:561)  = ReverseSequence[T=DT_FLOAT, Tlen=DT_INT32, batch_dim=0, seq_dim=1, _device="/job:localhost/replica:0/task:0/device:GPU:0"](code_encoder/python/elmo_encoder/module_apply_tokens/bilm/Reshape_1, _arg_code_encoder/python/elmo_encoder/tokens_lengths_0_18/_1397)]]
         [[{{node mul_92/_2173}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_25634_mul_92", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Caused by op 'code_encoder/python/elmo_encoder/module_apply_tokens/bilm/ReverseSequence', defined at:
  File "train.py", line 191, in <module>
    run_and_debug(lambda: run(args), args['--debug'])
  File "/usr/local/lib/python3.6/dist-packages/dpu_utils/utils/debughelper.py", line 21, in run_and_debug
    func()
  File "train.py", line 191, in <lambda>
    run_and_debug(lambda: run(args), args['--debug'])
  File "train.py", line 177, in run
    parallelize=not(arguments['--sequential']))
  File "train.py", line 72, in run_train
    model.make_model(is_train=True)
  File "/home/dev/src/models/model.py", line 231, in make_model
    self._make_model(is_train=is_train)
  File "/home/dev/src/models/model.py", line 260, in _make_model
    language_encoders.append(self.__code_encoders[language].make_model(is_train=is_train))
  File "/home/dev/src/encoders/elmo_encoder.py", line 68, in make_model
    as_dict=True
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_hub/module.py", line 255, in __call__
    name=name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_hub/native_module.py", line 561, in create_apply_graph
    import_scope=relative_scope_name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py", line 1674, in import_meta_graph
    meta_graph_or_file, clear_devices, import_scope, **kwargs)[0]
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py", line 1696, in _import_meta_graph_with_return_elements
    **kwargs))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/meta_graph.py", line 806, in import_scoped_meta_graph_with_return_elements
    return_elements=return_elements)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/importer.py", line 442, in import_graph_def
    _ProcessNewOps(graph)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/importer.py", line 234, in _ProcessNewOps
    for new_op in graph._add_new_tf_operations(compute_devices=False):  # pylint: disable=protected-access
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3440, in _add_new_tf_operations
    for c_op in c_api_util.new_tf_operations(self)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3440, in <listcomp>
    for c_op in c_api_util.new_tf_operations(self)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3299, in _create_op_from_tf_operation
    ret = Operation(c_op, self)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): len(seq_lens) != input.dims(0), (185 vs. 0)
         [[node code_encoder/python/elmo_encoder/module_apply_tokens/bilm/ReverseSequence (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_hub/native_module.py:561)  = ReverseSequence[T=DT_FLOAT, Tlen=DT_INT32, batch_dim=0, seq_dim=1, _device="/job:localhost/replica:0/task:0/device:GPU:0"](code_encoder/python/elmo_encoder/module_apply_tokens/bilm/Reshape_1, _arg_code_encoder/python/elmo_encoder/tokens_lengths_0_18/_1397)]]
         [[{{node mul_92/_2173}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_25634_mul_92", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
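For context on the failure: TensorFlow's ReverseSequence op requires that the number of sequence lengths fed to it equal the batch dimension of its input, and here the ELMo encoder received 185 token lengths while the token tensor itself had batch dimension 0 (an empty batch for that language reached the module). A minimal NumPy sketch of that shape check (a hypothetical `reverse_sequence` helper for illustration, not the CodeSearchNet or TensorFlow code):

```python
import numpy as np

def reverse_sequence(inputs, seq_lens):
    """Reverse each row of `inputs` along axis 1 up to its length,
    mimicking the shape check TensorFlow's ReverseSequence performs."""
    if len(seq_lens) != inputs.shape[0]:
        raise ValueError(
            f"len(seq_lens) != input.dims(0), ({len(seq_lens)} vs. {inputs.shape[0]})")
    out = inputs.copy()
    for i, n in enumerate(seq_lens):
        out[i, :n] = inputs[i, :n][::-1]
    return out

# Normal case: a batch of 2 padded token sequences.
x = np.array([[1, 2, 3, 0],
              [4, 5, 0, 0]])
print(reverse_sequence(x, [3, 2]))  # each row reversed up to its length

# The failure mode from the log: 185 sequence lengths are fed,
# but an empty batch (dim 0) reaches the op.
empty_batch = np.zeros((0, 4))
try:
    reverse_sequence(empty_batch, [1] * 185)
except ValueError as e:
    print(e)  # len(seq_lens) != input.dims(0), (185 vs. 0)
```

This suggests checking how the per-language minibatches are assembled: if one language's sub-batch ends up empty while its `tokens_lengths` placeholder is still fed, the bilm graph hits exactly this mismatch.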