sjakati98 closed this issue 4 years ago.
Tokenizing and building vocabulary for code snippets and queries. This step may take several hours.
2019-11-28 00:36:44.687525: W tensorflow/core/graph/graph_constructor.cc:1265] Importing a graph with a lower producer version 26 into an existing graph with producer version 27. Shape inference will have run different parts of the graph with different producer versions.
Starting training run elmo-2019-11-28-00-32-43 of model ElmoModel with following hypers:
{'code_token_vocab_size': 10000, 'code_token_vocab_count_threshold': 10, 'code_token_embedding_size': 128, 'code_use_subtokens': False, 'code_mark_subtoken_end': True, 'code_max_num_tokens': 200, 'code_use_bpe': True, 'code_pct_bpe': 0.5, 'code_embedding_type': 'elmo', 'code_elmo_pool_mode': 'weighted_mean', 'query_token_vocab_size': 10000, 'query_token_vocab_count_threshold': 10, 'query_token_embedding_size': 128, 'query_use_subtokens': False, 'query_mark_subtoken_end': False, 'query_max_num_tokens': 30, 'query_use_bpe': True, 'query_pct_bpe': 0.5, 'query_embedding_type': 'elmo', 'query_elmo_pool_mode': 'weighted_mean', 'batch_size': 1000, 'optimizer': 'Adam', 'seed': 0, 'dropout_keep_rate': 0.9, 'learning_rate': 0.01, 'learning_rate_code_scale_factor': 1.0, 'learning_rate_query_scale_factor': 1.0, 'learning_rate_decay': 0.98, 'momentum': 0.85, 'gradient_clip': 1, 'loss': 'softmax', 'margin': 1, 'max_epochs': 2, 'patience': 5, 'fraction_using_func_name': 0.1, 'min_len_func_name_for_query': 12, 'query_random_token_frequency': 0.0}
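For reference, the hypers block above is printed as a plain Python dict literal, so it can be parsed back and sanity-checked with the standard library alone. A minimal sketch (abbreviated to a few of the keys actually shown in the log):

```python
import ast

# Abbreviated paste of the hypers line printed above.
hypers_str = ("{'code_max_num_tokens': 200, 'query_max_num_tokens': 30, "
              "'batch_size': 1000, 'max_epochs': 2, 'code_use_bpe': True}")

# ast.literal_eval safely parses the printed dict literal (no eval needed).
hypers = ast.literal_eval(hypers_str)

# Sanity checks mirroring the values in the log dump.
assert hypers['batch_size'] == 1000
assert hypers['code_max_num_tokens'] == 200
assert hypers['query_max_num_tokens'] == 30
print(hypers['batch_size'])
```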
Loading training and validation data.
Begin Training.
Training on 30000 go, 30000 java, 30000 javascript, 30000 php, 30000 python, 30000 ruby samples.
Validating on 8253 javascript, 26015 php, 2209 ruby, 23107 python, 15328 java, 14242 go samples.
==== Epoch 0 ====
Processed 0 samples. Loss so far: 0.0000. MRR so far: 0.0000
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: len(seq_lens) != input.dims(0), (185 vs. 0)
[[{{node code_encoder/python/elmo_encoder/module_apply_tokens/bilm/ReverseSequence}} = ReverseSequence[T=DT_FLOAT, Tlen=DT_INT32, batch_dim=0, seq_dim=1, _device="/job:localhost/replica:0/task:0/device:GPU:0"](code_encoder/python/elmo_encoder/module_apply_tokens/bilm/Reshape_1, _arg_code_encoder/python/elmo_encoder/tokens_lengths_0_18/_1397)]]
[[{{node mul_92/_2173}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_25634_mul_92", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "train.py", line 191, in <module>
run_and_debug(lambda: run(args), args['--debug'])
File "/usr/local/lib/python3.6/dist-packages/dpu_utils/utils/debughelper.py", line 21, in run_and_debug
func()
File "train.py", line 191, in <lambda>
run_and_debug(lambda: run(args), args['--debug'])
File "train.py", line 177, in run
parallelize=not(arguments['--sequential']))
File "train.py", line 89, in run_train
model_path = model.train(train_data, valid_data, azure_info_path, quiet=quiet, resume=resume)
File "/home/dev/src/models/model.py", line 787, in train
quiet=quiet)
File "/home/dev/src/models/model.py", line 725, in __run_epoch_in_batches
op_results = self.__sess.run(ops_to_run, feed_dict=batch_data_dict)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1328, in _do_run
run_metadata)
wandb: Waiting for W&B process to finish, PID 41
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: len(seq_lens) != input.dims(0), (185 vs. 0)
[[node code_encoder/python/elmo_encoder/module_apply_tokens/bilm/ReverseSequence (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_hub/native_module.py:561) = ReverseSequence[T=DT_FLOAT, Tlen=DT_INT32, batch_dim=0, seq_dim=1, _device="/job:localhost/replica:0/task:0/device:GPU:0"](code_encoder/python/elmo_encoder/module_apply_tokens/bilm/Reshape_1, _arg_code_encoder/python/elmo_encoder/tokens_lengths_0_18/_1397)]]
[[{{node mul_92/_2173}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_25634_mul_92", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Caused by op 'code_encoder/python/elmo_encoder/module_apply_tokens/bilm/ReverseSequence', defined at:
File "train.py", line 191, in <module>
run_and_debug(lambda: run(args), args['--debug'])
File "/usr/local/lib/python3.6/dist-packages/dpu_utils/utils/debughelper.py", line 21, in run_and_debug
func()
File "train.py", line 191, in <lambda>
run_and_debug(lambda: run(args), args['--debug'])
File "train.py", line 177, in run
parallelize=not(arguments['--sequential']))
File "train.py", line 72, in run_train
model.make_model(is_train=True)
File "/home/dev/src/models/model.py", line 231, in make_model
self._make_model(is_train=is_train)
File "/home/dev/src/models/model.py", line 260, in _make_model
language_encoders.append(self.__code_encoders[language].make_model(is_train=is_train))
File "/home/dev/src/encoders/elmo_encoder.py", line 68, in make_model
as_dict=True
File "/usr/local/lib/python3.6/dist-packages/tensorflow_hub/module.py", line 255, in __call__
name=name)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_hub/native_module.py", line 561, in create_apply_graph
import_scope=relative_scope_name)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py", line 1674, in import_meta_graph
meta_graph_or_file, clear_devices, import_scope, **kwargs)[0]
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py", line 1696, in _import_meta_graph_with_return_elements
**kwargs))
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/meta_graph.py", line 806, in import_scoped_meta_graph_with_return_elements
return_elements=return_elements)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/importer.py", line 442, in import_graph_def
_ProcessNewOps(graph)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/importer.py", line 234, in _ProcessNewOps
for new_op in graph._add_new_tf_operations(compute_devices=False): # pylint: disable=protected-access
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3440, in _add_new_tf_operations
for c_op in c_api_util.new_tf_operations(self)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3440, in <listcomp>
for c_op in c_api_util.new_tf_operations(self)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3299, in _create_op_from_tf_operation
ret = Operation(c_op, self)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
self._traceback = tf_stack.extract_stack()
InvalidArgumentError (see above for traceback): len(seq_lens) != input.dims(0), (185 vs. 0)
[[node code_encoder/python/elmo_encoder/module_apply_tokens/bilm/ReverseSequence (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_hub/native_module.py:561) = ReverseSequence[T=DT_FLOAT, Tlen=DT_INT32, batch_dim=0, seq_dim=1, _device="/job:localhost/replica:0/task:0/device:GPU:0"](code_encoder/python/elmo_encoder/module_apply_tokens/bilm/Reshape_1, _arg_code_encoder/python/elmo_encoder/tokens_lengths_0_18/_1397)]]
[[{{node mul_92/_2173}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_25634_mul_92", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
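For context on the error itself: TensorFlow's reverse_sequence op requires the seq_lens vector to have exactly one entry per row of the batch dimension. Here 185 lengths were fed while the input tensor's batch dimension was 0, i.e. an empty tensor reached the ELMo bilm. A pure-Python sketch of that shape invariant (the function name is illustrative, not from the CodeSearchNet code):

```python
def check_reverse_sequence_args(input_shape, seq_lens):
    """Mimic the shape check tf.reverse_sequence performs with batch_dim=0.

    Raises ValueError with a message shaped like the TF error when the
    number of sequence lengths does not match the batch dimension.
    """
    batch = input_shape[0]
    if len(seq_lens) != batch:
        raise ValueError(
            "len(seq_lens) != input.dims(0), (%d vs. %d)" % (len(seq_lens), batch))

# A well-formed call: 3 sequences, batch dimension 3 -- passes silently.
check_reverse_sequence_args((3, 10, 128), [4, 10, 7])

# The failure mode in the log: 185 lengths fed against an empty batch.
try:
    check_reverse_sequence_args((0, 10, 128), [5] * 185)
except ValueError as e:
    print(e)  # len(seq_lens) != input.dims(0), (185 vs. 0)
```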
I added the model, but I'm not sure about the batch sizing.
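One plausible cause, which is an assumption rather than something the log confirms: with six languages sharing one batch, a given language's sub-batch can end up with zero samples in a step while its length placeholder is still fed, producing exactly this empty-batch error in that language's encoder. A hedged sketch of guarding against that by skipping empty per-language sub-batches before building the feed dict (the helper and dict keys are hypothetical):

```python
def nonempty_language_batches(batches_by_language):
    """Yield (language, batch) pairs, skipping languages whose sub-batch
    is empty -- feeding an empty batch would make ops like
    reverse_sequence fail with len(seq_lens) != input.dims(0)."""
    for language, batch in batches_by_language.items():
        if len(batch.get('tokens', [])) == 0:
            continue  # nothing to encode for this language in this step
        yield language, batch

# Example: 'python' got no samples this step, 'go' got one.
batches = {
    'python': {'tokens': []},       # empty: would trigger the error
    'go': {'tokens': [[1, 2, 3]]},  # one sample: keep
}
kept = dict(nonempty_language_batches(batches))
print(sorted(kept))  # ['go']
```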