Status: Closed (closed by pltrdy 7 years ago)
This looks similar to #80 which was (I hope) resolved in the new release 1.0.11. Could you give the new release a try and report back? Thanks!
I pulled the latest version and it's still not working: same error.
I just re-generated the set and re-trained a small model and I'm not seeing this:
t2t_datagen --data_dir ~/t2t_data/ --tmp_dir ~/t2t_data/tmp/ --problem=lmptb_10k
rm -rf /tmp/tensor2tensor/* && t2t_trainer --data_dir ~/t2t_data/ --problems=lmptb_10k --model=attention_lm --hparams_set=attention_lm_base --hparams='batch_size=2048,hidden_size=128,filter_size=512' --train_steps=5000 --eval_steps=10
I get outputs like this:
Inference results OUTPUT: baseball that game of the long haul is the <unk> sport of the mean and the mean <unk> law caught up with the san francisco giants in the world series last weekend are generally deliberately <unk> out to travel prices for associations <EOS> in the otc market <EOS> <EOS> and the <unk> at st. paul carlos o'connell key first country 's recent slide <EOS> once <unk> <unk> <EOS> and arm <EOS> <EOS> in the middle unwanted takeover <unk> table <EOS> <unk> and in the night <EOS> and assassination <EOS> <unk> family 's <unk> party 's best last month <EOS> called floor traders say <EOS> <unk> said <EOS> even try to make the apparent <unk>
I0706 17:30:24.761709 32406 trainer_utils.py:569] Inference results OUTPUT: bulls say the market is an incredible bargain priced at only about N times estimated N earnings for stocks in the standard & poor 's N index <pad> <pad> <pad> <pad> <pad> <EOS> <EOS> more than N according to dow jones to the fraction <EOS> manuel noriega <EOS> <EOS> <EOS> <EOS> and tends to japanese investment and his own <EOS> <EOS> for a new york office <EOS> for mr. lawson <EOS> <EOS> inc <EOS> entirely as a lost its receipts affair <EOS> once he says <EOS> <EOS> <EOS> <EOS> <EOS> <EOS> to him to a security statistics <EOS> for uncertainties investment president who will probably has fallen <EOS> chief of events for his audience
Could you try the above and tell me if you see an error? What do you run to get it?
Hmm, I just did a fresh clone, ran setup then your commands and get an error:
InvalidArgumentError (see above for traceback): indices[75,22,0] = 10000 is not in [0, 10000)
[[Node: symbol_modality_10000_128/parallel_0/symbol_modality_10000_128/target_emb/Gather = Gather[Tindices=DT_INT32, Tparams=DT_FLOAT, validate_indices=true, _device="/job:localhost/replica:0/task:0/cpu:0"](symbol_modality_10000_128/parallel_0/symbol_modality_10000_128/target_emb/ConvertGradientToTensor_cc661786, symbol_modality_10000_128/parallel_0/symbol_modality_10000_128/target_emb/Squeeze)]]
For reference, I'm running this (note that I changed "_" to "-" in the command names):
t2t-datagen --data_dir ~/t2t_data/ --tmp_dir ~/t2t_data/tmp/ --problem=lmptb_10k
t2t-trainer --data_dir ~/t2t_data/ \
--problems=lmptb_10k \
--model=attention_lm \
--hparams_set=attention_lm_base \
--hparams='batch_size=2048,hidden_size=128,filter_size=512' \
--train_steps=5000 \
--eval_steps=10
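The `InvalidArgumentError` above says that an id of 10000 reached an embedding table whose valid rows are 0 to 9999, so at least one token id in the generated data equals the vocabulary size. A minimal sketch of a sanity check, assuming the examples have already been read into plain Python lists of integer ids; `find_out_of_range_ids` is a hypothetical helper for illustration, not part of tensor2tensor:

```python
from typing import Iterable, List, Sequence, Tuple


def find_out_of_range_ids(
    examples: Iterable[Sequence[int]], vocab_size: int
) -> List[Tuple[int, int, int]]:
    """Return (example_index, position, token_id) for ids outside [0, vocab_size)."""
    bad = []
    for ex_idx, ids in enumerate(examples):
        for pos, token_id in enumerate(ids):
            if not 0 <= token_id < vocab_size:
                bad.append((ex_idx, pos, token_id))
    return bad


# Toy data (made up): the second example contains id 10000, one past the
# last valid row of a 10000-row embedding table.
examples = [[5, 17, 9999], [3, 10000, 42]]
print(find_out_of_range_ids(examples, vocab_size=10000))  # [(1, 1, 10000)]
```

Any hit would suggest that the data files and the vocabulary size disagree, though this thread never confirms the root cause.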
This is very strange. How did you generate the data? I just re-ran it and it looks OK. What are your Python and TF versions?
Well, I tried again: I pulled the latest commit (963730e32fe06f24e7534e550504a087d7b5591e), then ran python setup.py install
I'm using Python 3.6.1 and tensorflow 1.1.0, and I'm now getting:
Traceback (most recent call last):
File "/home/pltrdy/.conda/envs/tensorflow/bin/t2t-trainer", line 4, in <module>
__import__('pkg_resources').run_script('tensor2tensor==1.0.14', 't2t-trainer')
File "/home/pltrdy/.conda/envs/tensorflow/lib/python3.6/site-packages/setuptools-27.2.0-py3.6.egg/pkg_resources/__init__.py", line 744, in run_script
File "/home/pltrdy/.conda/envs/tensorflow/lib/python3.6/site-packages/setuptools-27.2.0-py3.6.egg/pkg_resources/__init__.py", line 1506, in run_script
File "/home/pltrdy/.conda/envs/tensorflow/lib/python3.6/site-packages/tensor2tensor-1.0.14-py3.6.egg/EGG-INFO/scripts/t2t-trainer", line 67, in <module>
File "/home/pltrdy/.conda/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "/home/pltrdy/.conda/envs/tensorflow/lib/python3.6/site-packages/tensor2tensor-1.0.14-py3.6.egg/EGG-INFO/scripts/t2t-trainer", line 63, in main
File "/home/pltrdy/.conda/envs/tensorflow/lib/python3.6/site-packages/tensor2tensor-1.0.14-py3.6.egg/tensor2tensor/utils/trainer_utils.py", line 266, in run
File "/home/pltrdy/.conda/envs/tensorflow/lib/python3.6/site-packages/tensor2tensor-1.0.14-py3.6.egg/tensor2tensor/utils/trainer_utils.py", line 145, in experiment_fn
File "/home/pltrdy/.conda/envs/tensorflow/lib/python3.6/site-packages/tensor2tensor-1.0.14-py3.6.egg/tensor2tensor/utils/trainer_utils.py", line 157, in create_experiment
File "/home/pltrdy/.conda/envs/tensorflow/lib/python3.6/site-packages/tensor2tensor-1.0.14-py3.6.egg/tensor2tensor/utils/trainer_utils.py", line 195, in create_experiment_components
TypeError: __init__() got an unexpected keyword argument 'session_config'
This happens at training time.
Try tensorflow 1.2.1
Yes, my bad. Still, when I train and then decode, I get the same kind of error:
ValueError: The shape for foldl/while/Merge_1:0 is not an invariant for the loop. It enters the loop with shape (32, 78, 1, 1), but has shape (32, 79, 1, 1) after one iteration. Provide shape invariants using either the `shape_invariants` argument of tf.while_loop or set_shape() on the loop variables.
using this kind of training/decoding: https://gist.github.com/pltrdy/8d8ce9f4dbcf1793f992a7bab358b44d
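The `ValueError` above is TensorFlow's loop shape check: a loop variable entered with static shape (32, 78, 1, 1) and came out of the first iteration as (32, 79, 1, 1). As the message suggests, the invariant has to leave the growing dimension unspecified. The following is a plain-Python sketch of that compatibility rule only, with `None` standing for an unknown dimension; `is_compatible` is a hypothetical helper and does not call TensorFlow:

```python
from typing import Optional, Sequence


def is_compatible(shape: Sequence[int], invariant: Sequence[Optional[int]]) -> bool:
    """True if `shape` matches `invariant`, where None matches any size."""
    if len(shape) != len(invariant):
        return False
    return all(inv is None or inv == dim for dim, inv in zip(shape, invariant))


before = (32, 78, 1, 1)  # shape entering the loop (from the error message)
after = (32, 79, 1, 1)   # shape after one iteration (from the error message)

# Default invariant: the exact shape on entry, so growth is rejected.
print(is_compatible(after, before))            # False
# Relaxed invariant: the second dimension is left unknown.
print(is_compatible(after, (32, None, 1, 1)))  # True
```

In TensorFlow itself, the relaxed invariant would be passed through the `shape_invariants` argument of `tf.while_loop`, as the error message says. Where exactly that belongs in the decoding code is not established in this thread.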
Note that to get this far, one first has to apply the following patch:
diff --git a/tensor2tensor/data_generators/problem_hparams.py b/tensor2tensor/data_generators/problem_hparams.py
index 70b9dad..4164eb4 100644
--- a/tensor2tensor/data_generators/problem_hparams.py
+++ b/tensor2tensor/data_generators/problem_hparams.py
@@ -371,6 +371,7 @@ def lmptb_10k(model_hparams):
   vocabulary = text_encoder.TokenTextEncoder(
       os.path.join(model_hparams.data_dir, "lmptb_10k.vocab"))
   p.vocabulary = {
+      "inputs": vocabulary,
       "targets": vocabulary,
   }
   p.input_space_id = 3
otherwise you get:
Traceback (most recent call last):
File "/home/pltrdy/.conda/envs/tensorflow/bin/t2t-trainer", line 4, in <module>
__import__('pkg_resources').run_script('tensor2tensor==1.0.14', 't2t-trainer')
File "/home/pltrdy/.conda/envs/tensorflow/lib/python3.6/site-packages/pkg_resources/__init__.py", line 741, in run_script
self.require(requires)[0].run_script(script_name, ns)
File "/home/pltrdy/.conda/envs/tensorflow/lib/python3.6/site-packages/pkg_resources/__init__.py", line 1509, in run_script
exec(script_code, namespace, namespace)
File "/home/pltrdy/.conda/envs/tensorflow/lib/python3.6/site-packages/tensor2tensor-1.0.14-py3.6.egg/EGG-INFO/scripts/t2t-trainer", line 67, in <module>
File "/home/pltrdy/.conda/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "/home/pltrdy/.conda/envs/tensorflow/lib/python3.6/site-packages/tensor2tensor-1.0.14-py3.6.egg/EGG-INFO/scripts/t2t-trainer", line 63, in main
File "/home/pltrdy/.conda/envs/tensorflow/lib/python3.6/site-packages/tensor2tensor-1.0.14-py3.6.egg/tensor2tensor/utils/trainer_utils.py", line 266, in run
File "/home/pltrdy/.conda/envs/tensorflow/lib/python3.6/site-packages/tensor2tensor-1.0.14-py3.6.egg/tensor2tensor/utils/trainer_utils.py", line 575, in run_locally
File "/home/pltrdy/.conda/envs/tensorflow/lib/python3.6/site-packages/tensor2tensor-1.0.14-py3.6.egg/tensor2tensor/utils/trainer_utils.py", line 647, in decode_from_file
KeyError: 'inputs'
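The `KeyError: 'inputs'` above comes from decoding code looking up an `"inputs"` entry in a vocabulary dictionary that, for this problem, only defines `"targets"`. A minimal sketch of the failing lookup and of a defensive fallback; the dictionary contents and the `get_vocabulary` helper are illustrative assumptions, not tensor2tensor code:

```python
def get_vocabulary(vocabularies: dict, feature: str = "inputs"):
    """Return the vocabulary for `feature`, falling back to "targets"."""
    if feature in vocabularies:
        return vocabularies[feature]
    # Language-model problems may define only a target vocabulary.
    return vocabularies["targets"]


# Stand-in for the problem's vocabulary mapping (a string replaces the
# real encoder object).
vocabularies = {"targets": "lmptb_10k.vocab"}

try:
    vocabularies["inputs"]       # the direct lookup fails
except KeyError as err:
    print("KeyError:", err)      # KeyError: 'inputs'

print(get_vocabulary(vocabularies))  # lmptb_10k.vocab
```

The patch above takes the other route and registers the same vocabulary under both keys, so the direct lookup succeeds.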
Is this still a problem? We support TensorFlow 1.3 and the latest tensor2tensor is 1.2.4. Please reopen (and please provide Python, TensorFlow, and Tensor2Tensor versions as well as command-lines and outputs) if you are still having this issue as we've been unable to reproduce.
@lukaszkaiser "I just re-generated the set and re-trained a small model and I'm not seeing this..." Do you mind sharing what t2t-decoder command you used to generate the output?
May I ask what the inference input is? In the PTB problem, self._has_input=False, so how was the result in the picture generated? Thank you.
Hi,
While working on the PTB benchmark for language modeling (see https://github.com/tensorflow/tensor2tensor/pull/59), I wrote a small script for this use case (see the gist).
I trained the model; then, when decoding, I get the following error:
I'm not sure how to fix it, since I'm not very familiar with how trainer_utils.py and t2t_model.py work.