Open vhuytdt opened 6 years ago
@vhuytdt Can you send me the code with changes you did? It's hard to pinpoint this with just the error
Hey @thushv89, I fixed it. The problem was that the word embeddings I use don't have the same vocabulary size (dimension) as the encoder and decoder. Thank you for the article on NMT with TensorFlow. <3
I built the word2vec embeddings following http://adventuresinmachinelearning.com/word2vec-tutorial-tensorflow/. If I use pre-trained word embeddings, should I write a reverse dictionary, and will it be used in the encoder and decoder?
My data: IWSLT'15 English-Vietnamese data [Small]
Train (133K sentence pairs):
[https://nlp.stanford.edu/projects/nmt/data/iwslt15.en-vi/train.en] encoder
[https://nlp.stanford.edu/projects/nmt/data/iwslt15.en-vi/train.vi] decoder
Vocabularies (top 50K frequent words):
[https://nlp.stanford.edu/projects/nmt/data/iwslt15.en-vi/vocab.en] encoder
[https://nlp.stanford.edu/projects/nmt/data/iwslt15.en-vi/vocab.vi] decoder
Thank you @thushv89
Hi @vhuytdt ,
Are you asking whether you should write the reverse dictionary if you use pre-trained embeddings? To answer that: if you use pre-trained embeddings, make sure the vocabulary in the pre-trained embeddings matches the vocabulary provided in IWSLT'15 well. For the words that don't match, you can either use randomly initialized vectors and train them jointly with your model, or replace those words with a special token (such as <unk>).
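A minimal sketch of that idea, assuming a `pretrained` dict (word -> vector) produced by your word2vec run; the helper name, file name, and variable names below are hypothetical, not from the tutorial code:

```python
import numpy as np
import tensorflow as tf

embedding_size = 300

def build_embedding_matrix(vocab, pretrained, embedding_size):
    """Rows for words found in `pretrained` are copied over; the rest stay
    randomly initialized and get trained jointly with the NMT model."""
    matrix = np.random.uniform(-1.0, 1.0,
                               size=(len(vocab), embedding_size)).astype(np.float32)
    for i, word in enumerate(vocab):
        vec = pretrained.get(word)
        if vec is not None:
            matrix[i] = vec  # keep the pre-trained vector for matching words
    return matrix

# Example (assuming vocab.en has been downloaded locally):
# enc_vocab = [line.strip() for line in open('vocab.en')]
# enc_embeddings = tf.Variable(build_embedding_matrix(enc_vocab, pretrained, embedding_size),
#                              name='enc_embeddings')  # trainable by default
```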
Hi @thushv89,
The train.en vocabulary size is not the same as the train.vi vocabulary size.
Here :
vocab.en -> 15xxx words.
vocab.vi -> 7xxx words.
How should I choose vocabulary_size here? => embeddings = tf.Variable(tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0)) (see the sketch after this comment)
Does the word2vec loss affect the encoder and decoder?
At the moment I am using TensorFlow to train an NMT model on comics.
Can you give me your Skype ID? Many thanks.
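Regarding the vocabulary-size question above: since the source and target vocabularies differ, a common approach is to give the encoder and decoder each their own embedding matrix, sized from its own vocabulary file. A rough sketch under that assumption (file paths and variable names are illustrative):

```python
import tensorflow as tf

embedding_size = 300

# Size each embedding matrix from its own vocabulary file instead of
# using a single shared vocabulary_size for both sides.
enc_vocab = [line.strip() for line in open('vocab.en')]
dec_vocab = [line.strip() for line in open('vocab.vi')]

enc_embeddings = tf.Variable(
    tf.random_uniform([len(enc_vocab), embedding_size], -1.0, 1.0),
    name='enc_embeddings')
dec_embeddings = tf.Variable(
    tf.random_uniform([len(dec_vocab), embedding_size], -1.0, 1.0),
    name='dec_embeddings')

# The encoder looks up source ids in enc_embeddings and the decoder looks up
# target ids in dec_embeddings; ids fed to each lookup must stay in that range.
# enc_emb_inp = tf.nn.embedding_lookup(enc_embeddings, enc_train_inputs)
# dec_emb_inp = tf.nn.embedding_lookup(dec_embeddings, dec_train_inputs)
```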
I trained 300-dimensional word embeddings with a 12000-word vocabulary from https://nlp.stanford.edu/projects/nmt/data/iwslt15.en-vi/, but it does not work. Please help me, thank you very much.
Started Training .
InvalidArgumentError                      Traceback (most recent call last)
~/tensorflow-dev/lib/python3.5/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
   1360     try:
->  1361       return fn(*args)
   1362     except errors.OpError as e:

~/tensorflow-dev/lib/python3.5/site-packages/tensorflow/python/client/session.py in _run_fn(session, feed_dict, fetch_list, target_list, options, run_metadata)
   1339       return tf_session.TF_Run(session, options, feed_dict, fetch_list,
->  1340                               target_list, status, run_metadata)
   1341

~/tensorflow-dev/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
    515             compat.as_text(c_api.TF_Message(self.status.status)),
--> 516             c_api.TF_GetCode(self.status.status))
    517         # Delete the underlying status object from memory otherwise it stays alive

InvalidArgumentError: indices[5] = 12036 is not in [0, 12000)
    [[Node: embedding_lookup_16 = Gather[Tindices=DT_INT32, Tparams=DT_FLOAT, _class=["loc:@Const"], validate_indices=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Const, _arg_enc_train_inputs_16_0_154)]]
During handling of the above exception, another exception occurred:
InvalidArgumentError Traceback (most recent call last)
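The InvalidArgumentError above says that a word id (12036) fed to embedding_lookup falls outside the 12000-row embedding matrix, so either the matrix must be sized to cover the full vocabulary, or any out-of-vocabulary id must be mapped to an UNK id before the lookup. A small illustrative guard, assuming the unknown token sits at index 0 of the vocabulary (names below are made up, not from the original code):

```python
import numpy as np

UNK_ID = 0  # assumption: the <unk> token is the first vocabulary entry

def words_to_ids(words, word2id, vocabulary_size):
    """Map words to ids, sending anything missing or out of range to UNK_ID."""
    ids = [word2id.get(w, UNK_ID) for w in words]
    return [i if 0 <= i < vocabulary_size else UNK_ID for i in ids]

# Sanity check before feeding a batch into the graph:
# assert np.max(batch_ids) < vocabulary_size, 'found an id outside the embedding matrix'
```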