tensorflow / skflow

Simplified interface for TensorFlow (mimicking Scikit Learn) for Deep Learning
Apache License 2.0

neural_translation_word.py in examples failed after a few iterations. #118

Closed ohadzad closed 8 years ago

ohadzad commented 8 years ago

When words are added to the vocabulary, each one is assigned an integer id as its mapping. When calling:

X_vocab_processor.fit(Xtrainff)
y_vocab_processor.fit(ytrainff)

all words with a frequency below min_frequency are trimmed, but the remaining words keep their original ids. This means some words end up with an id larger than the length of the vocabulary.

In this line: out.itemset(tuple([i, idx, value]), 1.0)

the code tries to write at an index beyond the allowed range, resulting in:

...
English: We must support initiatives increasing the availability and improving the. French (pred): Nous C'est s'articulent s'articulent s'articulent s'articulent s'articulent s'articulent s'articulent s'articulent s'articulent, French (gold): Nous devons soutenir les initiatives visant à développer les ressources en eau et à améliorer la distribution et la gestion de ce produit très peu abondant dans la région.
[ 401  868  853 1003 1093    3 6961   16 1873    3] [  387   198 41289 41289 41289 41289 41289 41289 41289 41289 41289]
Traceback (most recent call last):
  File "neural_translation_word.py", line 153, in <module>
    translator.fit(X_train, y_train, logdir=PATH)
  File "/usr/local/lib/python2.7/dist-packages/skflow/estimators/base.py", line 235, in fit
    feed_params_fn=self._data_feeder.get_feed_params)
  File "/usr/local/lib/python2.7/dist-packages/skflow/trainer.py", line 114, in train
    feed_dict = feed_dict_fn()
  File "/usr/local/lib/python2.7/dist-packages/skflow/io/data_feeder.py", line 308, in _feed_dict_fn
    out.itemset(tuple([i, idx, value]), 1.0)
IndexError: index 47754 is out of bounds for axis 2 with size 47728
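
The mechanism described above can be reproduced in a few lines of plain Python. This is only a sketch of the suspected behavior, not skflow's actual code: `build_vocab` and `trim_keep_ids` are hypothetical helpers, id 0 is assumed to be reserved for unknown words, and the one-hot width is assumed to equal the trimmed vocabulary size.

```python
from collections import Counter


def build_vocab(tokens):
    """Assign ids in order of first appearance (hypothetical helper).

    Id 0 is assumed to be reserved for unknown words.
    """
    mapping = {}
    for tok in tokens:
        if tok not in mapping:
            mapping[tok] = len(mapping) + 1
    return mapping


def trim_keep_ids(mapping, freq, min_frequency):
    """Drop rare words but leave the surviving ids unchanged (the suspected bug)."""
    return {w: i for w, i in mapping.items() if freq[w] >= min_frequency}


tokens = "a b c a d d e e a d e".split()
freq = Counter(tokens)
mapping = build_vocab(tokens)                            # a:1 b:2 c:3 d:4 e:5
trimmed = trim_keep_ids(mapping, freq, min_frequency=2)  # a:1 d:4 e:5

# One-hot width sized from the trimmed vocabulary (3 words + unknown = 4).
n_classes = len(trimmed) + 1
one_hot = [0.0] * n_classes

error = None
try:
    # Id 5 does not fit into a row of width 4.
    one_hot[trimmed["e"]] = 1.0
except IndexError as exc:
    error = exc
```

Here the largest surviving id (5) is no smaller than the one-hot width (4), which mirrors the index 47754 versus size 47728 mismatch in the traceback.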

Possible fix: after trimming, remap the ids of the remaining words to a contiguous range (1..vocabulary size).
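
A minimal sketch of such a remapping, assuming the trimmed vocabulary is available as a plain dict from word to old id and that id 0 stays reserved for unknown words. `remap_ids` is a hypothetical name, not part of skflow.

```python
def remap_ids(trimmed):
    """Renumber surviving words to 1..len(trimmed), keeping their relative order."""
    ordered = sorted(trimmed, key=trimmed.get)
    return {word: new_id for new_id, word in enumerate(ordered, start=1)}


# Example: ids 2 and 3 were removed by trimming, leaving gaps.
trimmed = {"a": 1, "d": 4, "e": 5}
compact = remap_ids(trimmed)

# Every id now fits into a one-hot row of width len(compact) + 1.
n_classes = len(compact) + 1
```

Any data already transformed with the old ids would need to be transformed again after the remapping, since the ids of the surviving words change.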

ohadzad commented 8 years ago

log.txt