sherjilozair / char-rnn-tensorflow

Multi-layer Recurrent Neural Networks (LSTM, RNN) for character-level language models in Python using Tensorflow
MIT License

FIX: we do not need the vocab_size here #77

Closed madrugado closed 7 years ago

madrugado commented 7 years ago

The vocab_size argument hasn't been used since TensorFlow 0.8; worse, average_across_timesteps ends up being set by this parameter, which is simply wrong

ubergarm commented 7 years ago

I agree, args.vocab_size should not be passed. According to the TensorFlow API r1.0:

```python
tf.contrib.legacy_seq2seq.sequence_loss_by_example(logits, targets, weights, average_across_timesteps=True, softmax_loss_function=None, name=None)
```

It looks like the 4th parameter is now a boolean, average_across_timesteps: "If set, divide the returned cost by the total label weight." So while passing vocab_size there is a bug, the code will still run, since a nonzero integer is truthy.
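To make the failure mode concrete, here is a minimal sketch of why the buggy call still runs. The `sequence_loss_by_example` below is a hypothetical stand-in that only mimics the parameter order of `tf.contrib.legacy_seq2seq.sequence_loss_by_example` in r1.0 (the real TensorFlow function is not imported), and the example vocab size of 65 is an assumed value:

```python
# Stand-in with the same parameter order as the r1.0 API; NOT the real
# TensorFlow implementation, just enough to show the boolean semantics.
def sequence_loss_by_example(logits, targets, weights,
                             average_across_timesteps=True,
                             softmax_loss_function=None, name=None):
    total = sum(l * w for l, w in zip(logits, weights))
    if average_across_timesteps:
        total /= sum(weights)  # divide by the total label weight
    return total

logits, targets, weights = [2.0, 4.0], [0, 1], [1.0, 1.0]

# Buggy call: a vocab_size (assumed here to be 65) lands in the
# average_across_timesteps slot. 65 is truthy, so averaging silently
# stays enabled -- the code runs, but the argument is meaningless.
buggy = sequence_loss_by_example(logits, targets, weights, 65)

# Fixed call: drop vocab_size and rely on the default.
fixed = sequence_loss_by_example(logits, targets, weights)

assert buggy == fixed == 3.0
```

The same truthiness would hide the bug for any positive vocabulary size, which is why nothing crashed despite the wrong argument.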

@sherjilozair I plan to merge this one.

ubergarm commented 7 years ago

Thanks @madrugado, I see you have another patch branch. Hopefully I'll get to it soon. Trying to help tidy up this repo a bit. Feel free to join https://gitter.im/char-rnn-tensorflow/Lobby to say hello if you'd like.