oxford-cs-ml-2015 / practical6

Practical 6: LSTM language models
https://www.cs.ox.ac.uk/people/nando.defreitas/machinelearning/

CharLMMinibatchLoader.lua, Line 39 Question #7

Open techutechu opened 8 years ago

techutechu commented 8 years ago

I have a question about Line 39: ydata[-1] = data[1]

The idea here is to shift the x's (input characters) forward by one character to get the y's (the target characters we want to predict from the given x's). So Line 38 makes perfect sense to me:

ydata:sub(1,-2):copy(data:sub(2,-1)).

However, why do we assign the very first character of the text to the last element of ydata? We certainly do not want to predict the first character of the text, right? It would make sense if the final element of ydata were instead the actual next character in the text, for the case where the number of characters is not divisible by seq_length * batch_size. But it looks like the code just cuts off any remaining characters.
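
To make the behaviour being asked about concrete, here is a minimal Python sketch of how the two lines read, with plain lists standing in for Torch tensors. It is not the repository's code: the helper name make_targets is hypothetical, and the truncation step is an assumption based on the observation above that leftover characters appear to be cut off.

```python
def make_targets(data, batch_size, seq_length):
    """Illustrative only: truncate, shift targets by one, wrap the last target."""
    chunk = batch_size * seq_length
    # Assumed truncation: keep only as many characters as fill whole batches.
    usable = chunk * (len(data) // chunk)
    x = list(data[:usable])
    # x[1:] plays the role of ydata:sub(1,-2):copy(data:sub(2,-1)),
    # so every target is the character that follows its input.
    # x[:1] plays the role of ydata[-1] = data[1] (Lua is 1-indexed),
    # so the final target wraps around to the first input character.
    y = x[1:] + x[:1]
    return x, y


x, y = make_targets("hello world", batch_size=2, seq_length=2)
print("".join(x))  # inputs after truncation
print("".join(y))  # targets: shifted by one, last one wrapped
```

With these toy values the 11 characters are cut to 8, so the inputs are "hello wo" and the targets are "ello woh". The final target is "h", the first character of the text, rather than "r", the character that actually follows the last input.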

Could anyone help me to make sense of this? Maybe I am completely misunderstanding the code. Thank you in advance.