Hi, and thanks for your code.
I would like to ask about one thing. You are taking the max of the tensor `predictions1 = torch.zeros(batch_size, max(decode_lengths), vocab_size)` along the timesteps (dim 1) and not along the words (dim 2), so you are not picking the highest-scoring word from the softmax output as with the `predictions` tensor. Specifically, you are doing `scores_d = scores_d.max(1)[0]`, which returns the max across timesteps for each vocabulary word. Is that intended? Thanks!
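To make the question concrete, here is a minimal sketch of the two reductions. The sizes and the variable names `max_over_time`, `max_over_vocab`, and `best_word` are my own, chosen only for illustration; only `scores_d` and the `(batch_size, timesteps, vocab_size)` layout come from your code.

```python
import torch

# Hypothetical small sizes, chosen only for illustration.
batch_size, max_len, vocab_size = 2, 3, 4

# Same layout as predictions1: (batch_size, timesteps, vocab_size).
scores_d = torch.arange(batch_size * max_len * vocab_size, dtype=torch.float32)
scores_d = scores_d.view(batch_size, max_len, vocab_size)

# What the code does: reduce over dim 1 (timesteps).
# Result has one score per vocabulary word, per batch element.
max_over_time = scores_d.max(1)[0]
print(max_over_time.shape)  # torch.Size([2, 4])

# What I expected: reduce over dim 2 (vocabulary).
# Result has one best word index per timestep, per batch element.
max_over_vocab, best_word = scores_d.max(2)
print(best_word.shape)  # torch.Size([2, 3])
```

So `max(1)[0]` gives a `(batch_size, vocab_size)` tensor, while `max(2)` would give the predicted word at each timestep. If the first one is intended (for example as a pooling over time), then I have simply misunderstood the purpose.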