Open giancds opened 6 years ago
I also faced the same issue with fine-tuning. Is there a solution to this problem?
Well, I changed the code to include a new parameter called `decode`, set to `False` by default. If it is passed as `True` in the fine-tuning script, the missing line is called.
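For anyone reading along, here is a minimal sketch of what such a `decode` flag could look like. It is not the repository's actual `model.py`: the class `TinyRNNModel`, its layer sizes, and the plain `nn.LSTM` are hypothetical stand-ins chosen only to keep the example small. The one line taken from this thread is the `self.decoder(...)` call.

```python
import torch
import torch.nn as nn


class TinyRNNModel(nn.Module):
    """Hypothetical stand-in for the model in model.py (not the repository's code)."""

    def __init__(self, ntoken, ninp, nhid):
        super().__init__()
        self.encoder = nn.Embedding(ntoken, ninp)
        self.rnn = nn.LSTM(ninp, nhid)
        self.decoder = nn.Linear(nhid, ntoken)

    def forward(self, input, hidden=None, decode=False):
        emb = self.encoder(input)                # (seq_len, batch, ninp)
        output, hidden = self.rnn(emb, hidden)   # (seq_len, batch, nhid)
        if decode:
            # The line quoted in this issue: project hidden states onto the vocabulary.
            decoded = self.decoder(
                output.view(output.size(0) * output.size(1), output.size(2))
            )
            # Restore the (seq_len, batch, ntoken) layout for the caller.
            return decoded.view(output.size(0), output.size(1), decoded.size(1)), hidden
        # Default: return raw hidden states, as a loss that decodes internally would expect.
        return output, hidden


ntokens = 50  # toy vocabulary size, assumed for illustration
model = TinyRNNModel(ntoken=ntokens, ninp=8, nhid=16)
data = torch.randint(0, ntokens, (7, 3))  # (seq_len=7, batch=3)

raw, _ = model(data)                    # hidden states, last dimension is nhid
logits, _ = model(data, decode=True)    # vocabulary scores, last dimension is ntokens
output_flat = logits.view(-1, ntokens)  # the reshape from evaluate() now succeeds
```

With `decode=False` the last dimension is the hidden size, so `view(-1, ntokens)` only works by accident if the element count happens to be divisible by `ntokens`. With `decode=True` the last dimension is `ntokens` and the reshape is always valid.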
Thank you. Works for me :+1:
Hello guys, first of all, thanks for sharing your code with us.
I have noticed a problem when running the fine-tuning process: I'm getting the error
`RuntimeError: invalid argument 2: size '[-1 x 10000]' is invalid for input with 227500 elements at /pytorch/aten/src/TH/THStorage.c:37`
and it happens when `output_flat = output.view(-1, ntokens)` is called in the `evaluate` function of `finetune.py`. After some investigation, I have found that the call to `decoded = self.decoder(output.view(output.size(0)*output.size(1), output.size(2)))` has been dropped from `model.py`.
I understand that `SplitCrossEntropyLoss` does this step for us during training but, given that fine-tuning is done with regular cross entropy, shouldn't we put this line back in the code? My apologies if I'm missing something!
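As a quick sanity check on the error message, the arithmetic behind `view(-1, ntokens)` can be sketched in plain Python. The helper `can_view_as_rows` and the value of `rows` are made up for illustration. Only the numbers 227500 and 10000 come from the traceback.

```python
def can_view_as_rows(numel, row_width):
    """Return True when `numel` elements can be reshaped to (-1, row_width)."""
    return numel % row_width == 0


numel = 227500   # element count reported in the RuntimeError
ntokens = 10000  # row width requested by output.view(-1, ntokens)

# The tensor reaching evaluate() cannot be split into rows of ntokens elements.
print(can_view_as_rows(numel, ntokens))  # False

# A decoded tensor has ntokens values per position, so its element count is
# always a multiple of ntokens. `rows` is a hypothetical number of positions.
rows = 350
print(can_view_as_rows(rows * ntokens, ntokens))  # True
```

This is consistent with the output still being undecoded hidden states when it reaches `evaluate`, which is what the missing decoder call would explain.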