Open jing-xu opened 3 years ago
@jason-dai @yangw1234 @qiuxin2012 Any ideas?
How about the loss? Do you need the hidden state to compute the loss?
@qiuxin2012 To my understanding, the hidden state is not used directly to compute the loss, but it still contributes to the training and prediction process.
Currently, both the torch_distributed backend and the bigdl backend support only applications of the form
output = model(data)
For RNN models, we need
output, hidden = model(data, hidden)
in the training and evaluation process.
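
To make the difference between the two call conventions concrete, here is a minimal sketch in plain Python. The functions `feedforward_model`, `rnn_model`, `run_stateless`, and `run_stateful` are hypothetical stand-ins written for illustration. They are not part of the torch_distributed or bigdl backends, and the toy recurrence is not a real RNN cell. The sketch only shows that the second loop has to thread `hidden` from one step to the next, which a loop built around `output = model(data)` cannot do.

```python
# Hypothetical stand-ins for illustration only; not backend or torch APIs.

def feedforward_model(data):
    """Stateless model: the output depends only on the current batch."""
    return [2.0 * x for x in data]


def rnn_model(data, hidden):
    """Toy recurrent model: `hidden` carries information across calls."""
    outputs = []
    for x in data:
        hidden = 0.5 * hidden + x  # assumed toy recurrence
        outputs.append(hidden)
    return outputs, hidden


def run_stateless(model, batches):
    # Loop shape for `output = model(data)`.
    return [model(batch) for batch in batches]


def run_stateful(model, batches, init_hidden=0.0):
    # Loop shape for `output, hidden = model(data, hidden)`:
    # the hidden state returned by one step is fed into the next.
    hidden = init_hidden
    results = []
    for batch in batches:
        output, hidden = model(batch, hidden)
        results.append(output)
    return results, hidden


batches = [[1.0, 2.0], [3.0, 4.0]]
stateless_out = run_stateless(feedforward_model, batches)
stateful_out, final_hidden = run_stateful(rnn_model, batches)
```

In this sketch the loss would still be computed from `output` alone, matching the discussion above. The hidden state only needs to be passed along by whatever drives the loop.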