var labelsVar = Variable.InputVariable(net.Output.Shape, DataType.Double, "labelsVariable", new Axis[] { Axis.DefaultBatchAxis() });
var trainer = Trainer.CreateTrainer(
    net,
    CNTKLib.SquaredError(net, labelsVar),
    CNTKLib.SquaredError(net, labelsVar),
    new Learner[] {
        CNTKLib.AdamLearner(
            new ParameterVector((System.Collections.ICollection)net.Parameters()),
            new TrainingParameterScheduleDouble(1, Learner.IgnoredMinibatchSize),
            new TrainingParameterScheduleDouble(0.971),
            true)
    }
);
double minLoss = double.MaxValue;
for (int i = 1000; i-- > 0;)
{
#pragma warning disable 618
    trainer.TrainMinibatch(new Dictionary<Variable, Value>() {
        { inputVar, Value.CreateBatch(new int[] { 1 }, new double[] { 0, 1, 2, 3, 4, 5 }, DeviceDescriptor.CPUDevice) },
        { labelsVar, Value.CreateBatch(new int[] { 1 }, new double[] { 2, 3, 5, 8, 12, 17 }, DeviceDescriptor.CPUDevice) }
    }, DeviceDescriptor.CPUDevice);
#pragma warning restore 618
    var loss = trainer.PreviousMinibatchLossAverage();
    minLoss = Math.Min(minLoss, loss);
}
I'm trying to implement a recurrent net "from scratch" (without a MinibatchSource). For this purpose the simplest possible net was prepared, and the code above tries to train it to approximate the equation Output = (State = Input + PrevState) + 2.

Value.CreateBatch() was chosen to create the values because the few LSTM training examples found on the internet use it. But it seems the recurrence engine is not applied: each of the inputVar values 0,1,2,3,4,5 is treated as a new sequence with its own hidden/previous state (which is 0 for all of them). At the inference stage, Value.CreateSequence should be used for the input values to enable the recurrence engine, because Value.CreateBatch leads to the input vector being treated as several separate sequences of length 1. However, using Value.CreateSequence at the training stage raises an exception.

What is the correct way to train a recurrent net from scratch (without a MinibatchSource), so that the input values are treated as one sequence?
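To make the target concrete, here is a small plain-Python sketch (no CNTK involved; the unroll is hand-coded purely for illustration) of why the labels only match when the six inputs are processed as one sequence. With a single carried state, the recurrence Output = (State = Input + PrevState) + 2 reproduces the labels 2, 3, 5, 8, 12, 17; if each input is its own length-1 sequence with the state reset to 0 (the behavior described for Value.CreateBatch), the outputs would instead be 2, 3, 4, 5, 6, 7.

```python
def unroll(inputs, reset_state_each_step):
    """Apply Output = (State = Input + PrevState) + 2 over the inputs.

    reset_state_each_step=True mimics each value being its own
    length-1 sequence (previous state always 0); False mimics one
    sequence where the state is carried across steps.
    """
    state = 0.0
    outputs = []
    for x in inputs:
        if reset_state_each_step:
            state = 0.0            # fresh hidden state per "sequence"
        state = x + state          # State = Input + PrevState
        outputs.append(state + 2)  # Output = State + 2
    return outputs

inputs = [0, 1, 2, 3, 4, 5]
print(unroll(inputs, reset_state_each_step=False))  # one sequence: [2.0, 3.0, 5.0, 8.0, 12.0, 17.0]
print(unroll(inputs, reset_state_each_step=True))   # six length-1 sequences: [2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
```

So a net trained on six independent length-1 sequences cannot fit these labels at all, which is consistent with the loss behavior described above.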