Open woskii opened 6 years ago
The output of a node in the computational graph depends on what operation it represents and what inputs it was given. These inputs can be constants, input variables (raw data), or the outputs of previous nodes/layers, and so on. The Trainer takes the model, loss function, metric, etc., which are all defined in terms of other nodes/layers. You build the graph as you define operations. Here's an example:
```python
import cntk
from cntk.layers import Sequential, Dense

x = cntk.input_variable((2,), name='x')
y = cntk.input_variable((1,), name='y')
z = Sequential([
    Dense(3, name='dense_1'),
    Dense(1, name='dense_2')
])(x)
loss = cntk.squared_error(z, y, name='loss')
graph = cntk.logging.plot(loss, 'graph.png')
```
The arrows in the plot show the flow of information. W and b are the weights and bias of each dense layer. The input x is used to compute the output of the first dense layer; that output feeds the second dense layer, producing the model output z; and z, together with the target y, is used to compute the loss as their squared difference. Combined with a specified learner, gradients are computed and correctly propagated backwards to the appropriate nodes.
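This is why passing only the root node (the loss) to the Trainer is enough: every node holds references to its inputs, so the entire graph, including every layer's parameters, is reachable by walking backwards from the root. Here's a minimal pure-Python sketch of that idea (illustrative only, not CNTK's actual source; the `Node` class and helper are made up for this example):

```python
# Illustrative sketch: a graph node keeps references to its input
# nodes, so a single root node implicitly contains the whole graph.

class Node:
    def __init__(self, name, inputs=(), parameters=()):
        self.name = name
        self.inputs = list(inputs)          # upstream nodes
        self.parameters = list(parameters)  # trainable tensors owned by this node

def collect_parameters(root):
    """Walk the graph backwards from `root`, gathering every parameter."""
    seen, params, stack = set(), [], [root]
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        params.extend(node.parameters)
        stack.extend(node.inputs)
    return params

# Mirror the example above: x -> dense_1 -> dense_2 -> loss
x = Node('x')
dense_1 = Node('dense_1', inputs=[x], parameters=['W1', 'b1'])
dense_2 = Node('dense_2', inputs=[dense_1], parameters=['W2', 'b2'])
y = Node('y')
loss = Node('loss', inputs=[dense_2, y])

print(sorted(collect_parameters(loss)))  # ['W1', 'W2', 'b1', 'b2']
```

CNTK exposes the same reachability directly: `loss.parameters` on a real CNTK Function returns all trainable parameters in the graph rooted at that node, which is what the Trainer uses.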
In the Python API, when creating a Trainer instance, we only need to pass it the output layer operation. I read an article that said the prior layers will be trained using the computational graph structure, but I can't find where the prior layers add themselves to the graph in the source code. Can someone tell me? Any help would be appreciated.