microsoft / CNTK

Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit
https://docs.microsoft.com/cognitive-toolkit/

How would a layer add itself to the computational graph structure in CNTK? #2774

Open woskii opened 6 years ago

woskii commented 6 years ago

In the Python API, when creating a Trainer instance, we only need to pass in the output layer's operation. I read an article saying that the prior layers will be trained through the computational graph structure, but I can't find where the prior layers add themselves to the graph in the source code. Can someone tell me? Any help would be appreciated.

frankibem commented 6 years ago

The output of a node in the computational graph depends on the operation it represents and the inputs it was given. Those inputs can be constants, input variables (raw data), or the outputs of previous nodes/layers. Trainer takes the model, loss function, and metric, all of which are defined in terms of other nodes/layers, so the graph is already fully connected by the time you construct the Trainer. In other words, you build the graph as you define operations; no layer has to register itself anywhere. Here's an example:

import cntk
from cntk.layers import Sequential, Dense

# Raw data enters the graph through input variables.
x = cntk.input_variable((2,), name='x')
y = cntk.input_variable((1,), name='y')

# Each Dense layer becomes a node whose input is the previous node's output.
z = Sequential([
    Dense(3, name='dense_1'),
    Dense(1, name='dense_2')
])(x)

loss = cntk.squared_error(z, y, name='loss')

# plot() walks the graph from the given root node and writes it to an image
# (rendering to PNG requires pydot/graphviz to be installed).
cntk.logging.plot(loss, 'graph.png')

The arrows in the plot show the flow of information. W and b are the weights and bias of each dense layer. The input x is used to compute the output of the first dense layer; that output feeds the second dense layer, which produces the model output z; and z, together with the target y, is used to compute the loss as their squared difference. Combined with a specified learner, gradients are computed and correctly propagated backwards to the appropriate nodes.
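
For completeness, here is a minimal sketch of the training side, continuing from the example above (the learning rate, dummy data, and the choice of sgd are illustrative, not the only option). Note that the Trainer only receives the root nodes; every prior layer is reachable by walking the graph backwards from them, which is also how you can discover the layers yourself:

import numpy as np

# z.parameters already collects the W and b of both dense layers, because
# they are reachable from the root node z.
lr = cntk.learning_rate_schedule(0.1, cntk.UnitType.minibatch)
learner = cntk.sgd(z.parameters, lr)
trainer = cntk.Trainer(z, loss, [learner])

# One update step on random dummy data, just to show the mechanics.
features = np.random.randn(8, 2).astype(np.float32)
targets = np.random.randn(8, 1).astype(np.float32)
trainer.train_minibatch({x: features, y: targets})

# The prior layers never register themselves anywhere; they are found by
# traversing the graph from the root:
print([(p.name, p.shape) for p in loss.parameters])
print(cntk.logging.find_all_with_name(loss, 'dense_1'))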