ddbourgin / numpy-ml

Machine learning, in numpy
https://numpy-ml.readthedocs.io/
GNU General Public License v3.0
15.3k stars 3.71k forks

Usage to build CNN Network #14

Open WuZhuoran opened 5 years ago

WuZhuoran commented 5 years ago

Is there any documentation on how to build a network?

I want to try implementing a simple network on, for example, the MNIST dataset.

If there is no documentation, I think we could write some. For example, in Keras a model can be built like this:

import keras
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))
ddbourgin commented 5 years ago

Unfortunately there really is no good high-level documentation at this point. This is on my TODO list, but is likely to take some time as there's a lot to document ;)

For your particular case, there are two examples of how you might go about building a full network in the models section.

In general, models using this code are going to be quite slow in comparison to any keras/tf/torch/theano implementations - the code here is optimized for readability over speed / efficiency. That said, I think it's a great idea to have some simple examples to show how the NN code corresponds to other packages.

ddbourgin commented 5 years ago

In general, if you want to implement a model, you'll probably want the following methods as a bare-minimum:

_build_network(self, ...):
    # initialize the network layers and store them within an 
    # OrderedDict so you can reliably iterate over them during the 
    # forward / backward passes

forward(self, X):
    # perform a forward pass. this is where the specific model architecture comes
    # into play, since you'll need to define how outputs from early layers flow to 
    # inputs of subsequent layers

backward(self, dLdy):
    # perform a backward pass. again, the route the gradients take through the network
    # will be specific to the particular model architecture
WuZhuoran commented 5 years ago

So numpy-ml basically follows the PyTorch style of building a model, right?

ddbourgin commented 5 years ago

Yeah, more or less. The major difference is that this code won't have a built-in backward method - you have to implement it yourself for each model.
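Concretely: where PyTorch's `loss.backward()` computes gradients via autograd, here you derive the gradient of the loss by hand and push it through your own `backward`. A minimal sketch with one hand-written linear layer (all names here are illustrative, not numpy-ml's API):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(0, 0.1, (3, 1))             # a single linear layer's weights
X = rng.normal(size=(32, 3))
y = X @ np.array([[1.0], [-2.0], [0.5]])   # synthetic regression targets

for _ in range(500):
    pred = X @ W                    # forward pass
    dLdy = 2 * (pred - y) / len(X)  # gradient of MSE loss w.r.t. the output,
                                    # derived by hand (no autograd)
    dLdW = X.T @ dLdy               # backward pass through the layer
    W -= 0.1 * dLdW                 # SGD step

# after training, W should be close to the true coefficients
```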