This project was inspired by the Deep Learning From Scratch: Theory and Implementation blog post by Daniel Sabinasz. It implements an API for creating computational graphs and neural nets from scratch in C++, using the Eigen library.
Currently, it only includes a gradient descent optimizer, which can minimize any user-defined loss function. Dynamically sized matrices are used everywhere; according to the Eigen documentation, the overhead of dynamic sizing is small for large matrices.
This project was created with educational purposes in mind and does not offer the most efficient way of implementing neural nets. It is best suited for beginners in Machine Learning and Deep Learning: re-implementing algorithms from scratch is a good way to deepen your understanding.
You will need a C++ compiler, CMake, and the Eigen library. Install the build essentials (on Debian/Ubuntu):

```sh
sudo apt-get install build-essential
```

Eigen can be installed by running `cmake` and `make install` from its source tree; since it is a header-only library, you can also skip the install and simply pass its include path to the compiler:

```sh
g++ -I /path/to/eigen/ my_program.cpp -o my_program
```

Then build and run the project:

```sh
mkdir build && cd build
cmake .. && make
./nn
```
Classes are templated, but they are separated into `.h` and `.tpp` files to improve code readability. The relevant `.tpp` file is included at the end of each class declaration.
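For illustration, here is what that split typically looks like. The `Foo` class below is a hypothetical example, not an actual project header:

```cpp
// include/foo.h -- hypothetical example of the .h/.tpp split
#pragma once

template <typename T>
class Foo
{
public:
    T twice(const T &x) const; // declared here, defined in the .tpp
};

// Pull the definitions in at the end of the declaration so they are
// visible to every translation unit that instantiates Foo<T>.
#include "../src/foo.tpp"
```

```cpp
// src/foo.tpp -- out-of-line definitions for the template declared in foo.h
template <typename T>
T Foo<T>::twice(const T &x) const
{
    return x + x;
}
```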
Project layout:

```
.
├── CMakeLists.txt
├── LICENSE
├── README.md
├── data
│   ├── dev
│   └── test
├── image
│   ├── out1.gif
│   └── out2.gif
├── include
│   ├── NN.h
│   ├── gradientDescentOptimizer.h
│   ├── graph.h
│   ├── lockingPtr.h
│   ├── node.h
│   ├── operation.h
│   ├── optimization.h
│   └── session.h
└── src
    ├── NN.tpp
    ├── gradientDescentOptimizer.tpp
    ├── graph.tpp
    ├── lockingPtr.tpp
    ├── main.cpp
    ├── node.tpp
    ├── operation.tpp
    └── session.tpp
```
Below is general information about the class structure of the API. For more detailed information about the interface, see the header files.
- `node.h`
  - `BaseNode`: the non-templated base class for all nodes in the computational graph. Downcasting a `BaseNode` to `Node<T>` is safe as long as `T` is known, and it is the easiest approach to implement.
  - `Node<T> : BaseNode`: implements the `compute()` and `gradient()` methods of `BaseNode`, as well as `clearGrads()` of the base class.
  - `Variable<T> : Node<T>`: a node holding a learnable value, such as weights or biases.
  - `Placeholder<T> : Node<T>`: a node whose value is fed in at run time.
- `operation.h`
  - `Operation<T> : Node<T>`: base class for all operations.
  - `UnaryOperation<T> : Operation<T>`: base class for single-input operations.
  - `BinaryOperation<T> : Operation<T>`: base class for two-input operations.
  - `Add<T,T1,T2>`: performs addition, broadcasting when necessary.
  - `MatMultiply<T,T1,T2>`: performs matrix multiplication.
  - `Dot<T,T1,T2>`: performs the dot product.
  - `Multiply<T,T1,T2>`: performs the element-wise product. (These four inherit from `BinaryOperation<T>` and override its `compute()` and `gradient()`.)
  - `Negative<T>`: performs element-wise negation.
  - `Log<T>`: performs the element-wise log.
  - `Sigmoid<T>`: performs the element-wise sigmoid operation.
  - `Sum<T>`: performs a reduce sum along the given axis. (These four inherit from `UnaryOperation<T>` and override its `compute()` and `gradient()`.)
  - `Minimizer<T> : Operation<T>`: has a `compute()` that performs the gradient update.
- `graph.h`: declares the class that stores the nodes of the computational graph.
- `session.h`: declares the session class. Its `Run` method takes a pointer to a node in the computational graph and runs the relevant operations by calling `compute()` on each node; it also feeds the data to the placeholders.
- `gradientDescentOptimizer.h`: declares `GradientDescentOptimizer`, whose `minimize()` creates the `Minimizer<T>` operation for a given loss node.
- `NN.h`: the user-facing API; wraps the `graph` methods and the `session`.
- `lockingPtr.h`: declares `lockingPtr`, a pointer wrapper that synchronizes access to the object it points to; see the header for details.
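For intuition, here is a minimal, hypothetical sketch of the type-erasure pattern this hierarchy uses. Names and bodies are simplified stand-ins; the real interfaces are in `node.h`:

```cpp
#include <iostream>
#include <utility>

// Non-templated base so that nodes of different scalar types can be
// stored together in one graph (e.g. in a std::vector<BaseNode *>).
class BaseNode
{
public:
    virtual ~BaseNode() = default;
    virtual void compute() = 0;    // forward pass
    virtual void gradient() = 0;   // backward pass
    virtual void clearGrads() = 0; // reset accumulated gradients
};

// Templated layer: knows the scalar type and holds the typed value.
template <typename T>
class Node : public BaseNode
{
public:
    void clearGrads() override { _grad = T{}; }
    const T &value() const { return _value; }

protected:
    T _value{};
    T _grad{};
};

// A learnable leaf node; its value is set directly rather than computed.
template <typename T>
class Variable : public Node<T>
{
public:
    explicit Variable(T v) { this->_value = std::move(v); }
    void compute() override {}  // value is already set
    void gradient() override {} // gradients are accumulated by consumers
};

int main()
{
    Variable<double> w(0.5);
    BaseNode *node = &w; // heterogeneous storage via the non-templated base
    node->clearGrads();  // virtual dispatch back to the typed node
    std::cout << w.value() << '\n'; // prints 0.5
}
```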
Explanation of the example in main.cpp
Include NN.h in your file:

```cpp
#include "../include/NN.h"
```
Create an alias for the dynamic Eigen matrix type:

```cpp
typedef Eigen::Matrix<long double, Eigen::Dynamic, Eigen::Dynamic> matxxf;
```
Then instantiate the NN class:

```cpp
NN nn = NN();
```
Define the number of steps for the optimization:

```cpp
int const STEPS = 10000;
```
Use `nn.placeholder` for constants and input data, and `nn.variable` for learnable variables; see main.cpp for an example. (Use `long double` if you want to check the gradient calculations numerically.)
```cpp
// matrix of scalar 1
Eigen::Matrix<long double, 1, 1> One;
One << 1;
// cast to dynamic matrix
matxxf n = One;
BaseNode *one = nn.placeholder<long double>("one");
// Bias (m*1)
Eigen::Matrix<long double, 1, 1> B;
B << 0.1;
BaseNode *b = nn.variable<long double>(std::move(B));
// Weights (nh*nx)
Eigen::Matrix<long double, 1, 2> W;
W << 0.1, 0.2;
BaseNode *w = nn.variable<long double>(std::move(W));
// Training data (nx*m)
Eigen::Matrix<long double, 2, 1> X;
X << 3, 2;
// cast to dynamic matrix
matxxf x = X;
// Labels (1*m)
Eigen::Matrix<long double, 1, 1> Y;
Y << 1;
// cast to dynamic matrix
matxxf yy = Y;
BaseNode *y = nn.placeholder<long double>("Y");
```
Create the activation function:

```cpp
// activation unit sigmoid(w^T*x+b) (nh*m)
BaseNode *a = nn.sigmoid<long double>(nn.add<long double>(nn.matmultiply<long double>(w, nn.placeholder<long double>("X")), b));
```
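For reference, this node computes the standard logistic activation on the affine unit, where $W$ is the 1×2 weight matrix and $x$ the 2×1 input:

$$a = \sigma(Wx + b), \qquad \sigma(z) = \frac{1}{1 + e^{-z}}$$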
Create the loss function:

```cpp
// intermediate loss function
// create loss function -(y*log(a)+(1-y)*log(1-a))
BaseNode *L = nn.negative<long double>(nn.add<long double>(nn.matmultiply<long double>(y, nn.log<long double>(a)), nn.matmultiply<long double>(nn.add<long double>(one, nn.negative<long double>(y)), nn.log<long double>(nn.add<long double>(one, nn.negative<long double>(a))))));
```
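The nested calls above spell out the usual binary cross-entropy in the graph's primitive operations:

$$L = -\bigl(y \log a + (1 - y)\log(1 - a)\bigr)$$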
Create the optimization operation:

```cpp
// Create gradient descent optimization
auto opt = GradientDescentOptimizer(0.01).minimize<matxxf>(L);
```
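Each run of `opt` applies one step of plain gradient descent to every variable, here with learning rate $\alpha = 0.01$:

$$\theta \leftarrow \theta - \alpha \, \nabla_{\theta} L$$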
Create an std::unordered_map to feed the data to the placeholders:

```cpp
// Create a map to feed data to the placeholders (i.e. "X" = X)
std::unordered_map<std::string, matxxf *> feed = {};
feed["X"] = &x;
feed["Y"] = &yy;
feed["one"] = &n;
```
Use nn.run to run the operations:

```cpp
// Run operation
nn.run<long double>(L, feed);
```
Create a loop for the optimization and run the optimization operations:

```cpp
for (int i = 1; i < STEPS; i++)
{
    nn.run<long double>(&opt, feed);
    nn.run<long double>(L, feed);
    if (i % 1000 == 0)
    {
        std::cout << "Step " << i << std::endl;
        std::cout << "Activation: " << *(a->getValue<matxxf>()) << std::endl;
        std::cout << "loss: " << *(L->getValue<matxxf>()) << std::endl;
        std::cout << "Weights: " << *(w->getValue<matxxf>()) << std::endl;
        std::cout << "Bias: " << *(b->getValue<matxxf>()) << std::endl;
    }
}
```
Use nn.checkAllGradient() to check whether the gradient calculations are correct; it compares the analytic gradients with numerically obtained values. For best results, make sure the learning rate is set to zero. See the implementation for further information:

```cpp
/* Check gradients -- Make sure to set the learning rate to zero before checking!! -- */
nn.checkAllGradient(L, feed);
```
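Numerical gradient checking of this kind perturbs each parameter and compares the analytic gradient against a finite-difference estimate; a central difference such as the one below is the usual choice (the exact scheme used here is in the implementation):

$$\frac{\partial L}{\partial \theta_i} \approx \frac{L(\theta_i + \epsilon) - L(\theta_i - \epsilon)}{2\epsilon}$$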
1 - To learn about the matrix calculus required for neural nets, see explained.ai by Terence Parr and Jeremy Howard
2 - To learn more about deep learning, read the book *Deep Learning* by Ian Goodfellow, Yoshua Bengio, and Aaron Courville
3 - Deep Learning From Scratch: Theory and Implementation, a blog post by Daniel Sabinasz
4 - To learn more about the Eigen library, see its documentation
5 - To learn the basics of C++ and get started, see the Udacity C++ Nanodegree