
EigenFlow

This project was inspired by the blog post Deep Learning From Scratch: Theory and Implementation by Daniel Sabinasz. It implements an API for creating computational graphs and neural nets from scratch in C++ using the Eigen library.

Currently it includes only a gradient descent optimizer, which can minimize any user-defined loss function. Dynamically sized matrices are used everywhere; however, according to the Eigen library documentation, the overhead of dynamic sizing is small for large matrices, so this remains reasonably efficient.

This project was created for educational purposes and does not offer the most efficient way of implementing neural nets. It is best suited for beginners in Machine Learning and Deep Learning; re-implementing algorithms from scratch is a good way to deepen your understanding.

Dependencies for Running

Basic Build Instructions

  1. Clone this repo.
  2. Make a build directory in the top level directory: mkdir build && cd build
  3. Compile: cmake .. && make
  4. Run it: ./nn.

File Structure

Classes are templated, but they are split into ".h" and ".tpp" files to improve code readability. The relevant .tpp file is included at the end of each class declaration (a minimal sketch of this pattern follows the file tree below).

.
├── CMakeLists.txt
├── LICENSE
├── README.md
├── data
│   ├── dev
│   └── test
├── image
│   ├── out1.gif
│   └── out2.gif
├── include
│   ├── NN.h
│   ├── gradientDescentOptimizer.h
│   ├── graph.h
│   ├── lockingPtr.h
│   ├── node.h
│   ├── operation.h
│   ├── optimization.h
│   └── session.h
└── src
    ├── NN.tpp
    ├── gradientDescentOptimizer.tpp
    ├── graph.tpp
    ├── lockingPtr.tpp
    ├── main.cpp
    ├── node.tpp
    ├── operation.tpp
    └── session.tpp
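
A minimal sketch of the header/.tpp pattern is shown below. The class and member names here are illustrative only, not the actual contents of the project's headers:

// example.h -- illustrative sketch of the .h/.tpp split used in this repo
#ifndef EXAMPLE_H
#define EXAMPLE_H

template &lt;typename T&gt;
class Example
{
public:
    void setValue(T value);
    T getValue() const;

private:
    T _value;
};

// template definitions are pulled in at the end of the declaration
#include "../src/example.tpp"

#endif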

Class Structure

This section gives a general overview of the class structure of the API. For more detailed information about the interface, see the header files.

How to Use

Explanation of the example in main.cpp

Include NN.h in your file:

 #include "../include/NN.h" 

Create an alias for the dynamic Eigen matrix type (the long double counterpart of Eigen::MatrixXd):

typedef Eigen::Matrix<long double, Eigen::Dynamic, Eigen::Dynamic> matxxf;

Then initialize a neural network with the NN class:

NN nn = NN(); 

Define the number of optimization steps:

int const STEPS = 10000; 

Use nn.placeholder for constants and input data, and nn.variable for learnable variables; see main.cpp for an example:

(Use long double if you want to check the gradient calculations numerically.)

// matrix of scalar 1
Eigen::Matrix<long double, 1, 1> One;
One << 1; 

// cast to dynamic matrix
matxxf n = One;
BaseNode *one = nn.placeholder<long double>("one");

// Bias (m*1)
Eigen::Matrix<long double, 1, 1> B;
B << 0.1;
BaseNode *b = nn.variable<long double>(std::move(B));

// Weights (nh*nx)
Eigen::Matrix<long double, 1, 2> W;
W << 0.1, 0.2;
BaseNode *w = nn.variable<long double>(std::move(W));

// Training Data (nx*m)
Eigen::Matrix<long double, 2, 1> X;
X << 3, 2;
// cast to dynamic matrix
matxxf x = X;

// Labels (1*m)
Eigen::Matrix<long double, 1, 1> Y;
Y << 1;
// cast to dynamic matrix
matxxf yy = Y;
BaseNode *y = nn.placeholder<long double>("Y");

Create the activation function:

// activation unit sigmoid(w^T*x+b) (nh*m)
BaseNode *a = nn.sigmoid<long double>(nn.add<long double>(nn.matmultiply<long double>(w, nn.placeholder<long double>("X")), b));

Create the loss function:

// intermediate loss function
// create loss function -(y*log(a)+(1-y)*log(1-a))
BaseNode *L = nn.negative<long double>(nn.add<long double>(nn.matmultiply<long double>(y, nn.log<long double>(a)), nn.matmultiply<long double>(nn.add<long double>(one, nn.negative<long double>(y)), nn.log<long double>(nn.add<long double>(one, nn.negative<long double>(a))))));

Create optimization operation:

 // Create gradient descent optimization
    auto opt = GradientDescentOptimizer(0.01).minimize<matxxf>(L);

Create an unordered_map to feed the data for placeholders:

// Create a map to feed data to the placeholders (i.e. "X" = X)
    std::unordered_map<std::string, matxxf *> feed = {};
    feed["X"] = &x;
    feed["Y"] = &yy;
    feed["one"] = &n;

Use nn.run to run the operations:

// Run the operation
nn.run<long double>(L, feed);

Create a loop for optimization and run the optimization operations:

for (int i = 1; i < STEPS; i++)
    {
        nn.run<long double>(&opt, feed);
        nn.run<long double>(L, feed);
        if (i % 1000 == 0)
        {
            std::cout << "Step " << i << std::endl;
            std::cout << "Activation: " << *(a->getValue<matxxf>()) << std::endl;
            std::cout << "loss: " << *(L->getValue<matxxf>()) << std::endl;
            std::cout << "Weights: " << *(w->getValue<matxxf>()) << std::endl;
            std::cout << "Bias: " << *(b->getValue<matxxf>()) << std::endl;
        }
    }

Use nn.checkAllGradient() to see whether the gradient calculations are correct. It compares the gradients with numerically obtained values. For best results, make sure the learning rate is set to zero. See the implementation for further information:

 /* Check gradients -- Make sure to set learning rate to zero before checking!! -- */
    nn.checkAllGradient(L, feed);
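
For reference, the numerical check is conceptually a finite-difference approximation of the gradient. The snippet below is only an illustration of that idea for a single scalar parameter; it is not the library's implementation, and the function name and epsilon value are assumptions:

#include <functional>

// Central-difference approximation of dL/dw for a scalar parameter w,
// where loss is any callable that evaluates the loss at a given w.
long double numericalGradient(std::function<long double(long double)> loss,
                              long double w, long double eps = 1e-6L)
{
    return (loss(w + eps) - loss(w - eps)) / (2 * eps);
}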

Expected Output

(See image/out1.gif and image/out2.gif for animations of the expected optimization output.)

References and Useful Resources

1 - To learn about the matrix calculus required for neural nets, see explained.ai by Terence Parr and Jeremy Howard

2 - To learn more about Deep Learning, read the book "Deep Learning" by Ian Goodfellow, Yoshua Bengio, and Aaron Courville

3 - The blog post Deep Learning From Scratch: Theory and Implementation by Daniel Sabinasz

4 - To learn more about the Eigen library, see the documentation

5 - To learn the basics of C++ and get started, see the Udacity C++ Nanodegree