novak-99 / MLPP

A library created to revitalize C++ as a machine learning front end. Per aspera ad astra.
MIT License
1.08k stars 155 forks

Is MLPP reinventing the wheel? What would it be used for? #6

Open towermitten opened 2 years ago

towermitten commented 2 years ago

Great work! Very interesting!

In the README, you say that MLPP serves to revitalize C++ as a machine learning front end. How does MLPP separate itself from the PyTorch C++ API? If you don't mind me asking, why not build wrappers around the already open-source and highly optimized PyTorch C++ code?

Thanks!

hungnphan commented 2 years ago

In the README, he also states:

The intent with this library is for it to act as a crossroad between low-level developers and machine learning engineers.

Personally, I think the author made this repo to reimplement ML algorithms in general, beyond neural networks. The key point is that he explicitly reimplemented them in pure C++, while LibTorch (the PyTorch C++ API) partially inherits from the Caffe2 framework and is extended with the apex repo from NVIDIA (a bunch of dependencies).

Anyway, this is great work, and it is interesting for both educational and research purposes for C++ programmers and enthusiasts.

towermitten commented 2 years ago

Thank you for your thoughts, @hungnphan. However, my intent with this issue is not to poll people's opinions on MLPP but to understand why MLPP was not built on top of PyTorch C++.

Your relevant point is that PyTorch C++ is built on top of external libraries, which is a purported drawback. However, the PyTorch C++ dependencies can be installed directly through conda without root privileges, so it's a non-issue.

novak-99 commented 2 years ago

Hello @towermitten,

Thank you for the kind sentiment. Regarding the distinction between PyTorch’s C++ API and ML++, the main differences lie in user support and syntax.

In terms of support for C++, PyTorch has glaring issues: various major functions that exist in the Python API are absent, the documentation is lacking and does not contain as many examples, and not many people are willing to contribute. This is of course understandable, as machine learning is primarily dominated by Python. ML++, meanwhile, is specifically designed to cater to C++ developers and low-level engineers, so emphasis will be given to C++ as opposed to other languages, such as Python.

Second, another significant drawback of the PyTorch C++ API is its complexity. The following is a standard end-to-end implementation of an artificial neural network in the PyTorch C++ API, via https://pytorch.org/cppdocs/frontend.html:

#include <torch/torch.h>

// Define a new Module.
struct Net : torch::nn::Module {
  Net() {
    // Construct and register three Linear submodules.
    fc1 = register_module("fc1", torch::nn::Linear(784, 64));
    fc2 = register_module("fc2", torch::nn::Linear(64, 32));
    fc3 = register_module("fc3", torch::nn::Linear(32, 10));
  }

  // Implement the Net's algorithm.
  torch::Tensor forward(torch::Tensor x) {
    // Use one of many tensor manipulation functions.
    x = torch::relu(fc1->forward(x.reshape({x.size(0), 784})));
    x = torch::dropout(x, /*p=*/0.5, /*train=*/is_training());
    x = torch::relu(fc2->forward(x));
    x = torch::log_softmax(fc3->forward(x), /*dim=*/1);
    return x;
  }

  // Use one of many "standard library" modules.
  torch::nn::Linear fc1{nullptr}, fc2{nullptr}, fc3{nullptr};
};

int main() {
  // Create a new Net.
  auto net = std::make_shared<Net>();

  // Create a multi-threaded data loader for the MNIST dataset.
  auto data_loader = torch::data::make_data_loader(
      torch::data::datasets::MNIST("./data").map(
          torch::data::transforms::Stack<>()),
      /*batch_size=*/64);

  // Instantiate an SGD optimization algorithm to update our Net's parameters.
  torch::optim::SGD optimizer(net->parameters(), /*lr=*/0.01);

  for (size_t epoch = 1; epoch <= 10; ++epoch) {
    size_t batch_index = 0;
    // Iterate the data loader to yield batches from the dataset.
    for (auto& batch : *data_loader) {
      // Reset gradients.
      optimizer.zero_grad();
      // Execute the model on the input data.
      torch::Tensor prediction = net->forward(batch.data);
      // Compute a loss value to judge the prediction of our model.
      torch::Tensor loss = torch::nll_loss(prediction, batch.target);
      // Compute gradients of the loss w.r.t. the parameters of our model.
      loss.backward();
      // Update the parameters based on the calculated gradients.
      optimizer.step();
      // Output the loss and checkpoint every 100 batches.
      if (++batch_index % 100 == 0) {
        std::cout << "Epoch: " << epoch << " | Batch: " << batch_index
                  << " | Loss: " << loss.item<float>() << std::endl;
        // Serialize your model periodically as a checkpoint.
        torch::save(net, "net.pt");
      }
    }
  }
}

The following is a very similar implementation, except in the ML++ framework:

int main() {
    // inputSet and outputSet are assumed to be loaded elsewhere.
    MANN ann(inputSet, outputSet); // Input size is implicitly 784; final output size is implicitly 10
    ann.addLayer(64, "RELU"); // Hidden unit count, activation function
    ann.addLayer(32, "RELU");

    ann.addOutputLayer("Softmax", "CrossEntropy"); // Output activation, loss function

    ann.MBGD(0.01, 10, 64, 1); // Learning rate, epoch count, batch size, UI panel
}

As can be seen, implementing ML algorithms in C++ with existing APIs is time-consuming and verbose (the second version is far shorter). ML++ tries to alleviate this by offering simpler implementations.

Regarding your second question on why I don’t simply utilize PyTorch implementations, as @hungnphan has pointed out, using tools such as PyTorch supported matrix operations, neural network packages etc. would abstract a lot of the key details of various machine learning algorithms, and I was interested in getting into the particulars of them.

Because of this, the speed of execution is an issue, and to try to ameliorate this, I now am looking into vectorization strategies.
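To make "vectorization strategies" concrete, here is a minimal sketch of one common CPU-side approach: store the matrix in a single contiguous row-major buffer and order the loops so the innermost one walks memory with unit stride, which optimizing compilers can often auto-vectorize. The `Matrix` type and `matmul` function are hypothetical names chosen for illustration; they are not taken from ML++, whose own containers may look different.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical row-major matrix type (an assumption, not ML++'s container).
struct Matrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<double> data; // contiguous storage: element (i, j) lives at i * cols + j

    Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}

    double& at(std::size_t i, std::size_t j) { return data[i * cols + j]; }
    double at(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Computes C = A * B using i-k-j loop order.
// The innermost loop reads one row of B and updates one row of C,
// both with unit stride, which is the access pattern SIMD units prefer.
Matrix matmul(const Matrix& a, const Matrix& b) {
    assert(a.cols == b.rows);
    Matrix c(a.rows, b.cols);
    for (std::size_t i = 0; i < a.rows; ++i) {
        double* cRow = c.data.data() + i * c.cols;
        for (std::size_t k = 0; k < a.cols; ++k) {
            const double aik = a.data[i * a.cols + k];
            const double* bRow = b.data.data() + k * b.cols;
            for (std::size_t j = 0; j < b.cols; ++j) {
                cRow[j] += aik * bRow[j];
            }
        }
    }
    return c;
}
```

Whether the inner loop is actually vectorized depends on the compiler and flags (for example, an optimization level such as `-O3` with GCC or Clang), so the benefit should be measured rather than assumed.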

towermitten commented 2 years ago

Lack of support in PyTorch C++

Some of the issues you raise about PyTorch C++ are:

I agree with all of these specific points, but I feel that all of them could be addressed just as easily by building wrappers around PyTorch C++, without sacrificing performance. This is exactly what PyTorch Lightning does (for Python).
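As a rough illustration of the wrapper idea, the sketch below shows a facade that accepts ML++-style `addLayer` calls and records them as layer specifications. `SimpleNet` and `LayerSpec` are hypothetical names, not classes from ML++ or LibTorch, and the backend step is deliberately left out: in a real wrapper, each (input, output) pair returned by `shapes()` would presumably be turned into something like `torch::nn::Linear(in, out)`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// One recorded layer: its width and the name of its activation.
struct LayerSpec {
    std::size_t units;
    std::string activation;
};

// Hypothetical facade (an assumption for illustration only).
// It stores the network description; a backend would build the real modules.
class SimpleNet {
public:
    explicit SimpleNet(std::size_t inputSize) : inputSize_(inputSize) {}

    // Records a layer and returns *this so calls can be chained.
    SimpleNet& addLayer(std::size_t units, std::string activation) {
        assert(units > 0);
        layers_.push_back({units, std::move(activation)});
        return *this;
    }

    // Returns the (input size, output size) pair of every dense layer,
    // chaining each layer's output into the next layer's input.
    std::vector<std::pair<std::size_t, std::size_t>> shapes() const {
        std::vector<std::pair<std::size_t, std::size_t>> out;
        std::size_t in = inputSize_;
        for (const LayerSpec& layer : layers_) {
            out.emplace_back(in, layer.units);
            in = layer.units;
        }
        return out;
    }

    const std::vector<LayerSpec>& layers() const { return layers_; }

private:
    std::size_t inputSize_;
    std::vector<LayerSpec> layers_;
};
```

With this shape, a call chain such as `SimpleNet(784).addLayer(64, "RELU").addLayer(32, "RELU")` stays as short as the ML++ version, while the heavy lifting could be delegated to an optimized backend.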

Complicated Syntax of PyTorch C++

I don't believe your code-level comparison is very fair. The PyTorch C++ code does more than yours does:

It is true that many of the other lines rarely change, even in (Python) PyTorch, which is why the wrapper library PyTorch Lightning was built.

Comments

Also regarding your last comment

the speed of execution is an issue, and to try to ameliorate this, I now am looking into vectorization strategies.

With all due respect, this is a massive understatement. Vectorizing your code, even with multithreading, will not begin to approach PyTorch C++'s performance on GPU. The only way to do that is to write CUDA kernels (which is doable; you could extend MLPP in this way). Also, PyTorch C++ can be distributed across multiple GPUs using a single line of code, or across multiple processes in a distributed setting. Deep learning beyond educational toy examples will require GPU utilization.

Also, you say

using tools such as PyTorch supported matrix operations, neural network packages etc. would abstract a lot of the key details of various machine learning algorithms, and I was interested in getting into the particulars of them.

That's reasonable, but MLPP purports to be focused specifically on revitalizing C++ as a machine learning front end. So the details/particulars of the machine learning algorithms shouldn't be relevant.