xdata-skylark / libskylark

Sketching-based Distributed Matrix Computations for Machine Learning

MNIST C++ example #41

Closed positiveblue closed 7 years ago

positiveblue commented 7 years ago

Hello,

I have just finished the MNIST IPython notebook. I will upload it to the official repository tomorrow so you can check it. However, I used numpy for the least-squares calculation. I tried the skylark version, but it does not seem to work.

I have been checking whether the problem is in the Python binding or in the C++ implementation, and it seems to be the latter. The function

skylark::nla::ApproximateLeastSquares();

solves min ||Ax - b|| but returns an x full of NaN.
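To narrow down where the NaN values first appear, a small check like the one below can be run on the inputs and on the solution. This is only a sketch: `count_non_finite` is a hypothetical helper, not part of libskylark or Elemental, and it assumes the entries of interest have already been copied into a plain `std::vector<double>`.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a libskylark function): counts entries that are
// NaN or +/-Inf in a buffer of doubles.
std::size_t count_non_finite(const std::vector<double>& values) {
    std::size_t count = 0;
    for (double v : values) {
        if (!std::isfinite(v)) {
            ++count;
        }
    }
    return count;
}
```

If the inputs contain no non-finite entries but the solution does, that points at the solver rather than at the data loading step.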

I wrote it using the same approach we use in the IPython notebook. Here is the code so you can reproduce it (call the program with the paths to the MNIST dataset files): ./mnist /path/to/mnist.scale /path/to/mnist.scale.t

#include <cstdlib>
#include <iostream>
#include <string>
#include <El.hpp>
#include <skylark.hpp>
#include <boost/mpi.hpp>

int main(int argc, char* argv[]) {

    if ( argc != 3 ) {
        std::cout << "You have to specify the path to the files" << std::endl;
        exit(1);
    }

    // Config parameters
    int seed = 38734;
    std::string fname = argv[1];
    std::string testfname = argv[2];

    // Initialize
    El::Initialize(argc, argv);

    boost::mpi::communicator world;
    int rank = world.rank();

    skylark::base::context_t context(seed);

    El::DistMatrix<double> A, b;

    // Read the file
    std::cout << "Reading the training file " << std::endl;
    skylark::utility::io::ReadLIBSVM(fname, A, b, skylark::base::ROWS,
             784);

    std::cout << "A: " << A.Height() << ' ' << A.Width() << std::endl;
    std::cout << "b: " << b.Height() << ' ' << b.Width() << std::endl;

    std::cout << "Extending b... " << std::endl;

    El::DistMatrix<double> eb (b.Height(), 10);
    El::Fill(eb, 0.0);

    // One-hot encode: row i gets a 1 in the column given by its label
    for (int i = 0; i < b.Height(); ++i) {
        eb.Set(i, static_cast<int>(b.Get(i,0)), 1);
    }

    std::cout << "eb: " << eb.Height() << ' ' << eb.Width() << std::endl;

    El::DistMatrix<double> W (784, 10);
    std::cout << "Solving min || AX - b || " << std::endl;
    skylark::nla::ApproximateLeastSquares(El::NORMAL, A, eb, W, context);

    std::cout << "W: " << W.Height() << ' ' << W.Width() << std::endl;

    for (int i = 0; i < 10; ++i) {
        for (int j = 0; j < 5; ++j) {
           std::cout << W.Get(i,j) << ' ';
        }
        std::cout << std::endl;
    }

    El::Finalize();

    return 0;

}
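The "Extending b" step above turns the label column into a 10-column one-hot matrix by writing a 1 at column `b.Get(i,0)` of row `i`. A standalone sketch of that encoding, without Elemental, is given below. It assumes the labels are integer class indices stored as doubles (0 to 9 for MNIST), and `one_hot` is a hypothetical helper, not a libskylark function. Unlike the reproduction, it rejects labels that are NaN, fractional, or out of range, which can help rule out the label data as a source of bad values.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: row-major one-hot encoding.
// out[i * num_classes + k] == 1.0 exactly when labels[i] == k.
std::vector<double> one_hot(const std::vector<double>& labels,
                            std::size_t num_classes) {
    std::vector<double> out(labels.size() * num_classes, 0.0);
    for (std::size_t i = 0; i < labels.size(); ++i) {
        const double y = labels[i];
        // Reject NaN/Inf, negative, fractional, or too-large labels.
        if (!std::isfinite(y) || y < 0.0 || y != std::floor(y) ||
            y >= static_cast<double>(num_classes)) {
            throw std::out_of_range("label is not an integer in [0, num_classes)");
        }
        out[i * num_classes + static_cast<std::size_t>(y)] = 1.0;
    }
    return out;
}
```

For example, `one_hot({0.0, 2.0, 1.0}, 3)` yields a 3 x 3 row-major matrix with ones at positions (0,0), (1,2) and (2,1).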

positiveblue commented 7 years ago

BTW, using

skylark::nla::FasterLeastSquares(El::NORMAL, A, eb, W, context);

still gives me "FAILED to create!".

I could change some parameters to run more iterations (currently 3), but I am not sure whether that is the problem.

haimav commented 7 years ago

Note that we have: https://github.com/xdata-skylark/libskylark/blob/development/ml/rlsc.hpp#L95

positiveblue commented 7 years ago

I will try the same example using rlsc after binding Approximate and Faster KernelRidge. I will post the results here.

positiveblue commented 7 years ago

After binding Approximate and Faster KernelRidge, I checked and everything is working.