Open rendi7936 opened 7 years ago
Why do you want to use LibSVM on Hadoop? I think it may be inappropriate to apply a kernel SVM in a MapReduce setting, because current kernel SVM solvers cannot handle very large datasets.
I want to do a performance analysis of Hadoop and Spark using the SVM algorithm.
Is it OK if I only use a dataset smaller than 1 GB? I have read many papers saying that Hadoop can run the SVM algorithm, but none of them explain which library they use, so I started with LibSVM.
So, what should I do? Or is there another SVM library that supports the MapReduce programming model?
Spark ML contains an implementation of linear SVMs, similar to, but not as comprehensive as, those in LIBLINEAR. As @infwinston mentioned, SVMs with kernels, which is what LibSVM is for, are not really suited to Hadoop and Spark, since they don't scale well to the large datasets that are the reason you would use Hadoop/Spark in the first place. If your dataset is not large, then just use LibSVM directly.
I think you may want to check out the LIBLINEAR webpage and GitHub page. In some cases, linear SVMs give good enough performance and are much faster to train than kernel SVMs.
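To illustrate the linear-vs-kernel tradeoff mentioned above, here is a minimal single-machine sketch using scikit-learn, whose `SVC` is backed by LibSVM and whose `LinearSVC` is backed by LIBLINEAR. The synthetic dataset and all parameter choices are illustrative assumptions, not taken from this thread.

```python
# Compare a kernel SVM (LibSVM backend) with a linear SVM (LIBLINEAR backend)
# on a small synthetic problem. Illustrative sketch only.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC, LinearSVC

# Synthetic binary classification data (hypothetical stand-in for a real dataset)
X, y = make_classification(n_samples=2000, n_features=50, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

kernel_svm = SVC(kernel="rbf").fit(X_tr, y_tr)           # LibSVM solver
linear_svm = LinearSVC(max_iter=5000).fit(X_tr, y_tr)    # LIBLINEAR solver

print("kernel SVM accuracy:", kernel_svm.score(X_te, y_te))
print("linear SVM accuracy:", linear_svm.score(X_te, y_te))
```

On problems where the classes are roughly linearly separable, the linear model's accuracy is often close to the kernel model's, while its training time grows much more gently with dataset size, which is the point made above.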
Hello, everyone. I want to ask something: can I use LibSVM in Apache Hadoop? Does it work with the MapReduce programming model in Hadoop?