Closed pr0m1th3as closed 3 weeks ago
Here is a self-contained example using only libsvm command-line tools from libsvm-3.33.
The data file:
fisheriris_test2.dat.gz
Running the command:

```
./svm-train -t 2 -b 0 fisheriris_test2.dat
```
The output model when libsvm is compiled with

```
CFLAGS = -Wall -Wconversion -O0 -msse -march=core2 -fPIC
```

fisheriris_test2.dat.model.0.gz

and when compiled with

```
CFLAGS = -Wall -Wconversion -O3 -mavx -mavx2 -march=native -fPIC
```

fisheriris_test2.dat.model.gz
The test was done on a Ryzen 3950X CPU with gcc version 11.4.1 20231218 (Red Hat 11.4.1-3).
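The rho (bias) values stored in the two attached models can be compared with a small helper. This is a hypothetical sketch, not part of libsvm: `read_rho` and the inline header fragments are illustrative, relying only on the fact that a libsvm model file stores the bias in a `rho <value>` header line.

```python
# Hypothetical helper (not part of libsvm) to extract the rho/bias
# term from the header lines of a saved libsvm model file.
def read_rho(lines):
    for line in lines:
        if line.startswith("rho "):
            return float(line.split(None, 1)[1])
    raise ValueError("no rho line in model header")

# Illustrative header fragments in the libsvm model format:
header_sse = ["svm_type c_svc", "kernel_type rbf", "rho 0.119117"]
header_avx = ["svm_type c_svc", "kernel_type rbf", "rho 0.118962"]

# The two builds differ only in the last few digits of rho.
print(abs(read_rho(header_sse) - read_rho(header_avx)))
```

In practice one would gunzip the two attached models and pass their lines to the helper instead of the inline fragments.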
Sincerely, Dmitri.
Thanks. We will investigate this, though it may take some time.
The differences may be unavoidable, as the two sets of compiler options lead to different orders of floating-point operations. I imposed a smaller stopping tolerance and the results became closer (see the values of rho and nu).
```
$ ./libsvm-3.33-option1/svm-train -t 2 -b 0 fisheriris_test2.dat
optimization finished, #iter = 25
nu = 0.246475
obj = -19.080660, rho = 0.119117
nSV = 27, nBSV = 23
Total nSV = 27

$ ./libsvm-3.33-option2/svm-train -t 2 -b 0 fisheriris_test2.dat
optimization finished, #iter = 22
nu = 0.246483
obj = -19.080660, rho = 0.118962
nSV = 27, nBSV = 23
Total nSV = 27
```

```
$ ./libsvm-3.33-option1/svm-train -t 2 -b 0 -e 0.00001 fisheriris_test2.dat
optimization finished, #iter = 30
nu = 0.246471
obj = -19.080660, rho = 0.119060
nSV = 27, nBSV = 23
Total nSV = 27

$ ./libsvm-3.33-option2/svm-train -t 2 -b 0 -e 0.00001 fisheriris_test2.dat
optimization finished, #iter = 28
nu = 0.246471
obj = -19.080660, rho = 0.119060
nSV = 27, nBSV = 23
Total nSV = 27
```
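The underlying effect can be reproduced in isolation: floating-point addition is not associative, so when different optimization and vector-instruction flags make the compiler regroup the same sum, the roundings change. A minimal self-contained illustration:

```python
# Floating-point addition is not associative: regrouping the same
# four terms changes which intermediate roundings occur.
vals = [1e16, 1.0, -1e16, 1.0]

# Strict left-to-right: the first +1.0 is absorbed into 1e16.
left_to_right = ((vals[0] + vals[1]) + vals[2]) + vals[3]

# Pairwise regrouping, as a vectorized sum might do.
regrouped = (vals[0] + vals[2]) + (vals[1] + vals[3])

print(left_to_right)  # 1.0
print(regrouped)      # 2.0
```

In libsvm the sums inside the kernel evaluations and the gradient updates are far longer than four terms, so the accumulated difference shows up in rho, nu, and the iteration count, even though both results are valid to within rounding.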
Thanks! This is helpful.
Very helpful, indeed!
We are using the libsvm library in the statistics package for GNU Octave. Recently we observed that, when compiled with AVX, it produces different results than when AVX is not available, both on x86_64 and aarch64 (Raspberry Pi 4) computers. The version of libsvm that we use is 3.25, which has been bundled since a few releases back, but we only noticed this recently when we tried to use the library as the back end for implementing the ClassificationSVM class. Details on this issue's discussion can be found on Octave's discourse post here. The differences we get are not big, but they are still there, and we consider that one way this might be resolved is to introduce more numerically robust algorithms into libsvm, so that compiling with or without AVX does not produce different output. Below is the test in Octave and its complement written in Python for comparing the output.
Python complement of the test from the ClassificationSVM classdef in the Octave statistics package:
```python
import numpy as np
from sklearn import datasets
from sklearn.preprocessing import LabelEncoder
from libsvm.svmutil import svm_problem, svm_parameter, svm_train, svm_predict

# Load the Fisher Iris dataset
iris = datasets.load_iris()
x = iris.data[:, 2:4]  # Use petal length and width
y = iris.target

# Filter to include only non-'setosa' species
non_setosa_indices = np.where(y != 0)[0]  # Remove 'setosa' (label 0)
x_non_setosa = x[non_setosa_indices]
y_non_setosa = y[non_setosa_indices]

# Convert labels to 1-based index as LIBSVM expects
# Re-encode labels to be binary (1 and 2)
label_encoder = LabelEncoder()
y_non_setosa_encoded = label_encoder.fit_transform(y_non_setosa) + 1

# Fit SVM model using RBF kernel
prob = svm_problem(y_non_setosa_encoded.tolist(), x_non_setosa.tolist())
param = svm_parameter('-t 2 -b 0')  # RBF kernel, no probability estimates
model = svm_train(prob, param)

# Prepare test data (min, mean, max of x_non_setosa)
xc = np.array([np.min(x_non_setosa, axis=0),
               np.mean(x_non_setosa, axis=0),
               np.max(x_non_setosa, axis=0)])

# Predict labels and scores
labels, _, scores = svm_predict([0] * len(xc), xc.tolist(), model, '-b 0')

# Expected values for comparison (binary case)
expected_labels = [1, 2, 2]
expected_scores = [0.993493, -0.080445, -0.937146]

# Output the results
print("Predicted Labels:", labels)
print("Scores (column 1):", [s[0] for s in scores])

# Output the expected values for reference
print("Expected Labels: ", expected_labels)
print("Expected Scores: ", expected_scores)
```
```
Accuracy = 0% (0/3) (classification)
Predicted Labels: [1.0, 2.0, 2.0]
Scores (column 1): [0.9934926709287617, -0.0804451878209585, -0.9371460818233268]
Expected Labels:  [1, 2, 2]
Expected Scores:  [0.993493, -0.080445, -0.937146]
```