cjlin1 / libsvm

LIBSVM -- A Library for Support Vector Machines
https://www.csie.ntu.edu.tw/~cjlin/libsvm/
BSD 3-Clause "New" or "Revised" License

Re-Implement file parser to also allow loading from in-memory buffers #170

Closed seijikun closed 10 months ago

seijikun commented 4 years ago

libsvm currently only allows loading trained SVM models from a file identified by its name. Parsing a model from an in-memory buffer would be much appreciated, especially for use in mobile applications.
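To illustrate what buffer-based parsing could look like, here is a minimal sketch that reads the text header of a model from memory. It assumes the usual libsvm model layout (one `key value...` line per header entry, followed by an `SV` line that starts the support vectors). The helper `parse_model_header` is hypothetical and is not the code from this PR or part of libsvm.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper (not part of libsvm or of this PR): collects the
// "key value..." header lines of a model held in memory, stopping at the
// "SV" line that introduces the support vectors.
std::map<std::string, std::string> parse_model_header(const char* data, std::size_t size) {
    const std::string text(data, size);  // copy the buffer; no file access needed
    std::istringstream in(text);
    std::map<std::string, std::string> header;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // tolerate CRLF
        if (line == "SV") break;                                    // end of header
        const std::size_t split = line.find(' ');
        if (split == std::string::npos) continue;                   // skip lines without a value
        header[line.substr(0, split)] = line.substr(split + 1);
    }
    return header;
}
```

A real parser would go on to convert the values and read the support vectors; the sketch only shows that nothing in the format requires a `FILE*`.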

The PR consists of two commits:

The correctness of the parser was tested with the following code, which simply compares the same model parsed from a buffer and from a file (in between both commits):

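// Assumption: MODEL is a byte container (e.g. a std::string) that holds the
// contents of the same model file; its definition is not shown here.
#include <cassert>
#include <cstring>
#include "svm.h"

int main() {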
    struct svm_model* svmModelFile = svm_load_model("/tmp/cache/activity.model");
    struct svm_model* svmModel = svm_parse_model_from_buffer(MODEL.data(), MODEL.size());
    assert((svmModel == nullptr && svmModelFile == nullptr) || (svmModel != nullptr && svmModelFile != nullptr));
    if(svmModel == nullptr) { return 1; }

    assert(svmModel->l == svmModelFile->l);
    assert(svmModel->nr_class == svmModelFile->nr_class);
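    // Compare the parameters field by field; memcmp would also compare
    // padding bytes and the pointer members of svm_parameter.
    assert(svmModel->param.svm_type == svmModelFile->param.svm_type);
    assert(svmModel->param.kernel_type == svmModelFile->param.kernel_type);
    assert(svmModel->param.degree == svmModelFile->param.degree);
    assert(svmModel->param.gamma == svmModelFile->param.gamma);
    assert(svmModel->param.coef0 == svmModelFile->param.coef0);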
    assert(svmModel->free_sv == svmModelFile->free_sv);

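    size_t nr_class_permut = svmModel->nr_class * (svmModel->nr_class - 1) / 2;
    // probA/probB are only set for models trained with probability estimates,
    // so both parsers must agree on whether they are present.
    assert((svmModel->probA == nullptr) == (svmModelFile->probA == nullptr));
    assert((svmModel->probB == nullptr) == (svmModelFile->probB == nullptr));
    for(size_t i = 0; i < nr_class_permut; ++i) {
        if(svmModel->probA != nullptr) { assert(svmModel->probA[i] == svmModelFile->probA[i]); }
        if(svmModel->probB != nullptr) { assert(svmModel->probB[i] == svmModelFile->probB[i]); }
        assert(svmModel->rho[i] == svmModelFile->rho[i]);
    }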
    for(size_t i = 0; i < svmModel->nr_class; ++i) {
        assert(svmModel->label[i] == svmModelFile->label[i]);
        assert(svmModel->nSV[i] == svmModelFile->nSV[i]);
    }

    for(size_t c = 0; c < svmModel->nr_class - 1; ++c) {
        for(size_t i = 0; i < svmModel->l; ++i) {
            assert(svmModel->sv_coef[c][i] == svmModelFile->sv_coef[c][i]);
        }
    }

    for(size_t i = 0; i < svmModel->l; ++i) {
        struct svm_node* supportVector = svmModelFile->SV[i];
        size_t j = 0;
        for(; supportVector[j].index != -1; ++j) {
            assert(svmModel->SV[i][j].index == supportVector[j].index);
            assert(svmModel->SV[i][j].value == supportVector[j].value);
        }
        assert(svmModel->SV[i][j].index == supportVector[j].index);
    }

    svm_free_and_destroy_model(&svmModel);
    svm_free_and_destroy_model(&svmModelFile);
}
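
The test above does not show where `MODEL` comes from. One way to obtain such a buffer on a desktop system is to read the model file into a `std::string`, as sketched below. The helper `read_file_to_buffer` is hypothetical and only for illustration; on a mobile platform the bytes would more likely come from an asset or resource API.

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of libsvm or of this PR): reads a whole file
// into memory so that its bytes can be handed to a buffer-based parser.
std::string read_file_to_buffer(const std::string& path) {
    std::ifstream in(path, std::ios::binary);  // binary mode: keep bytes unchanged
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

With this, the buffer in the test could be defined as `const std::string MODEL = read_file_to_buffer("/tmp/cache/activity.model");`.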

KyleLin123456 commented 10 months ago

@seijikun Hi, I would like to ask about the change in https://github.com/cjlin1/libsvm/pull/170/commits/adbbfe236387f14080b4012775ab271fa2aa57c9. Why did you unset the "stderr" parameter in line 292? Is there any meaningful benefit? Thank you very much for your answer.

cjlin1 commented 10 months ago

We have taken the grid.py change that accepts file names containing spaces. The other changes were not taken.