Inconsistent results from libffm

aksnzhy / xlearn

High performance, easy-to-use, and scalable machine learning (ML) package, including linear model (LR), factorization machines (FM), and field-aware factorization machines (FFM) for Python and CLI interface.

https://xlearn-doc.readthedocs.io/en/latest/index.html

Apache License 2.0


Inconsistent results from libffm #245

Closed: gsakkis closed this issue 10 months ago

gsakkis commented 5 years ago

I'm trying to compare xlearn with libffm in terms of consistency on the same dataset (demo/classification/criteo_ctr/small*.txt) and I'm getting different results. That's even after ruling out some common suspects (using the same hyperparameters, setting the number of threads to 1, disabling the random shuffling).

Has anyone been able to produce the same trained model with both packages, and if so, how? If not, does it mean that one implementation (or both) has a bug, or is there a legitimate reason for non-deterministic models?
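For anyone repeating this comparison, a small helper for quantifying how far two sets of predictions diverge may be useful. This is only a sketch: it assumes each tool writes one predicted value per line to a text file, the function names `read_predictions` and `compare_predictions` are made up for illustration, and the numbers below are placeholder values, not real output from either package.

```python
import io


def read_predictions(stream):
    """Parse one float per non-empty line (assumed prediction file layout)."""
    return [float(line) for line in stream if line.strip()]


def compare_predictions(a, b, tol=1e-6):
    """Return the largest absolute gap and how many rows differ by more than tol."""
    if len(a) != len(b):
        raise ValueError(f"length mismatch: {len(a)} vs {len(b)}")
    diffs = [abs(x - y) for x, y in zip(a, b)]
    return {
        "max_abs_diff": max(diffs, default=0.0),
        "n_mismatch": sum(d > tol for d in diffs),
    }


# In-memory stand-ins for the two prediction files (placeholder values).
run_a = read_predictions(io.StringIO("0.10\n0.80\n0.35\n"))
run_b = read_predictions(io.StringIO("0.10\n0.75\n0.35\n"))

report = compare_predictions(run_a, run_b)
same_run = compare_predictions(run_a, run_a)
print(report)
```

Running the same tool twice with identical settings and comparing its two outputs first helps separate run-to-run non-determinism from differences between the two implementations.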

aksnzhy commented 5 years ago

@gsakkis Hi, the inconsistent results have several causes: xLearn and libffm use different optimization methods, different model initialization, different data shuffling, and so on.
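To illustrate why these differences matter even with a single thread and no shuffling, here is a toy sketch. It is not code from xLearn or libffm: it trains a tiny logistic regression on synthetic data, and the optimizers (plain SGD vs. an AdaGrad-style update), the seeds, and all names are assumptions chosen for the demonstration. Each configuration is fully deterministic, yet changing only the optimizer or only the initialization yields a different model.

```python
import math
import random


def make_data(n=200, dim=5, seed=0):
    """Synthetic binary classification data (purely illustrative)."""
    rng = random.Random(seed)
    true_w = [rng.uniform(-1, 1) for _ in range(dim)]
    data = []
    for _ in range(n):
        x = [rng.uniform(-1, 1) for _ in range(dim)]
        y = 1 if sum(w * xi for w, xi in zip(true_w, x)) > 0 else 0
        data.append((x, y))
    return data


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(data, opt, init_seed, lr=0.1, epochs=5):
    """Single-threaded training with a fixed sample order (no shuffling)."""
    rng = random.Random(init_seed)
    dim = len(data[0][0])
    w = [rng.uniform(-0.01, 0.01) for _ in range(dim)]  # seeded initialization
    g2 = [1.0] * dim  # per-weight squared-gradient sums, used by "adagrad" only
    for _ in range(epochs):
        for x, y in data:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            for j, xj in enumerate(x):
                g = (p - y) * xj
                if opt == "adagrad":
                    g2[j] += g * g
                    w[j] -= lr * g / math.sqrt(g2[j])
                else:  # plain SGD
                    w[j] -= lr * g
    return w


def max_diff(a, b):
    return max(abs(x - y) for x, y in zip(a, b))


data = make_data()
w_sgd = train(data, "sgd", init_seed=1)
w_sgd_again = train(data, "sgd", init_seed=1)       # same everything
w_adagrad = train(data, "adagrad", init_seed=1)     # different optimizer
w_other_init = train(data, "sgd", init_seed=2)      # different initialization

print("same config repeated:", max_diff(w_sgd, w_sgd_again))
print("different optimizer :", max_diff(w_sgd, w_adagrad))
print("different init      :", max_diff(w_sgd, w_other_init))
```

In this sketch the repeated run matches exactly, while the other two do not, so matching hyperparameters, thread count, and sample order is not enough on its own to make two implementations agree.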

gsakkis commented 5 years ago

I see, thanks. It would be nice if there were a combination of parameters that made xLearn behave like libffm, so that results are consistent and reproducible, but I guess that's not currently supported. Feel free to close this issue if it's not worth the effort to add such a "backwards compatibility" mode in the future.