Block size bug and normalization confusion

aksnzhy / xlearn

High performance, easy-to-use, and scalable machine learning (ML) package, including linear model (LR), factorization machines (FM), and field-aware factorization machines (FFM) for Python and CLI interface.

https://xlearn-doc.readthedocs.io/en/latest/index.html

Apache License 2.0

3.09k stars 518 forks

Block size bug and normalization confusion #121

Open makailove123 opened 6 years ago

makailove123 commented 6 years ago

The Reader.SetBlockSize function should be called before Reader.Initialize. Otherwise, the memory actually allocated does not match the block size that was set. According to the code, I think the default normalization is L2, right? But if I feed the tool data that I have already L2-normalized, the result is completely different from what I get when I feed it the source data. Is this normal?
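To make the ordering problem concrete, here is a minimal Python sketch. It is not xLearn's code: `ToyReader`, its method names, and the default size are hypothetical stand-ins, and the sketch assumes the buffer is allocated inside the initialize step from whatever size is stored at that moment.

```python
class ToyReader:
    """Hypothetical stand-in for a block-based reader (not xLearn's Reader)."""

    DEFAULT_BLOCK_SIZE_KB = 4  # assumed default, chosen only for illustration

    def __init__(self):
        self.block_size_kb = self.DEFAULT_BLOCK_SIZE_KB
        self.buffer = None

    def set_block_size(self, size_kb):
        # Only records the requested size; it does not reallocate the buffer.
        self.block_size_kb = size_kb

    def initialize(self):
        # The buffer is allocated here, using the size stored right now.
        self.buffer = bytearray(self.block_size_kb * 1024)


# Wrong order: the buffer is allocated with the default size, and the
# later call to set_block_size changes only the recorded value.
wrong = ToyReader()
wrong.initialize()
wrong.set_block_size(16)

# Right order: the requested size is in place before allocation.
right = ToyReader()
right.set_block_size(16)
right.initialize()

print(len(wrong.buffer), wrong.block_size_kb * 1024)
print(len(right.buffer), right.block_size_kb * 1024)
```

In the wrong order the recorded block size and the allocated buffer disagree, which is the mismatch described above. A fix could either reallocate inside the setter or reject a size change after initialization.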

aksnzhy commented 6 years ago

@makailove123 We will fix this problem! Yes, xLearn uses an L2 regularizer and instance-wise normalization by default. I think you need to tune the hyper-parameters for each dataset.
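As a rough illustration of what instance-wise normalization means, the Python sketch below scales each row to unit L2 norm. It is an assumption about the general technique, not xLearn's implementation, and the helper names are made up for this example. It also shows that row-wise normalization applied twice gives the same rows as applying it once, while data normalized per column beforehand leads to different rows.

```python
import math


def l2_normalize_row(row):
    """Scale one instance (row) to unit L2 norm; all-zero rows are left as-is."""
    norm = math.sqrt(sum(v * v for v in row))
    if norm == 0.0:
        return list(row)
    return [v / norm for v in row]


def l2_normalize_rows(rows):
    """Instance-wise normalization: each row is scaled independently."""
    return [l2_normalize_row(r) for r in rows]


def l2_normalize_columns(rows):
    """Feature-wise normalization: each column is scaled independently."""
    columns = [l2_normalize_row(col) for col in zip(*rows)]
    return [list(r) for r in zip(*columns)]


def rows_close(a, b):
    """Compare two row lists element by element with a float tolerance."""
    return all(
        math.isclose(x, y, abs_tol=1e-12)
        for ra, rb in zip(a, b)
        for x, y in zip(ra, rb)
    )


data = [[3.0, 4.0], [1.0, 0.0]]  # made-up toy data

once = l2_normalize_rows(data)
twice = l2_normalize_rows(once)
mixed = l2_normalize_rows(l2_normalize_columns(data))

print(rows_close(once, twice))  # row-wise normalization is idempotent
print(rows_close(once, mixed))  # column-wise preprocessing changes the rows
```

So if the preprocessing was the same row-wise scaling, a second pass should change little. A large difference in results may point to a different normalization axis or to other settings such as the hyper-parameters mentioned above.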