myui / hivemall

Scalable machine learning library for Apache Hive/Spark/Pig
http://hivemall.incubator.apache.org/
503 stars 153 forks

Implementation of Field-aware Factorisation Machines #273

Closed: ivanpartsianka closed 7 years ago

ivanpartsianka commented 8 years ago

Does it make sense to implement libffm (http://ntucsu.csie.ntu.edu.tw/~cjlin/libffm/), which performed well in Kaggle CTR prediction competitions, on top of Hive?
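For readers unfamiliar with the model, here is a minimal sketch of the pairwise scoring function used by field-aware factorization machines, where each feature keeps a separate latent vector per opposing field. The function name `ffm_score` and the dictionary layout are assumptions made for illustration; this is not libffm's or Hivemall's API, and linear and bias terms are left out.

```python
from itertools import combinations


def ffm_score(features, latent):
    """Compute the FFM pairwise interaction score for one example.

    features: list of (field, index, value) triples for the non-zero features.
    latent:   dict mapping (feature index, opposing field) -> latent vector.
              (Hypothetical layout chosen for this sketch.)
    """
    score = 0.0
    for (f1, j1, x1), (f2, j2, x2) in combinations(features, 2):
        # Feature j1 uses the vector it learned for field f2, and vice versa.
        v1 = latent[(j1, f2)]
        v2 = latent[(j2, f1)]
        score += sum(a * b for a, b in zip(v1, v2)) * x1 * x2
    return score


# Two features from two different fields, with 2-dimensional latent vectors.
example = [(0, 0, 1.0), (1, 1, 1.0)]
vectors = {(0, 1): [1.0, 2.0], (1, 0): [3.0, 4.0]}
print(ffm_score(example, vectors))
```

The only difference from a plain FM is the lookup key: a plain FM would use one vector per feature regardless of the other feature's field.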

myui commented 8 years ago

It makes sense to support field-aware FM.

However, parallelizing FM using Hive alone is hard, because data-parallel training of FM requires an additional parameter mixing scheme inside the UDFs. http://www.cs.cmu.edu/~yuxiangw/docs/fm.pdf http://stanford.edu/~rezab/papers/factorbird.pdf

The current FM implementation should also support parameter mixing. I intend to implement it in v0.5.
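To illustrate what parameter mixing means here, the following is a minimal sketch of one simple scheme: each worker trains on its own data shard, and the resulting parameters are averaged key by key. The function name `mix_parameters` and the flat dictionary representation are assumptions for illustration only; the schemes in the linked papers and any mixing done inside Hivemall's UDFs may work quite differently.

```python
def mix_parameters(worker_models):
    """Average model parameters across workers, key by key.

    worker_models: list of dicts mapping a parameter key to a float.
    A key is averaged only over the workers that actually hold it,
    since each shard may see a different subset of features.
    (Hypothetical representation chosen for this sketch.)
    """
    sums = {}
    counts = {}
    for model in worker_models:
        for key, value in model.items():
            sums[key] = sums.get(key, 0.0) + value
            counts[key] = counts.get(key, 0) + 1
    return {key: sums[key] / counts[key] for key in sums}


# Two workers trained on different shards; only the first saw feature "b".
mixed = mix_parameters([{"a": 1.0, "b": 2.0}, {"a": 3.0}])
print(mixed)
```

Averaging once after training is the simplest variant; mixing periodically during training trades extra communication for parameters that drift apart less between workers.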

myui commented 8 years ago

The FFM implementation is ongoing with an intern student. Stay tuned.

myui commented 8 years ago

@ivanpartsianka Implemented. It will appear in the next release. https://github.com/myui/hivemall/pull/284

ivanpartsianka commented 8 years ago

@myui Thank you so much, guys. Really keen to try it on real data.