kopopt / fast_tffm

fast_tffm: Tensorflow-based Distributed Factorization Machine
Apache License 2.0

How to get these results? #4

Closed: IMCG closed this issue 7 years ago

IMCG commented 8 years ago

@kopopt Hi,

Following the instructions, I am wondering how to reproduce this result with distributed Tensorflow: "Configuration: 36672494 training examples, 10 threads, factor_num = 8, batch_size = 10000, epoch_num = 1, vocabulary_size = 40000000. Cluster: 1 ps, 4 workers. FastTffm: 49 seconds. 748418 examples / second."
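(For reference, the quoted throughput is just the example count over the wall time: 36672494 examples / 49 seconds ≈ 748418 examples per second.)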

Could you provide the dataset and detailed running instructions? We recently built a distributed version of RDMA-enabled Tensorflow and would like to evaluate it against this benchmark. Any help would be greatly appreciated! Thanks.

kopopt commented 8 years ago

@IMCG Thanks for your feedback.

You can download the dataset here: http://labs.criteo.com/2015/03/criteo-releases-its-new-dataset/ . Note that I did not use the full dataset. The original one contains 4 billion lines, and I only used 36 million for testing purposes.
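A minimal sketch of such a truncation is below. The exact extraction method is not specified in this thread; the input and output file names are placeholders, and only the line count comes from the benchmark figures above.

```python
# Sketch: keep the first 36,672,494 lines of one Criteo Terabyte log file.
# "day_0" and "train_36m.txt" are placeholder names, not from the repo.
N = 36672494

with open("day_0") as src, open("train_36m.txt", "w") as dst:
    for i, line in enumerate(src):
        if i >= N:
            break
        dst.write(line)
```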

As for running instructions, the ones on the front page should work. I will add more details if you find anything unclear.

BTW, both local mode and distributed mode are tested on 40-core machines.

kopopt commented 8 years ago

@IMCG

Moreover, if you want to run it in real distributed mode, you need to change the [ClusterSpec] section in sample.cfg to match your actual cluster setting. Thanks.

[ClusterSpec]
ps_hosts = localhost:2340,localhost:2341
worker_hosts = localhost:2342,localhost:2343
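For reference, those two entries correspond to a standard Tensorflow cluster definition. A minimal sketch of how such a config maps onto tf.train.ClusterSpec in plain TF 1.x (generic distributed-TF boilerplate, not code taken from fast_tffm itself):

```python
import tensorflow as tf

# Host lists exactly as they appear in sample.cfg (comma-separated strings).
ps_hosts = "localhost:2340,localhost:2341".split(",")
worker_hosts = "localhost:2342,localhost:2343".split(",")

# Standard TF 1.x cluster definition: one job per role.
cluster = tf.train.ClusterSpec({"ps": ps_hosts, "worker": worker_hosts})

# Each process then starts an in-process server for its own role and index,
# e.g. the first parameter server:
server = tf.train.Server(cluster, job_name="ps", task_index=0)
server.join()  # a ps process just serves variables and blocks here
```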

IMCG commented 7 years ago

@kopopt Thank you. We will give it a try.

dselivanov commented 7 years ago

@kopopt Could you please provide more details on evaluation and training? Or how to run it on some established dataset instead of the Criteo sample with its exotic format? EDIT: I figured out that the format is very similar to svmlight, but I am still not sure how to get the exact data on which you experimented.
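For illustration, svmlight-style input puts one example per line: a label followed by feature_id:value pairs, something like the lines below. Whether fast_tffm's loader accepts exactly this layout is an assumption based on the similarity noted above, and the values are made up.

```
0 5:1 1734:1 29418:0.5
1 12:1 88:1 40391:1
```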