@IMCG Thanks for your feedback.
You can download the dataset here: http://labs.criteo.com/2015/03/criteo-releases-its-new-dataset/ Note that I did not use the full dataset. The original contains 4 billion lines, and I used only 36 million for testing purposes.
The running instructions on the front page should work. I will add more details if you find anything unclear.
BTW, both local mode and distributed mode are tested on 40-core machines.
@IMCG
Moreover, if you want to run it in truly distributed mode, you need to change the [ClusterSpec] section in sample.cfg to match your actual cluster. Thanks.
[ClusterSpec]
ps_hosts = localhost:2340,localhost:2341
worker_hosts = localhost:2342,localhost:2343
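For readers adapting the section above to their own cluster, here is a minimal sketch of how such a section could be read with Python's standard `configparser`. The helper name `parse_cluster_spec` is hypothetical, and this is not necessarily how the project itself loads sample.cfg. The resulting dictionary has the job-name-to-host-list shape that `tf.train.ClusterSpec` accepts.

```python
import configparser

# The [ClusterSpec] section quoted above, inlined so the sketch is self-contained.
SAMPLE_CFG = """
[ClusterSpec]
ps_hosts = localhost:2340,localhost:2341
worker_hosts = localhost:2342,localhost:2343
"""


def parse_cluster_spec(cfg_text):
    """Hypothetical helper: turn the [ClusterSpec] section into a job -> hosts dict."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    section = parser["ClusterSpec"]

    def split_hosts(value):
        # Hosts are comma-separated "host:port" strings.
        return [host.strip() for host in value.split(",") if host.strip()]

    return {
        "ps": split_hosts(section["ps_hosts"]),
        "worker": split_hosts(section["worker_hosts"]),
    }


cluster = parse_cluster_spec(SAMPLE_CFG)
print(cluster)
```

For a real cluster, each `localhost:port` entry would be replaced by the address of the machine that runs that parameter server or worker.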
@kopopt Thank you. We will give it a try.
@kopopt Could you please provide more details on evaluation and training? Or on how to run it on an established dataset instead of the Criteo sample with its exotic format? EDIT: I figured out that the format is very similar to svmlight, but I am still not sure how to get the data you experimented on.
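To illustrate the svmlight comparison, here is a rough sketch of parsing one line in the standard svmlight layout (`label index:value index:value ...`). The function name `parse_svmlight_line` is hypothetical, and the project's actual input format is only described in this thread as "very similar", so details may differ.

```python
def parse_svmlight_line(line):
    """Hypothetical helper: parse 'label idx:val idx:val ...' into (label, features)."""
    # svmlight allows a trailing comment introduced by '#'.
    line = line.split("#", 1)[0].strip()
    if not line:
        return None
    tokens = line.split()
    label = float(tokens[0])
    features = {}
    for token in tokens[1:]:
        index, value = token.split(":", 1)
        features[int(index)] = float(value)
    return label, features


example = parse_svmlight_line("1 3:0.5 17:1 40:2.25")
print(example)
```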
@kopopt Hi,
According to the instructions, I am wondering how to get this result with distributed TensorFlow:
"Configuration: 36672494 training examples, 10 threads, factor_num = 8, batch_size = 10000, epoch_num = 1, vocabulary_size = 40000000 Cluster: 1 ps, 4 workers. FastTffm: 49 seconds. 748418 examples / second."
Could you provide the dataset and detailed running instructions? We recently built a distributed version of RDMAable-Tensorflow, and we would like to evaluate RDMA-based TensorFlow with this benchmark. Any help would be greatly appreciated! Thanks.
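As a sanity check on the quoted benchmark line, the reported throughput follows directly from the example count and the wall-clock time. The variable names below are illustrative; the numbers are the ones quoted from the instructions.

```python
# Figures quoted from the benchmark description above.
training_examples = 36672494
elapsed_seconds = 49

# Throughput is simply examples processed divided by wall-clock time.
throughput = training_examples / elapsed_seconds
print(int(throughput))  # 748418, the examples / second figure quoted above
```

The same calculation gives a like-for-like number when comparing another run on the same data, provided the configuration (batch size, number of workers, epochs) is kept the same.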