ucbrise / clipper

A low-latency prediction-serving system
http://clipper.ai
Apache License 2.0

Criteo ad serving simulator #463

Open ryanhoque opened 6 years ago

ryanhoque commented 6 years ago

Running a long-lived Clipper cluster under an active workload will serve as a stress test for the system. More specifically, we will deploy Clipper on Kubernetes with Redis configured to run in fault-tolerant mode and query Clipper's REST interface. We will train the first-place model from the 2014 Kaggle display advertising competition on a dataset from Criteo and deploy it to Clipper every few hours. Later, the model training pipeline can be generalized to arbitrary models.
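As a rough illustration of the "query Clipper's REST interface" part, here is a minimal sketch of a load-generation loop. It assumes the `/<app_name>/predict` endpoint shape with a JSON body of the form `{"input": [...]}`. The host `localhost:1337`, the app name `criteo-ads`, and the helper names are hypothetical placeholders, not taken from the testbed repo.

```python
import json
import time
import urllib.request
from typing import Callable, Iterable, List, Sequence


def build_predict_request(
    host: str, app_name: str, features: Sequence[float]
) -> urllib.request.Request:
    """Build a POST request for a Clipper-style /<app_name>/predict endpoint.

    The endpoint path and {"input": [...]} body are assumptions about the
    deployed query frontend; adjust them to match the actual cluster.
    """
    body = json.dumps({"input": list(features)}).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{host}/{app_name}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_over_http(request: urllib.request.Request, timeout_s: float = 5.0) -> dict:
    """Send the request to a live cluster and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.loads(response.read().decode("utf-8"))


def run_load(
    send: Callable[[urllib.request.Request], dict],
    requests: Iterable[urllib.request.Request],
) -> List[float]:
    """Send each request in turn and return per-request latencies in seconds."""
    latencies = []
    for request in requests:
        start = time.perf_counter()
        send(request)
        latencies.append(time.perf_counter() - start)
    return latencies
```

Against a running cluster, something like `run_load(send_over_http, [build_predict_request("localhost:1337", "criteo-ads", row) for row in rows])` would produce latencies that could feed the metrics dashboard. Keeping `send` pluggable lets the loop be exercised without a cluster.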

Repo: https://github.com/ucbrise/clipper-serving-testbed

Design Doc: https://docs.google.com/document/d/13HZvSnTj6trosyv4SenoHLj9fcoGPdzgGBOff14arTw/edit?usp=sharing

simon-mo commented 6 years ago

@ryanhoque If you need help with the metrics part, feel free to ping me. We can make a public Grafana dashboard and show everyone the metrics.

For example, Wikipedia has a great public metrics dashboard: https://grafana.wikimedia.org/dashboard/db/performance-metrics?refresh=5m&orgId=1

ryanhoque commented 6 years ago

@simon-mo That'd be great!

dcrankshaw commented 6 years ago

What is the current status of this? @ryanhoque were you able to get this finished before the end of the semester?

ryanhoque commented 6 years ago

@dcrankshaw I'm still having an issue with specifying an external Redis cluster, and I have to coordinate with Simon to finish up the metrics, but it shouldn't take more than a few hours.

rkooo567 commented 5 years ago

@simon-mo Was this done before? It seems like you tried to set up the stress test.