ucbrise / clipper

A low-latency prediction-serving system
http://clipper.ai
Apache License 2.0

Benchmark against SageMaker and Tensorflow Serving #588

Open simon-mo opened 5 years ago

simon-mo commented 5 years ago

A blog post?

wcwang07 commented 5 years ago

In the paper, the object recognition experiments were set up using CIFAR-10 and ImageNet. Would you consider benchmarking with the COCO dataset instead? And would you open-source the model networks that you used?

wcwang07 commented 5 years ago

Ran a uniform load with a single arrival process, sending one randomly generated 224x224 color image every 120 seconds:

[attached image]
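A minimal sketch of that load generator, assuming Clipper's default REST query endpoint on localhost:1337 and a placeholder application name:

```python
import base64
import time

import numpy as np
import requests

CLIPPER_URL = "http://localhost:1337"  # placeholder query frontend address
APP_NAME = "image-recognition"         # placeholder application name

def send_one_image():
    # Generate a random 224x224 RGB image and send it as base64-encoded
    # bytes in the JSON body Clipper's REST endpoint expects.
    img = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)
    payload = {"input": base64.b64encode(img.tobytes()).decode("utf-8")}
    r = requests.post("{}/{}/predict".format(CLIPPER_URL, APP_NAME),
                      json=payload)
    return r.json()

if __name__ == "__main__":
    # Single arrival process: one request every 120 seconds.
    while True:
        start = time.time()
        print(send_one_image())
        time.sleep(max(0, 120 - (time.time() - start)))
```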

wcwang07 commented 5 years ago

In TFS, batching parameters can be set to test model latency, as shown here: https://github.com/tensorflow/serving/issues/344

We know that Clipper also uses an adaptive batching mechanism, determined by https://github.com/ucbrise/clipper/blob/3c5a1cc6ce59e0ccd778f526a50808d0e7b2576f/src/libclipper/src/containers.cpp#L128 (https://github.com/ucbrise/clipper/issues/548); a rough sketch of the idea follows below.

Could you show whether the batch size can be set at runtime, just like in TFS? If so, what different sizes have you tried for the image recognition benchmark?

Thank you.
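For context on the adaptive mechanism referenced above: the Clipper paper describes an additive-increase/multiplicative-decrease (AIMD) search for the largest batch size that still meets the latency objective. A rough Python sketch of that idea only; the linked containers.cpp is the actual C++ implementation and may differ, and the `measure_latency_ms` callable is a placeholder:

```python
def aimd_batch_size(measure_latency_ms, slo_ms, steps=100,
                    additive_step=1, backoff=0.9):
    """AIMD search: grow the batch size by a constant while batches meet
    the latency SLO, and cut it back multiplicatively whenever one
    exceeds it, converging near the largest SLO-compliant batch size."""
    batch_size = 1
    for _ in range(steps):
        # measure_latency_ms(b) is assumed to run one batch of size b
        # against the model container and return its latency in ms.
        if measure_latency_ms(batch_size) <= slo_ms:
            batch_size += additive_step                     # additive increase
        else:
            batch_size = max(1, int(batch_size * backoff))  # multiplicative decrease
    return batch_size
```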

simon-mo commented 5 years ago

Clipper cannot dynamically change to an arbitrary batch size at runtime. However, you can set a maximum batch size (the batch_size parameter), similar to TFS, when you deploy the model: http://docs.clipper.ai/en/v0.3.0/clipper_connection.html#clipper_admin.ClipperConnection.build_and_deploy_model

batch_size (int, optional) – The user-defined query batch size for the model. Replicas of the model will attempt to process at most batch_size queries simultaneously. They may process smaller batches if batch_size queries are not immediately available. If the default value of -1 is used, Clipper will adaptively calculate the batch size for individual replicas of this model.
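A minimal deployment sketch, assuming the clipper_admin v0.3 Python closure deployer forwards batch_size to the deployment call documented above (the model function and names are placeholders):

```python
from clipper_admin import ClipperConnection, DockerContainerManager
from clipper_admin.deployers import python as python_deployer

# Assumes a Clipper cluster is already running locally.
clipper_conn = ClipperConnection(DockerContainerManager())
clipper_conn.connect()

def predict(inputs):
    # Placeholder model: return one string per input, as Clipper expects.
    return [str(len(x)) for x in inputs]

python_deployer.deploy_python_closure(
    clipper_conn,
    name="image-model",   # placeholder model name
    version=1,
    input_type="bytes",
    func=predict,
    batch_size=32,        # cap replicas at 32 queries per batch
)
```

Leaving batch_size at its default of -1 instead enables the adaptive mechanism discussed above.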