knative / serving

Kubernetes-based, scale-to-zero, request-driven compute
https://knative.dev/docs/serving/
Apache License 2.0
5.54k stars, 1.15k forks

I can use Knative to serve my TensorFlow model #816

Closed. mattmoor closed this issue 5 years ago.

mattmoor commented 6 years ago

Using Knative, I can load my trained TensorFlow model and statelessly serve it at hyperscale.

cc @mchmarny

mbehrendt commented 6 years ago

+1 -- this is a very common use case we're seeing.

How would you define 'hyperscale' in a measurable unit? :-)

evankanderson commented 6 years ago

/kind doc

I'd define hyperscale as >100k qps (requests/second). Other definitions may vary. Measured in CPU-seconds/second, I'd expect hyperscale to be >10k CPU-seconds/second.
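
To make the two thresholds above concrete, here is a rough back-of-the-envelope sketch in Python. Dividing the quoted CPU rate by the quoted request rate implies about 0.1 CPU-seconds (100 ms of CPU) per request. The function names and the `utilization` parameter are hypothetical and only illustrate the arithmetic; they are not part of Knative or of the definition given in the comment.

```python
import math

# Thresholds quoted in the comment above.
HYPERSCALE_QPS = 100_000       # requests per second
HYPERSCALE_CPU_RATE = 10_000   # CPU-seconds consumed per wall-clock second


def cpu_seconds_per_request(cpu_rate: float, qps: float) -> float:
    """Average CPU time one request may use if both thresholds hold at once."""
    return cpu_rate / qps


def cores_needed(qps: float, cpu_per_request: float, utilization: float = 1.0) -> int:
    """Cores required to sustain `qps`.

    `utilization` is an assumed target fraction of each core kept busy
    (hypothetical knob; 1.0 means cores are fully saturated).
    """
    return math.ceil(qps * cpu_per_request / utilization)


per_request = cpu_seconds_per_request(HYPERSCALE_CPU_RATE, HYPERSCALE_QPS)
print(f"CPU per request: {per_request * 1000:.0f} ms")
print(f"Cores at full utilization: {cores_needed(HYPERSCALE_QPS, per_request)}")
print(f"Cores at 70% utilization:  {cores_needed(HYPERSCALE_QPS, per_request, 0.7)}")
```

Since one CPU-second per second corresponds to one fully busy core, the second threshold amounts to roughly 10,000 saturated cores, and more if headroom is kept for bursts.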

evankanderson commented 6 years ago

Is this worth keeping open?

mattmoor commented 5 years ago

I believe there were talks on this at KubeCon. Maybe @mchmarny would want to put together a sample for the knative/docs repo, but that should be tracked there.