Improve graceful degradation when Velox is overloaded with requests

Velox silently drops requests under high load (see #20). Some of this is unavoidable, but right now we do nothing to mitigate it. We should make the system degrade more gracefully, and determine roughly how many concurrent requests Velox can handle. That number helps users decide how to provision the cluster when deploying a Velox-backed application.

I think the first step is to binary search on the number of concurrent requests to find the point at which Velox starts dropping requests. Once we have that number, we need to figure out whether Velox can shed load more gracefully (e.g. when the number of in-flight requests approaches the maximum, start returning 5xx errors such as 503 instead of dropping requests silently).
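A rough sketch of both steps, in Python for brevity rather than as Velox code. Everything named here is hypothetical: `drops_requests(n)` stands in for a load test that fires `n` concurrent requests at a Velox deployment and reports whether any were dropped, and `LoadShedder` is an illustrative in-process admission check, not an existing Velox component. The search assumes drops are monotonic in load (if `n` concurrent requests cause drops, so do `n + 1`), which real measurements will only approximate, so each probe probably needs to be repeated a few times.

```python
import threading
from typing import Callable, Tuple


def find_max_concurrency(drops_requests: Callable[[int], bool], upper: int) -> int:
    """Return the largest n in [0, upper] for which drops_requests(n) is False.

    Assumes monotonicity: if n concurrent requests cause drops, so does n + 1.
    """
    lo, hi = 0, upper  # lo is always a level known (or assumed) to be safe
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        if drops_requests(mid):
            hi = mid - 1  # mid is too much load
        else:
            lo = mid  # mid is still fine, try higher
    return lo


class LoadShedder:
    """Reject requests with a 503 once the in-flight count reaches a limit."""

    def __init__(self, limit: int) -> None:
        self._limit = limit
        self._in_flight = 0
        self._lock = threading.Lock()

    def handle(self, work: Callable[[], str]) -> Tuple[int, str]:
        # Admission check: fail fast with an explicit status instead of
        # accepting the request and dropping it silently later.
        with self._lock:
            if self._in_flight >= self._limit:
                return 503, "overloaded, retry later"
            self._in_flight += 1
        try:
            return 200, work()
        finally:
            with self._lock:
                self._in_flight -= 1
```

In practice the limit passed to `LoadShedder` would be set somewhat below the value `find_max_concurrency` reports, so that shedding starts before requests are actually lost. How much headroom to leave is an open question for this issue.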

