Hydrospheredata / hydro-serving-gateway

Part of the Hydrosphere.io project.
http://docs.hydrosphere.io
Apache License 2.0

Improve shadowing execution #28

Open KineticCookie opened 4 years ago

KineticCookie commented 4 years ago

Improve A/B execution by choosing which variant's output will be returned BEFORE execution happens.

Valenzione commented 4 years ago

I'll add a bit more context so that this can serve as a good-first-issue.

Hydro-serving is able to shadow data between multiple model variants in a serving application.


For example, a 5% canary test can look like this:

Application ‘A’
    |
    | - Variant 1: model ‘a’ version 1. weight=95
    | - Variant 2: model ‘a’ version 2. weight=5

How shadowing is done:

  1. Whenever a serving application endpoint receives a request, it shadows the received data to all model variants for processing.
  2. Only after all model variants have produced their outputs do we randomly choose the output of one of them, according to the weights associated with the model variants.

Thus, we shadow incoming data to all model variants but return output only from a single one.
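The two steps above can be sketched roughly as follows. This is an illustrative Python sketch only, not the gateway's actual code: the names `VARIANTS`, `call_variant`, and `serve_wait_for_all` are hypothetical, and the model calls are simulated with `asyncio.sleep`.

```python
import asyncio
import random
import time

# Hypothetical variants: (name, weight, simulated latency in seconds).
VARIANTS = [("a-v1", 95, 0.01), ("a-v2", 5, 0.05)]


async def call_variant(name, delay, data):
    # Stand-in for a real model call; it only sleeps and echoes the input.
    await asyncio.sleep(delay)
    return {"variant": name, "output": data}


async def serve_wait_for_all(data, rng=random):
    start = time.perf_counter()
    # Step 1: shadow the request to every variant and wait for all of them.
    results = await asyncio.gather(
        *(call_variant(name, delay, data) for name, _, delay in VARIANTS)
    )
    # Step 2: only now pick one output, at random, according to the weights.
    weights = [weight for _, weight, _ in VARIANTS]
    chosen = rng.choices(results, weights=weights, k=1)[0]
    # The elapsed time is bounded below by the slowest variant.
    return chosen, time.perf_counter() - start
```

In this sketch the caller always waits at least as long as the slowest variant, even when the output of the fast variant is the one returned.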

Since we wait for all model variants to finish computing their outputs, the latency we observe is incorrect: it is the maximum latency across all model variants rather than the latency of the variant whose output is returned.

To improve throughput and measure latency properly for each model variant, we need to stop waiting for all model variants to produce their outputs and instead choose the model whose output will be returned before any outputs are calculated.
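One possible shape of this change, as a hedged Python sketch rather than the gateway's real implementation: the variant is chosen by weight up front, every variant still receives the data, but only the chosen call is awaited. The names `VARIANTS`, `latencies`, `timed_call`, and `serve_choose_first` are hypothetical, and the model calls are simulated with `asyncio.sleep`.

```python
import asyncio
import random
import time

# Hypothetical variants: (name, weight, simulated latency in seconds).
VARIANTS = [("a-v1", 95, 0.01), ("a-v2", 5, 0.2)]

# Per-variant latency, recorded as each call finishes.
latencies = {}


async def timed_call(name, delay, data):
    # Stand-in for a real model call, timed individually.
    start = time.perf_counter()
    await asyncio.sleep(delay)
    latencies[name] = time.perf_counter() - start
    return {"variant": name, "output": data}


async def serve_choose_first(data, rng=random):
    names = [name for name, _, _ in VARIANTS]
    weights = [weight for _, weight, _ in VARIANTS]
    # Decide which variant's output is returned BEFORE anything runs.
    chosen_name = rng.choices(names, weights=weights, k=1)[0]
    # Still shadow the data to every variant.
    tasks = {
        name: asyncio.create_task(timed_call(name, delay, data))
        for name, _, delay in VARIANTS
    }
    # Wait only for the chosen variant; the rest keep running in the background.
    response = await tasks[chosen_name]
    pending = [task for name, task in tasks.items() if name != chosen_name]
    # Return the pending tasks so the caller keeps a reference to them.
    return response, pending
```

With this arrangement the response time depends only on the chosen variant, and each variant's latency is recorded separately once its own call completes. The caller is expected to keep the returned `pending` tasks alive (or await them) so the shadowed calls are not dropped.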