related to https://github.com/fluxcapacitor/pipeline/issues/61 for alerts/notifications of bad canaries
Notes:
online evaluation pits one model against another (or uses a multi-armed bandit), in contrast to offline evaluation
online is best; offline is a preliminary smoke test. online evaluation requires an online serving system, which is not common. we hope to make it common.
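A minimal sketch of what bandit-style online evaluation could look like, assuming a hypothetical router that holds two models exposing a `predict(features)` method and receives some reward signal (click-through, conversion, etc.); the class and method names here are illustrative, not the pipeline's actual implementation:

```python
import random

class EpsilonGreedyRouter:
    """Route prediction requests between models, shifting traffic toward
    whichever model has the better observed average reward (sketch only)."""

    def __init__(self, models, epsilon=0.1):
        self.models = models                            # e.g. {"current": m1, "new": m2}
        self.epsilon = epsilon                          # exploration rate
        self.rewards = {name: 0.0 for name in models}
        self.counts = {name: 0 for name in models}

    def choose(self):
        # explore with probability epsilon, otherwise exploit the best arm so far
        if random.random() < self.epsilon:
            return random.choice(list(self.models))
        return max(self.rewards, key=lambda n: self.rewards[n] / max(self.counts[n], 1))

    def predict(self, features):
        name = self.choose()
        return name, self.models[name].predict(features)

    def record_reward(self, name, reward):
        # reward is whatever business signal the online evaluation optimizes for
        self.counts[name] += 1
        self.rewards[name] += reward
```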
this is a form of canary analysis of the new model against the existing cluster.
some traffic goes to the new model canary.
make sure the new model is performing within acceptable tolerance of the existing model; otherwise remove it and try again.
simple monitoring: golden set of data scored live against newly-deployed model
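A sketch of the golden-set check, assuming the newly-deployed model is reachable over HTTP and returns a JSON body with a `prediction` field; the endpoint, file format, and tolerance are all illustrative assumptions:

```python
import json
import requests

def check_golden_set(endpoint, golden_path, tolerance=0.05):
    """Score a stored golden dataset against a newly-deployed model and
    flag any prediction that drifts beyond the tolerance (sketch only)."""
    failures = []
    with open(golden_path) as f:
        for line in f:
            example = json.loads(line)  # e.g. {"features": [...], "expected": 0.87}
            resp = requests.post(endpoint, json={"features": example["features"]})
            predicted = resp.json()["prediction"]
            if abs(predicted - example["expected"]) > tolerance:
                failures.append((example, predicted))
    return failures

# e.g. check_golden_set("http://canary:8080/predict", "golden_set.jsonl")
```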
complex: deploy a canary of the new model alongside the cluster running the old model.
compare the two as part of canary analysis. keep an eye out for Netflix ACA (Automated Canary Analysis).
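In the same spirit (but not Netflix's actual ACA/Kayenta), a rough sketch of comparing metrics collected from the canary against the baseline cluster; the metric names and tolerances below are made up for illustration:

```python
def canary_score(baseline_metrics, canary_metrics, tolerances):
    """Compare canary metrics against the baseline cluster, metric by metric.
    Returns the fraction of metrics within their allowed relative deviation."""
    passed = 0
    for name, allowed_pct in tolerances.items():
        baseline = baseline_metrics[name]
        canary = canary_metrics[name]
        deviation = abs(canary - baseline) / abs(baseline) if baseline else abs(canary)
        if deviation <= allowed_pct:
            passed += 1
    return passed / len(tolerances)

baseline = {"p99_latency_ms": 120.0, "error_rate": 0.002, "mean_prediction": 0.41}
canary   = {"p99_latency_ms": 135.0, "error_rate": 0.003, "mean_prediction": 0.44}
score = canary_score(baseline, canary,
                     {"p99_latency_ms": 0.20, "error_rate": 0.50, "mean_prediction": 0.10})
promote = score == 1.0  # e.g. require every metric within tolerance before promoting the canary
```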
Zuul screenshots:
Moving to Advanced Edition http://pipeline.ai/products
provide metrics on both system and prediction performance to allow multiple levels of canary analysis
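One way to emit both kinds of metrics from the scoring path; prometheus_client is used here only as an example backend, and the metric names and `model` wrapper are assumptions, not the pipeline's actual instrumentation:

```python
import time
from prometheus_client import Histogram, Counter

# system-level metric: how long the model takes to respond
PREDICTION_LATENCY = Histogram("prediction_latency_seconds", "Model scoring latency")
# prediction-level metric: distribution of scores, so canary analysis can compare model behavior
PREDICTION_VALUE = Histogram("prediction_value", "Distribution of predicted values")
PREDICTION_ERRORS = Counter("prediction_errors_total", "Failed scoring requests")

def score_with_metrics(model, features):
    start = time.time()
    try:
        prediction = model.predict(features)  # `model` is a stand-in for the serving wrapper
    except Exception:
        PREDICTION_ERRORS.inc()
        raise
    PREDICTION_LATENCY.observe(time.time() - start)
    PREDICTION_VALUE.observe(prediction)
    return prediction
```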
for ML model canary analysis, we'll want to compare predictions: the difference between currentPrediction and newPrediction should be close to 0, or at least within some tolerance
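A small sketch of that comparison, shadow-scoring the same requests with both models; the helper names and the 99% agreement threshold are hypothetical:

```python
def within_tolerance(current_prediction, new_prediction, tolerance=0.05):
    """Check that the new model's prediction stays within tolerance of the current model's."""
    return abs(current_prediction - new_prediction) <= tolerance

def compare_models(current_model, new_model, sampled_requests, tolerance=0.05):
    # fraction of requests where the canary agrees with the current model
    agreements = sum(
        within_tolerance(current_model.predict(r), new_model.predict(r), tolerance)
        for r in sampled_requests
    )
    return agreements / len(sampled_requests)

# e.g. roll back the canary if agreement drops below 99%:
# if compare_models(current_model, new_model, sampled_requests) < 0.99: remove_canary()
```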