Using Machine Learning to Improve Streaming Quality at Netflix
Summary
Providing a quality streaming experience for a global audience is an immense technical challenge
install and maintain servers throughout the world
algorithms for streaming content from those servers to our subscribers’ devices
diverse viewing behavior
Viewing/browsing behavior on mobile devices is different than on Smart TVs
Cellular networks may be more volatile and unstable than fixed broadband networks
Networks in some markets may experience higher degrees of congestion
device groups have different capabilities and fidelities of internet connection due to hardware differences
Network quality characterization and prediction
the average bandwidth and round trip time supported by a network are well-known indicators of network quality
stability and predictability make a big difference when it comes to video streaming
A richer characterization of network quality would prove useful for analyzing networks, determining initial video quality and/or adapting video quality throughout playback
How can we incorporate longer-term historical information about the network and device?
combine temporal pattern recognition with various contextual indicators to make more accurate predictions of network quality
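As a minimal sketch of what a "richer characterization" could look like, the snippet below summarizes throughput samples by both level and stability, and produces a simple one-step forecast. The function names, the coefficient of variation as a stability measure, and the exponentially weighted moving average are illustrative assumptions, not the method described in the source.

```python
from statistics import mean, pstdev


def characterize_throughput(samples_kbps):
    """Summarize throughput samples by level (mean) and stability (spread).

    Hypothetical helper: the choice of statistics is an assumption.
    """
    avg = mean(samples_kbps)
    spread = pstdev(samples_kbps)
    # Coefficient of variation: lower values mean a steadier network.
    cv = spread / avg if avg > 0 else float("inf")
    return {"mean_kbps": avg, "stdev_kbps": spread, "cv": cv}


def ewma_forecast(samples_kbps, alpha=0.3):
    """One-step-ahead forecast using an exponentially weighted moving average.

    alpha is an assumed smoothing factor; larger values react faster
    to recent samples.
    """
    if not samples_kbps:
        raise ValueError("need at least one throughput sample")
    estimate = samples_kbps[0]
    for sample in samples_kbps[1:]:
        estimate = alpha * sample + (1 - alpha) * estimate
    return estimate


# Two made-up networks with the same average bandwidth but very
# different stability.
stable = [5000, 5100, 4900, 5000]
volatile = [9000, 1000, 9500, 500]

stable_summary = characterize_throughput(stable)
volatile_summary = characterize_throughput(volatile)
```

Both sample sets average 5000 kbps, yet the second has a far higher coefficient of variation, which is the kind of difference an average alone would hide.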
Video quality adaptation during playback
Adaptive streaming algorithms are responsible for adapting which video quality is streamed throughout playback based on the current network and device conditions
The quality of experience can be measured in several ways
including the initial amount of time spent waiting for video to play
the overall video quality experienced by the user
the number of times playback paused to load more video into the buffer (rebuffer)
the amount of perceptible fluctuation in quality during playback
These metrics can trade off with one another
The feedback signal of a given decision is delayed and sparse
These challenges arise when learning optimal control algorithms, and machine learning techniques (e.g., recent advances in reinforcement learning) have great potential to tackle them.
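To make the trade-off between these metrics concrete, here is a hedged sketch that folds the four measurements listed above into one score. The linear form, the `SessionStats` type, and every weight are assumptions chosen for illustration; the source does not specify how the metrics are combined.

```python
from dataclasses import dataclass


@dataclass
class SessionStats:
    """Per-session measurements (hypothetical container)."""
    startup_delay_s: float      # time spent waiting for video to play
    mean_bitrate_kbps: float    # proxy for overall video quality
    rebuffer_count: int         # times playback paused to refill the buffer
    quality_switches: int       # proxy for perceptible quality fluctuation


def qoe_score(stats, w_quality=1.0, w_startup=0.5, w_rebuffer=2.0, w_switch=0.2):
    """Combine the metrics into one number; higher is better.

    All weights are illustrative assumptions, not tuned values.
    """
    quality_mbps = stats.mean_bitrate_kbps / 1000.0
    return (
        w_quality * quality_mbps
        - w_startup * stats.startup_delay_s
        - w_rebuffer * stats.rebuffer_count
        - w_switch * stats.quality_switches
    )


# An aggressive strategy reaches a higher bitrate but rebuffers often;
# a conservative one streams lower quality without interruptions.
aggressive = SessionStats(startup_delay_s=1.0, mean_bitrate_kbps=6000,
                          rebuffer_count=3, quality_switches=5)
conservative = SessionStats(startup_delay_s=1.0, mean_bitrate_kbps=3000,
                            rebuffer_count=0, quality_switches=1)
```

With these assumed weights the conservative session scores higher despite its lower bitrate. A different weighting could reverse that ordering, which is why the choice of objective matters when learning a control policy.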
Predictive caching
One way statistical models can improve the streaming experience is by predicting what a user will play in order to cache (part of) it on the device before the user hits play
By combining various aspects of their viewing history together with recent user interactions and other contextual variables, this prediction can be formulated as a supervised learning problem
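The idea can be sketched as scoring each candidate title with an estimated play probability and prefetching only the most likely ones. The feature names, the weights, and the logistic form below are hypothetical stand-ins for a trained model; none of them come from the source.

```python
import math

# Assumed, hand-picked weights standing in for a trained model.
DEFAULT_WEIGHTS = {
    "is_next_episode": 2.5,   # from viewing history
    "recently_browsed": 1.0,  # from recent user interactions
    "in_watchlist": 0.5,      # other contextual signal
}
BIAS = -2.0


def play_probability(features, weights=DEFAULT_WEIGHTS, bias=BIAS):
    """Logistic score: estimated probability that the title is played next."""
    z = bias + sum(weights[name] * features.get(name, 0.0) for name in weights)
    return 1.0 / (1.0 + math.exp(-z))


def pick_titles_to_prefetch(candidates, threshold=0.5, limit=1):
    """Return up to `limit` titles whose score reaches `threshold`."""
    scored = sorted(
        ((play_probability(feats), title) for title, feats in candidates.items()),
        reverse=True,
    )
    return [title for prob, title in scored if prob >= threshold][:limit]


# Made-up candidates for one user.
candidates = {
    "series_a_s1e4": {"is_next_episode": 1, "recently_browsed": 1},
    "movie_b": {"in_watchlist": 1},
}
to_prefetch = pick_titles_to_prefetch(candidates)
```

The threshold and limit reflect an assumed trade-off: caching costs bandwidth and storage on the device, so only predictions the model is reasonably confident about are worth acting on.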
Device anomaly detection
Netflix operates on over a thousand different types of devices
it is not uncommon for a change on one of these devices to cause a problem for the user experience
the app will not start up properly, or playback will be inhibited or degraded in some way
Detecting these changes is a challenging and manually intensive process.
Alerting frameworks are a useful tool for surfacing potential issues but oftentimes it is tricky to determine the right criteria for labeling something as an actual problem
Fortunately, we have history on alerts that were triggered as well as the ultimate determination (made by a human) of whether or not the issue was in fact real and actionable
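One simple way to use such labeled history, sketched here under stated assumptions, is to estimate how often each kind of alert turned out to be real and escalate only the kinds with a high enough rate. The alert type names, the smoothing prior, and the threshold are all hypothetical; a production system would likely use a richer model and more features.

```python
from collections import defaultdict


def actionable_rate_by_type(history, prior_real=1, prior_total=2):
    """Estimate P(real issue | alert type) from human-labeled history.

    history: iterable of (alert_type, was_real) pairs.
    The prior applies Laplace-style smoothing so that alert types with
    few examples are not judged too confidently (an assumed choice).
    """
    counts = defaultdict(lambda: [0, 0])  # alert_type -> [real, total]
    for alert_type, was_real in history:
        counts[alert_type][0] += int(was_real)
        counts[alert_type][1] += 1
    return {
        alert_type: (real + prior_real) / (total + prior_total)
        for alert_type, (real, total) in counts.items()
    }


def should_escalate(alert_type, rates, threshold=0.5, default_rate=0.5):
    """Escalate when the estimated actionable rate reaches the threshold.

    Alert types never seen before fall back to default_rate.
    """
    return rates.get(alert_type, default_rate) >= threshold


# Made-up labeled history: (alert type, human verdict).
history = (
    [("playback_error_spike", True)] * 3
    + [("playback_error_spike", False)]
    + [("startup_latency_blip", False)] * 4
)
rates = actionable_rate_by_type(history)
```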
it is often challenging to determine the root cause
Statistical modeling can also help us determine root cause by controlling for various covariates
statistical modeling and machine learning methods can improve the state of the art when:
there is sufficient data
the data is high-dimensional and it is difficult to hand-craft the minimal set of informative variables for a particular problem
there is rich structure inherent in the data due to complex underlying phenomena