hongzimao / pensieve

Neural Adaptive Video Streaming with Pensieve (SIGCOMM '17)
http://web.mit.edu/pensieve/
MIT License
517 stars 280 forks

Questions regarding QoE metric #77

Open jlabhishek opened 5 years ago

jlabhishek commented 5 years ago

Hello sir, I changed the bitrate levels available for training, but I noticed that as the bitrates get higher, the model tends to prefer playing at those higher bitrates even when there is rebuffering. For example, if the bitrate is 6000 Kbps, the next lower bitrate is 2200 Kbps, and the rebuffering lasts only 0.2 seconds, the model still gets a good positive reward for playing at 6000 Kbps: 6 - (4.3 * 0.2) = 5.14. I also noticed that there is sometimes too much bitrate switching if buffer_thresh is kept small in env.py. Can you suggest some hyperparameter changes in the reward function to tackle this issue?
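
To make the arithmetic concrete, here is a sketch of the linear QoE reward from the paper evaluated on the numbers above (REBUF_PENALTY = 4.3 and SMOOTH_PENALTY = 1 are the repo defaults; the 6000/2200 Kbps levels are my modified ladder):

```python
# Sketch of the linear QoE reward, evaluated on the example above.
M_IN_K = 1000.0
REBUF_PENALTY = 4.3   # penalty per second of rebuffering (repo default)
SMOOTH_PENALTY = 1.0  # penalty per Mbps of bitrate change (repo default)

bit_rate, last_bit_rate = 6000, 6000  # Kbps, no switch in this example
rebuf = 0.2                           # seconds of rebuffering

reward = bit_rate / M_IN_K \
         - REBUF_PENALTY * rebuf \
         - SMOOTH_PENALTY * abs(bit_rate - last_bit_rate) / M_IN_K
print(reward)  # 6.0 - 0.86 - 0.0 = 5.14, still strongly positive
```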

Thank you

hongzimao commented 5 years ago

RL optimizes for the reward you specify. If the benefit of getting high bitrate overwhelms the penalty from rebuffering, it will just choose high bitrate (it's doing its job). You can try increasing the rebuffer penalty in the reward function.
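
For concreteness, the reward used during training looks roughly like the sketch below (check sim/multi_agent.py in your copy for the exact names). Note that the default REBUF_PENALTY of 4.3 equals the top bitrate of the original ladder in Mbps, so with a 6000 Kbps ladder you likely want at least 6; raising SMOOTH_PENALTY similarly suppresses frequent quality switches:

```python
# Sketch of the training reward; raising REBUF_PENALTY makes rebuffering
# more costly relative to bitrate, raising SMOOTH_PENALTY penalizes switching.
M_IN_K = 1000.0
REBUF_PENALTY = 8.6   # e.g. doubled from the default 4.3; tune for your ladder
SMOOTH_PENALTY = 1.0  # increase if you see too much switching

def qoe_reward(bit_rate_kbps, last_bit_rate_kbps, rebuf_sec):
    return (bit_rate_kbps / M_IN_K
            - REBUF_PENALTY * rebuf_sec
            - SMOOTH_PENALTY * abs(bit_rate_kbps - last_bit_rate_kbps) / M_IN_K)

# With the doubled penalty, the example above is punished more heavily:
# qoe_reward(6000, 6000, 0.2) == 6.0 - 1.72 == 4.28
```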

wang88256187 commented 5 years ago

> RL optimizes for the reward you specify. If the benefit of getting high bitrate overwhelms the penalty from rebuffering, it will just choose high bitrate (it's doing its job). You can try increasing the rebuffer penalty in the reward function.

Hello, Mr.Mao: I'am about to validate the effect of your experiment, but got some problems when I was going to analyze the performance of the model I trained in virtual network condition. I wonder how to visualize the QoE of the experiment results instead of the reward. Moreover, the reward analysis is also not clear. i.e., How to get the conclusion of Figure 9(a) and (b) in your paper? I think this is not shown in your code. Hoping to get your help~ thank you!