StanfordSNR / puffer

Puffer is a free live TV streaming website and a research study at Stanford using machine learning to improve video streaming
https://puffer.stanford.edu

Fix "popping" sound every 4.8 seconds #11

Closed keithw closed 5 years ago

keithw commented 5 years ago

These commits fix the "popping" sound that was happening every 4.8 seconds by:

(This also fixes the small glitches we had in the audio timestamps.)

This requires a change to the command line used to invoke the decoder: it now takes the number of samples of overlap between two .wav chunks. To keep things from getting too complex, the only allowable value is "10248", but it still has to be given explicitly on the command line, e.g.:

src/atsc/decoder 33 36 1080i30 60 900 10248 video video < leno2.ts
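For a rough sense of scale, the overlap can be converted to a duration. This is only a sketch under assumptions that the thread does not state: a 48 kHz sample rate (Opus's native rate), and .wav chunks that are 4.8 seconds long, which would match the interval of the popping sound.

```python
# Assumptions (not confirmed in this thread): 48 kHz audio, 4.8-second chunks.
SAMPLE_RATE_HZ = 48_000
CHUNK_SECONDS = 4.8

# The only overlap value the decoder accepts, per the description above.
OVERLAP_SAMPLES = 10248

# Samples in one chunk and the overlap expressed in milliseconds.
chunk_samples = round(CHUNK_SECONDS * SAMPLE_RATE_HZ)
overlap_ms = 1000 * OVERLAP_SAMPLES / SAMPLE_RATE_HZ

print(f"chunk: {chunk_samples} samples, overlap: {overlap_ms} ms")
```

Under those assumptions the overlap is a small fraction of each chunk.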

The commits also include a small change to webm_fragment to allow the new files generated by opus-encoder -- I think the change is okay but I don't understand the code that well! Please look it over as I could easily have made a mistake.

Finally, the commits move the Travis-CI builds to Docker on Ubuntu 18.04 since the code depends on a recent version of libavformat.

francisyyan commented 5 years ago

Thank you Keith! These commits look great! I should find some time to understand the technique behind them, but for now, I will merge them first since they work well in my tests.

It took me a while to recall why I checked and removed BlockGroup at the end. It was actually a workaround that I used before, when the timestamps in the audio filenames might not be accurate (i.e., when we used ffmpeg instead of our own decoder to decode audio); the timecode in BlockGroup was the only accurate information source back then. Given that we have a better decoder now, it is okay to make webm_fragment more permissive, and I will push another commit that removes this workaround.

On the other hand, I just found that webm_fragment would be buggy if BlockGroup also contained audio data. This typically wouldn't happen because by default, opusenc or ffmpeg will generate a bunch of SimpleBlock elements containing audio data, and an additional BlockGroup element at the end which contains extra information only (without audio data); this won't happen with the new opus-encoder either as no BlockGroup is generated anymore. I will keep an eye on this in the future.
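To make the SimpleBlock vs. BlockGroup distinction concrete, here is a minimal Python sketch (not webm_fragment's actual code) that walks the children of a Cluster payload and checks whether a BlockGroup carries a Block, i.e. audio data. The element IDs are the ones defined by the Matroska specification; the function names are hypothetical, and unknown-size elements and truncated input are not handled.

```python
# Matroska/WebM element IDs (length-marker bits included, as in the spec).
SIMPLE_BLOCK_ID = 0xA3
BLOCK_GROUP_ID = 0xA0
BLOCK_ID = 0xA1


def _vint_length(first_byte, max_length):
    """Length of an EBML variable-size integer, from its leading zero bits."""
    length, mask = 1, 0x80
    while length <= max_length and not (first_byte & mask):
        mask >>= 1
        length += 1
    if length > max_length:
        raise ValueError("invalid EBML variable-size integer")
    return length, mask


def read_element_id(buf, pos):
    """Read an element ID (1 to 4 bytes); the marker bits stay in the ID."""
    length, _ = _vint_length(buf[pos], 4)
    return int.from_bytes(buf[pos:pos + length], "big"), pos + length


def read_data_size(buf, pos):
    """Read an element data size (1 to 8 bytes); the marker bit is removed."""
    length, mask = _vint_length(buf[pos], 8)
    value = buf[pos] & (mask - 1)
    for byte in buf[pos + 1:pos + length]:
        value = (value << 8) | byte
    return value, pos + length


def list_children(payload):
    """Return (element_id, data_start, data_size) for each child element."""
    children = []
    pos = 0
    while pos < len(payload):
        elem_id, pos = read_element_id(payload, pos)
        size, pos = read_data_size(payload, pos)
        children.append((elem_id, pos, size))
        pos += size
    return children


def block_group_has_block(payload, data_start, data_size):
    """True if the BlockGroup at the given range contains a Block element."""
    group = payload[data_start:data_start + data_size]
    return any(elem_id == BLOCK_ID for elem_id, _, _ in list_children(group))
```

A fragmenter could use a check like `block_group_has_block` to detect the case described above instead of assuming that a trailing BlockGroup never holds audio.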