kaltura / nginx-vod-module

NGINX-based MP4 Repackager
GNU Affero General Public License v3.0

Playback issues when file is replaced #1496

Closed zWaR closed 6 months ago

zWaR commented 7 months ago

Hey there,

I am experiencing some hard-to-troubleshoot problems in our system. Generally speaking, nginx-vod-module works perfectly fine with our VODs; however, at times our users report problems with playback. Before explaining what happens, let me share a bit about our system. To reduce our storage demand, we regularly transcode our video assets. The transcoded files are not fragmented, and they do not have errors such as those mentioned here: https://github.com/kaltura/nginx-vod-module/issues/1075

However, during the transcoding process we use the files in the path from which nginx-vod-module serves our VODs as input to ffmpeg, and write the output to a temporary location. If transcoding finishes successfully, we rsync the output files from the temporary location to the original location, overwriting the original files with the transcoded ones. Most of the time this does not affect nginx-vod-module, but sometimes it does, and when that happens video playback breaks.

I tried several different approaches to resolve this issue, but the most consistent one is to reinstall nginx and nginx-vod-module; this always resolves it. To me this indicates that the MP4 file itself is not the problem, but replacing the file might be. Do you have any suggestions on how to go about this? Is it possible to clear the module's cache or buffer, or to force nginx-vod-module to re-process the replaced file? (I am just guessing, so I'm sorry if these options make no sense.)

I suppose (I have not tested this) that a fix might be to give the file a new name after transcoding instead of overwriting the original, but I'd like to avoid doing that if possible.

zWaR commented 7 months ago

I was able to track down the behavior to vod_metadata_cache. If I disable it, the problems described above disappear.

I wonder, is there a way to clear this cache other than restarting nginx?

zWaR commented 6 months ago

Digging further, I realized I can set an expiration for vod_metadata_cache. I tested this and it seems to work as well. This will probably work well enough. However, if you can think of a better way of doing this, one that involves clearing the cache on demand, I'm all ears!

erankor commented 6 months ago

Replacing a file in place is highly discouraged; I believe it was mentioned in a few other issues in the past. As you mentioned, there's a problem with the metadata cache - the module may try to use the metadata of the first video against the data of the replaced video. So, for example, it may read some offset in the file, expecting to find a video key frame, and instead get a portion of an audio frame, or whatever...

But, other than that, a user playing the video may be getting a mix of segments, some from before the transcode and some from after it. Some players may accept this, but some players choke if the video parameters change during playback (without EXT-X-DISCONTINUITY). If you are using a CDN or a caching proxy (and if you don't, you probably should...), this problem is not limited to a user playing the video at the time of the transcode; you may have this mix of old/new segments in the CDN cache.

I think the correct solution is to avoid replacing existing files - write the transcoded video to a new file. You can, for example, add a 'version' to the file name, and increment it whenever a new version is available. When a user requests the stream, build a URL pointing to the latest version. Some time after a version is superseded, when you're sure no one is still playing the old version, you can delete the old file.
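For illustration, the versioning scheme described above could be sketched in Python roughly as follows. This is not code from the module or from anyone in this thread. The `<asset>_v<N>.mp4` naming convention, the function names, the base URL layout, and the use of the newest file's modification time as the moment it superseded the older ones are all assumptions made for the example.

```python
import re
import time
from pathlib import Path

# Hypothetical naming convention: <asset>_v<N>.mp4, e.g. lecture_v3.mp4
VERSION_RE = re.compile(r"^(?P<asset>.+)_v(?P<version>\d+)\.mp4$")


def list_versions(directory, asset):
    """Return [(version, path), ...] for one asset, sorted by version number."""
    found = []
    for path in Path(directory).iterdir():
        match = VERSION_RE.match(path.name)
        if match and match.group("asset") == asset:
            found.append((int(match.group("version")), path))
    return sorted(found)


def next_version_path(directory, asset):
    """Path the next transcode should be written to; never an existing file."""
    versions = list_versions(directory, asset)
    next_version = versions[-1][0] + 1 if versions else 1
    return Path(directory) / f"{asset}_v{next_version}.mp4"


def latest_url(directory, asset, base_url):
    """Build a URL for the newest version, or None if there is none.

    The manifest suffix to append depends on the server configuration,
    so it is left to the caller.
    """
    versions = list_versions(directory, asset)
    if not versions:
        return None
    return f"{base_url.rstrip('/')}/{versions[-1][1].name}"


def prune_superseded(directory, asset, grace_seconds, now=None):
    """Delete older versions once the newest has existed for grace_seconds.

    Assumption: the newest file's mtime approximates the time at which it
    superseded the older versions.
    """
    now = time.time() if now is None else now
    versions = list_versions(directory, asset)
    if len(versions) < 2:
        return []
    newest_path = versions[-1][1]
    if now - newest_path.stat().st_mtime < grace_seconds:
        return []  # viewers may still be playing an older version
    removed = []
    for _, path in versions[:-1]:
        path.unlink()
        removed.append(path.name)
    return removed
```

In this sketch the transcode job would write its output to `next_version_path(...)` instead of overwriting the source, the application would call `latest_url(...)` when a user requests the stream, and a periodic job would call `prune_superseded(...)` with a grace period longer than the longest expected playback session plus any CDN cache lifetime.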

zWaR commented 6 months ago

@erankor thank you very much for this extensive explanation. Yes, it makes sense what you're saying and it is exactly what I was noticing on our end. Thank you too for the pointers on how to address the problem - very helpful!

zWaR commented 6 months ago

I think it's safe to close this one. 😅