ut0mt8 / nginx-rtmp-module

NGINX-based Media Streaming Server
http://nginx-rtmp.blogspot.com
BSD 2-Clause "Simplified" License
86 stars 32 forks

Stream is constantly freezing #19

Closed darki73 closed 5 years ago

darki73 commented 5 years ago

Good time of the day!

First of all, thank you for maintaining this project! Now to the problem: it is actually a set of problems, and I have no idea where they are coming from.

A bit of information about the server:

  1. Intel Core i9-9900k
  2. 32 GB 4600 MHz RAM
  3. NVMe SSD
  4. Ubuntu 18.10
  5. Nginx 1.15.9
  6. Module - latest

And here are the problems:

  1. Shaka Player (or any other player, to be honest) constantly fetches the mpd file (sometimes twice or even three times in a row without fetching segments)
  2. The duration of segments is always different no matter what I do (I tried CBR, VBR, and lowering the bitrate)
  3. The stream starts just fine, but exactly 9 seconds later it hangs; 3 seconds later it works again; and after that, the next time it freezes, it never resumes (any player, you name it: Video.js, Shaka, dash.js, etc.)

I've already tried many different solutions, but none of them work. (I started with the original module and then found yours, which even supports a Dash QUALITY SWITCHER!)

These are my configuration files for the server itself and the www part of it.

rtmp-server-default.conf

server {
    listen 1935;
    ping 30s;
    notify_method get;
    notify_update_timeout 10s;
    chunk_size 4096;
    publish_time_fix off;

    application live {
        live on;

        # No RTMP playback
        deny play all;

        push rtmp://127.0.0.1:1935/dash-live;
        push_reconnect 1s;

        # Events
        on_publish http://app.local/api/stream/start;
        on_publish_done http://app.local/api/stream/stop;
        on_update http://app.local/api/stream/update;
    }

    application hls-live {
        live on;

        # No RTMP playback
        deny play all;

        # Only allow publishing from localhost
        allow publish 127.0.0.1;
        deny publish all;

        # Package this stream as HLS
        hls on;
        hls_path /tmp/hls/stream;

        # Put streams in their own subdirectory under `hls_path`
        hls_nested on;
        hls_fragment_naming system;
    }

    application dash-live {
        deny play all;
        allow publish 127.0.0.1;
        deny publish all;

        live on;
        dash on;
        dash_nested on; 
        dash_repetition on;
        dash_fragment 2;
        dash_playlist_length 60;
        dash_cleanup on;
        dash_clock_compensation http_head;
        dash_clock_helper_uri http://live.local/time;
        dash_path /tmp/dash/stream;
    }
}

What I am doing here: first of all, I accept connections only on the live endpoint of the application, and there are a few HTTP callbacks to check whether a user is allowed to stream, plus the update event and the 'stream ended' event. My first idea was that these callbacks (especially update, because its 10-second interval is close to the 9 seconds after which the stream first freezes) were causing the issue. However, here are some 'benchmark' results showing how long they actually take:

[2019-03-20 15:31:26] local.INFO: array (
  'class' => 'App\\Classes\\Stream\\Manager',
  'method' => 'start',
  'execution_time' => 0.0056459903717041016,
)  
[2019-03-20 15:31:36] local.INFO: array (
  'class' => 'App\\Classes\\Stream\\Manager',
  'method' => 'update',
  'execution_time' => 0.00015020370483398438,
)  
[2019-03-20 15:31:46] local.INFO: array (
  'class' => 'App\\Classes\\Stream\\Manager',
  'method' => 'update',
  'execution_time' => 0.003345012664794922,
)  
[2019-03-20 15:31:47] local.INFO: array (
  'class' => 'App\\Classes\\Stream\\Manager',
  'method' => 'stop',
  'execution_time' => 0.000164031982421875,
)

Also worth noting: the user just uses rtmp://live.local/live as the endpoint and provides the stream key; after that, the start method called by on_publish returns a 301 status code with a Location pointing to the new stream name (to hide the stream key).
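
The redirect flow can be sketched as follows. This is a minimal, hypothetical Python sketch of the kind of handler described (the real handler in this thread is a PHP/Laravel app; `KNOWN_KEYS` and `handle_on_publish` are illustrative names, not the actual code):

```python
# Sketch of an on_publish handler that swaps the private stream key
# for an opaque public name via a 301 redirect. nginx-rtmp treats
# any 3xx response with a Location header as a stream rename.
# All names here are hypothetical.
import hashlib

# Maps each streamer's private key to some internal identifier.
KNOWN_KEYS = {
    "live_1_fnmpoerwfnwq3rfj82314fgq3249ejigf": "channel-1",
}

def handle_on_publish(name: str):
    """Return (status, headers) for the on_publish callback."""
    channel = KNOWN_KEYS.get(name)
    if channel is None:
        # A non-2xx/3xx response makes nginx-rtmp reject the publisher.
        return 403, {}
    # Derive an opaque public name so the private key never leaks.
    public_name = hashlib.sha256(channel.encode()).hexdigest()[:16]
    return 301, {"Location": public_name}
```

A player then only ever sees the derived public name, never the private key.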

And here is the live.local.conf file

server {
    listen 80;
    listen [::]:80;

    root /var/www/data/#username#/www/streaming.live;

    index index.html index.htm index.nginx-debian.html;

    server_name live.local www.live.local;

    location /stat {
        rtmp_stat all;
        rtmp_stat_stylesheet original_stat.xsl;
    }

    location /control {
        rtmp_control all;
    }

    location /time {
        return 200;
    }

    location /stream {
        root /tmp/dash;
        add_header Access-Control-Allow-Origin * always;
        add_header Cache-Control no-cache always;
    }
}
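
For reference, `dash_clock_compensation http_head;` (as I understand the module) issues a HEAD request to `dash_clock_helper_uri` and reads the `Date` header of the response, which is why a bare `return 200;` is enough here: nginx adds the `Date` header itself. A minimal sketch of turning that header into a timestamp (the parsing is standard library behavior; how the module applies the resulting offset is an assumption and not shown):

```python
# Convert an RFC 7231 Date header, as returned by the /time helper
# endpoint, into a Unix timestamp usable for clock compensation.
from email.utils import parsedate_to_datetime

def server_time_from_date_header(date_header: str) -> float:
    """Parse a Date header like 'Wed, 20 Mar 2019 15:31:26 GMT'
    into seconds since the epoch (UTC)."""
    return parsedate_to_datetime(date_header).timestamp()
```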

Also, these are some of the global Nginx configuration variables:

user www-data;
worker_processes 1;
pid /run/nginx.pid;
include /etc/nginx/modules-enabled/*.conf;
load_module /usr/lib/nginx/modules/ngx_rtmp_module.so;

events {
    multi_accept        on;
    worker_connections  1024;
}

http {

    ##
    # Basic Settings
    ##

    charset         utf-8;
    sendfile        on;
    tcp_nopush      on;
    tcp_nodelay     on;
    server_tokens       off;
    log_not_found       off;
    types_hash_max_size 2048;
    client_max_body_size    2000M;
    keepalive_timeout   65;

    # (remaining http settings omitted)
}

rtmp {
    include /etc/nginx/rtmp-enabled/*;
}

No matter what I try, it still freezes exactly 9 seconds into the stream (according to the counter in the player, which is Shaka).

ut0mt8 commented 5 years ago

Hum, as this does not seem related to my specific module, I don't know if I can help. The first question is: what did you use as the encoder? And is that part of the log PHP?

darki73 commented 5 years ago

It is just OBS with x264 at 4.5 Mbps and AAC at 160 Kbps as a source, streamed to nginx-rtmp-module; the part with the 'benchmarks' is PHP, right.

ut0mt8 commented 5 years ago

I do not know OBS. But as it uses the standard x264 lib, and you can pass options to it, you just have to find a good option set. For information, this is what I used with ffmpeg to test:

ffmpeg -y -i video.mp4 -c:a copy -c:v libx264 -x264opts 'keyint=50:min-keyint=50:no-scenecut' -b:v 800k -maxrate 800k -bufsize 500k video-fixed.mp4

z411 commented 5 years ago

Could you check if the availabilityStartTime is UTC time and static? Please try my fork and see how it goes: https://github.com/z411/nginx-rtmp-module

darki73 commented 5 years ago

@z411 Quoting Jensen Huang: "It just works". The stream still freezes every 9 seconds (but only for a second or so; I guess I need to play around with keyframe interval and segment length), but it continues to play, and the delay went down from 20-30 seconds to just about 9 seconds behind the original stream.

Even though I had disabled the strict UTC check in Shaka (they implemented it in recent versions of the player because "there are still many streams which do not utilize the UTC clock syncing for DASH, so we've introduced this option to allow these streams to work"), the fix introduced in your repository is working just fine.

@ut0mt8 According to the MPD manifest from DASH-IF, some of the sections are not in their "intended" places; for example, SegmentTemplate should be outside the Representation. Does this affect the "correctness" of the manifest file in any way, or does it simply not matter? Also, in their manifest, contentType and mimeType are on the AdaptationSet itself rather than on the Representation.

ut0mt8 commented 5 years ago

@z411 Did you just revert my patch? The idea of the patch is to follow the RTMP spec, which indicates that all timestamps should be relative to the epoch sent in the initial handshake.

That said, it seems that every encoder uses its own implementation, so what works for me (Elemental and others) doesn't work for everyone.

So I will make it optional and everyone will be happy :)

ut0mt8 commented 5 years ago

@darki73 AFAIK the MPDs produced were compliant with the online MPD validator. And also AFAIK it is perfectly valid to place SegmentTemplate inside the Representation.

darki73 commented 5 years ago

Well, I have no idea what is wrong with the setup I have right now. I tried HLS with Shaka, only to get an error that there is no master playlist; then I tried Video.JS, and it states that the playlist has ended; so I switched back to DASH. Shaka still breaks up every 9 or so seconds but is able to recover; Video.JS, however, takes much longer to recover.

And the problem, from my point of view, is that there are too many playlist requests from the browser. (I know that I run 2 players and that they both must request the playlist; however, I think you will see that even so there are just too many requests.) Why do I think so? Previously, Video.JS was playing the same fragments over and over again until they were removed, then it would hang for about 10 seconds and jump to the next segments that were available. Currently keyframe and fragment are set to 2 seconds and playlist length to 30.

Here is a small video on what is happening right now: Hosted on Vimeo

Current Logic

P.S. Yes, this will be a somewhat commercialized project (a streaming platform for coders), but only if people decide to "Donate" or "Subscribe" to a streamer via the website instead of direct Patreon or StreamAlerts interaction. So mostly Free-2-Stream/Watch. So I am not asking for myself (well, kinda), but for people who are fed up with Twitch "grass growing" streams in the "Science & Technology" category.

ut0mt8 commented 5 years ago

It is very difficult for me to troubleshoot such a setup. Have you tried something more basic? For example, streaming a simple video (from an mp4 file) with ffmpeg to nginx-rtmp? When I have some time I will make this patch optional.

darki73 commented 5 years ago

Just tried (I'm currently on @z411's version of the module) to simply stream from ffmpeg: same issue.

The things I've noticed:

  1. Shaka requires suggestedPresentationDelay to be set in order to calculate manifest availability and delay; however, despite the fact that this parameter exists in the MPD, Shaka falls back to 10 seconds for some reason.
  2. All values in the DASH-IF manifest are integers, while here they are represented as floats (no idea if it affects anything, but still).
  3. Since Shaka fails at the calculation described in point 1, and given that minBufferTime is 2 seconds, according to the documentation we should use 2 * 1.5, which gives a value of 3; and here is the thing: minimumUpdatePeriod in the module's MPD is almost 3 seconds, minimumUpdatePeriod="PT3.098S" to be precise.
  4. For some reason, segments are almost never 2 seconds long; they are between 0.5 and 7 seconds, even though I set them to 2 seconds (dash_fragment 2).

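The delay fallback described above can be expressed as a small rule. This is an assumption based on the discussion (and on Shaka's documented behavior at the time), not a guaranteed Shaka internal; verify against your player version:

```python
def live_presentation_delay(suggested_delay, min_buffer_time):
    """Approximate live-edge delay selection: prefer the MPD's
    suggestedPresentationDelay; otherwise fall back to
    1.5 x minBufferTime (assumption from the thread above)."""
    if suggested_delay is not None:
        return float(suggested_delay)
    return 1.5 * float(min_buffer_time)
```

With minBufferTime of 2 seconds and no usable suggestedPresentationDelay, this lands on 3 seconds, close to the observed minimumUpdatePeriod of PT3.098S.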
ut0mt8 commented 5 years ago

  1. Strange. Perhaps you could ask on the Shaka GitHub.
  2. This is the internal representation of Shaka variables. No problem here.
  3. OK, the bufferTime is wrong; that said, it is more a problem on the Shaka side.
  4. Certainly because you don't have a fixed GOP size in your stream/file. What command line are you using?

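The point about fixed GOP size explains the variable segment durations: the packager can only cut a segment on a keyframe, so without a fixed GOP the cuts drift. A toy illustration (not module code; the cutting rule is a simplification):

```python
def segment_durations(keyframe_times, target=2.0):
    """Cut a new segment at the first keyframe at or after each
    target boundary; durations then depend entirely on where
    keyframes actually landed in the encode."""
    cuts = [keyframe_times[0]]
    for t in keyframe_times[1:]:
        if t - cuts[-1] >= target:
            cuts.append(t)
    return [round(b - a, 3) for a, b in zip(cuts, cuts[1:])]
```

With keyframes exactly every 2 s, every segment is 2.0 s; with scene-cut-driven keyframes at irregular times, durations scatter, matching the 0.5-7 s spread reported above.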
darki73 commented 5 years ago

GOP/keyframe interval is set to 2 seconds, fragment length is set to 2 seconds, playlist length is set to 30 seconds. Also, given that the stream/video is at 60 fps, I tried adding keyint=60 as a parameter; still no luck, still buffering every 9 seconds, and here is the thing: it buffers for exactly the amount of time defined in the update period (I used inotify to watch for changes in the variable and a timer on the webpage to see the current "buffering" time).

And the stream starts just fine: if I start streaming at 12:00:00 (and I know there is a delay in the script to post the stream 10 seconds after start, so the MPD is already generated), at 12:00:20 the stream is rolling on the web page with 10 seconds skipped from the start. This strange 10-second delay (the web stream lags 10 seconds behind the actual stream) persists no matter what parameters I set, but I am fine with a 10-second delay.

ut0mt8 commented 5 years ago

Share your ffmpeg command line.

darki73 commented 5 years ago

ffmpeg -re -i /home/#user#/Videos/Recording_21_03_2019_13_25_12.mp4 -vcodec libx264 -profile:v main -preset:v medium -r 60 -keyint_min 60 -sc_threshold 0 -b:v 4500k -maxrate 4500k -bufsize 4500k -acodec libfdk_aac -b:a 160k -ar 48000 -ac 2 -f flv rtmp://live.local/live/LARAVEL_CHANNEL_SECRET_KEY

darki73 commented 5 years ago

Also tried with -force_key_frames 'expr:gte(t,n_forced*2)'

darki73 commented 5 years ago

And here is input from ffprobe

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'Recording_21_03_2019_13_25_12.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.12.100
  Duration: 00:01:00.22, start: 0.031995, bitrate: 4683 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 2560x1080, 4505 kb/s, 60 fps, 60 tbr, 90k tbn, 120 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 160 kb/s (default)
    Metadata:
      handler_name    : SoundHandler

ut0mt8 commented 5 years ago

Here is what I used for making a video with a fixed GOP:

ffmpeg -y -i video.mp4 -c:a copy -c:v libx264 -x264opts 'keyint=50:min-keyint=50:no-scenecut' -b:v 800k -maxrate 800k -bufsize 500k video-fixed.mp4

So with a 25 fps framerate it produces a GOP every 50 frames, i.e. every 2 s. 800k is an example of a fixed bandwidth.
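
The arithmetic generalizes to any framerate and segment length: the x264 keyint must equal fps times the segment duration so that every segment boundary falls on a keyframe. A one-line illustration:

```python
def keyint_for(fps: int, segment_seconds: float) -> int:
    """x264 keyint needed so every segment boundary lands on a
    keyframe: one IDR frame every fps * segment_seconds frames."""
    return int(fps * segment_seconds)
```

So 25 fps with 2-second segments gives keyint=50, and a 60 fps stream with dash_fragment 2 would need keyint=120 (pair it with no-scenecut / -sc_threshold 0 so x264 does not insert extra keyframes).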

After that you can stream it with:

ffmpeg -re -fflags +genpts -stream_loop -1 -i video-fixed.mp4 -vcodec copy -acodec copy -f flv rtmp://localhost/live/live

darki73 commented 5 years ago

Setting keyint to 120 (for the 60 fps stream) did not help either; same problem. And while your solution might work for VOD (Video On Demand), I can't simply convert a live stream to a "fixed" video =) But thanks for your help. I guess I will have to look at paid media servers to accomplish this task. Don't take me wrong, everything works: I am able to stream and view the stream on the website, but buffering every 10 seconds for 2 to 3 seconds will likely scare away all users and streamers, and right now I don't see any possible solution to that problem.

It might be my machine, it might be the version of Nginx I am using (1.15.9), hell, it could be anything else. I saw many people managing to use this module to live stream from webcams, where nobody else cares about the stream key (it just says cam1), but if I want other people to broadcast, I can't simply share their private key with others (it is the stream name, so it takes 5 seconds to hijack the stream).

It might even be that the "relay" is broken or not fast enough to generate these files. We need some mechanism to let people connect with their stream key and generate a "unique" key for the stream upon successful authentication, instead of "relaying" the stream twice: once when the user connects, and a second time to the other "application" so the new public key can be used. And this is probably where the problem lies. I don't want to sound rude, but I just can't "roll back" all the changes I've made to the system so far only to be "90%" disappointed when they don't work again.

I KNOW that this is an open source project, and that "requesting" features is the last thing one should do, and I don't; I am just trying to figure out what is wrong. I really appreciate the work you've done on the original module and the functionality you've brought in. Keep it up; apart from some edge cases, this is an amazing module and an amazing fork to work with!

darki73 commented 5 years ago

Just another word from me: I am pretty sure that this is 90% an issue with relaying the stream twice.

Here is the current logic (which is also the logic described in the main module repo):

  1. Allow access with a stream key like: live_1_fnmpoerwfnwq3rfj82314fgq3249ejigf
  2. Check the key against an array of keys (I do it against a database) to see whether it is valid
  3. This array also holds the new "public" key
  4. In PHP, just send "Location: <new public key>" with a 301 response
  5. Push the renamed stream to a new "application", something like "hls-live"

2 relays, 2 possible sources of problems; but again, it looks like it is my system, since this information is publicly available and seems to work for most people.

P.S. I just don't understand why it works on an RPi with a "calculator" CPU and "writings on paper" storage (because it is that slow), and won't work on an i9 with NVMe.

darki73 commented 5 years ago

Just another update: confirmed, the issue lies within the double relay.

darki73 commented 5 years ago

So, I've narrowed down the issue to the following piece of code (omitting the dash-live configuration, since it is just the basic dash on / live on):

    application live2 {
        live on;

        # No RTMP playback
        deny play all;

        # Here is the issue, and why the stream breaks up every 9 seconds:
        # without this line everything works just fine (assuming the dash
        # configuration is moved into this application)
        push rtmp://127.0.0.1/dash-live;
        push_reconnect 1s;

        # Events
        on_publish http://app.local/api/stream/start;
        on_publish_done http://app.local/api/stream/stop;
        on_update http://app.local/api/stream/update;
    }

P.S. And here is the full configuration (application live2 was the main one in my testing; live is newly created and it exposes the public key):

server {
    listen 1935;
    ping 30s;
    notify_method get;
    notify_update_timeout 10s;
    chunk_size 4096;

    application live2 {
        live on;

        # No RTMP playback
        deny play all;

        push rtmp://127.0.0.1/dash-live;
#       push rtmp://127.0.0.1/hls-live;
        push_reconnect 1s;

        # Events
        on_publish http://app.local/api/stream/start;
        on_publish_done http://app.local/api/stream/stop;
        on_update http://app.local/api/stream/update;
    }

    application live {
        live on;
        dash on;
        dash_repetition on;
        dash_fragment 2;
        dash_playlist_length 60;
        dash_clock_compensation http_head;
        dash_clock_helper_uri http://live.local/time;
        dash_path /tmp/dash/test;
        on_publish http://app.local/api/stream/start;
        on_publish_done http://app.local/api/stream/stop;
        on_update http://app.local/api/stream/update;
    }

    application dash-live {
        deny play all;
        allow publish 127.0.0.1;
        deny publish all;

        live on;
        dash on;
        dash_nested on; 
        dash_repetition on;
        dash_fragment 2;
        dash_playlist_length 60;
        dash_cleanup on;
        dash_clock_compensation http_head;
        dash_clock_helper_uri http://live.local/time;
        dash_path /tmp/dash/stream-dash;
    }

    application hls-live {
        deny play all;
        allow publish 127.0.0.1;
        deny publish all;

        live on;
        hls on;
        hls_nested on;
        hls_fragment 2;
        hls_playlist_length 60;
        hls_cleanup on;
        hls_path /tmp/hls/stream-hls;
    }
}

ut0mt8 commented 5 years ago

Wow, I don't really understand this setup. What is the main ingest point? And why use live2 to re-push the stream?

darki73 commented 5 years ago

The main ingest point is live (which was renamed to live2 for testing without relaying the stream).

The logic (to hide the private key) uses the current configuration:

server {
    listen 1935;
    ping 30s;
    notify_method get;
    notify_update_timeout 10s;
    chunk_size 4096;

    application live {
        live on;
        deny play all;

        push rtmp://127.0.0.1:1935/dash-live;

        on_publish http://app.local/api/stream/start;
        on_publish_done http://app.local/api/stream/stop;
        on_update http://app.local/api/stream/update;
    }

    application dash-live {
        live on;
        dash on; 
        dash_repetition on;
        dash_fragment 2;
        dash_playlist_length 60;
        dash_cleanup on;
        dash_clock_compensation http_head;
        dash_clock_helper_uri http://live.local/time;
        dash_path /tmp/dash/stream-dash;
    }
}

ut0mt8 commented 5 years ago

Does it work without relaying, i.e. pushing directly to dash-live?

darki73 commented 5 years ago

Yep, no delays, nothing (well, the 10-second delay is there, but that is fine). The stream does not buffer when pushed directly to dash-live, but in that case the stream key is public and anyone can hijack the stream.

ut0mt8 commented 5 years ago

Mhmh, OK; that said, this is a good starting point to have something working ;)

darki73 commented 5 years ago

I can share the PHP code which processes the events from RTMP if needed. I initially thought that the PHP part was causing these delays (because notify_update_timeout is set to 10 seconds); however, this is not the case.

P.S. Just recompiled Nginx with RTMP as a static module. It now pushes 2 streams, to live and dash-live; however, despite the fact that files for both streams are generated (for the public and the private key), the relayed stream is no longer visible in the /stat route.

ut0mt8 commented 5 years ago

Ah, I don't know if DSO (dynamic module loading) works with this module, so yes, please compile it into nginx. And clearly it is not battle tested against multiple RTMP connections. Your setup should work, but... the problem is I don't know this part of the code. My fork (and my knowledge) was only related to DASH in a "simple" setup.

This fork was only battle tested on one use case (but a big one: this code was used to stream the FIFA World Cup to more than 1 million parallel users for some matches by the leading French broadcaster).

Perhaps you will have more luck with the initial author.

darki73 commented 5 years ago

I don't think so, to be honest =) The last activity in that repo was about 2 years ago, with 799 issues and no response. The issue is clearly not related to DASH, but to the core implementation of this module. It seems I will have to find another media server, since I have very little knowledge of C and none of RTMP and how it works, and it would be unfair to ask you to waste your own time on this. Anyway, thank you for your help and for trying to solve this issue!

darki73 commented 5 years ago

@ut0mt8 Just one more message from me =D

I have the following config:

    application live {
        deny play all;      

        live on;
        dash on;

        on_publish http://app.local/api/stream/start;
        on_publish_done http://app.local/api/stream/stop;
        on_update http://app.local/api/stream/update;

        dash_repetition on;
        dash_fragment 5s;
        dash_playlist_length 60s;
        dash_cleanup on;
        dash_nested on;
        dash_clock_compensation http_head;
        dash_clock_helper_uri http://live.local/time;
        dash_path /tmp/dash/stream-dash;
    }

The thing is, the stream shown on the /stat page actually has the new public key; however, the fragments and the manifest are still written to the folder named after the private key.

So the question is: where exactly is the folder-creation logic implemented? Why does the stats page show the stream with the public key while data is still written to the folder named after the private key?

Screenshot from 2019-03-27 17-21-51 Untitled

Because if we could take the redirected name of the stream into account and write all data to that folder, there would be no need to relay the stream to a different application, and hence none of the issues associated with players freezing (I saw many issues in the main repo about freezing). I've actually managed to rename the stream as I wanted; the only problem is the directory where the files are saved.

P.S. And here is the logic of the new public key generation, written in PHP: Screenshot from 2019-03-27 17-27-54

  1. Extract the application and name from the request
  2. Check that the request actually has these values (expecting two, so count them)
  3. Pluck $key (name) and $app (application) from the resulting array
  4. Cache the channel information (check whether a channel with this streaming key exists; if it does, cache it in Redis)
  5. Check whether the model is null; if it is, an incorrect stream key was provided
  6. Fetch the User associated with the Channel
  7. Check whether the User is banned from streaming
  8. Create a new Stream entry (with all data associated with the RTMP stream and Live status)
  9. Generate a cache key from the model (to eliminate further requests to the database)
  10. Update the stream cache with default information (everything set to null)
  11. Send out an event to the browser that the stream has started
  12. Return a 301 response, setting Location to the SHA-512 hash of the Stream UUID
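
Step 12 can be sketched like this (a hypothetical Python helper; the real code is the PHP/Laravel app shown in the screenshot):

```python
# Derive the public stream key as the SHA-512 hash of the Stream
# UUID, per step 12 above. The function name is illustrative.
import hashlib
import uuid

def public_stream_key(stream_uuid: uuid.UUID) -> str:
    """SHA-512 of the stream's UUID string: opaque, fixed-length,
    and not reversible back to the private key."""
    return hashlib.sha512(str(stream_uuid).encode()).hexdigest()
```

The result is a deterministic 128-character hex string, which is what the stats page reports as the stream name.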

The key resulting from step 12 is the new public key, and as you can see, the stats page actually returns this exact key; however, data is still written to the key obtained in step 3.