Amazing!

Love that you used FLAME for this. With `:syn` the pipelines can resume from another FLAME runner/node, right?

Can we keep the source rendition in the manifest as well?
Yes, the pipeline should resume even under FLAME. The RTMP stream connects to an already running application server, which then launches the Pipeline on a FLAME node and forwards audio and video messages via RPC. This means that the RTMP stream will be connected to a public-facing application server while the pipeline and transcoding happen on a non-public-facing FLAME server (with the Fly backend). If the streamer reconnects to another application server, then `:syn` will find the FLAME node running their pipeline and the stream of RPC messages starts again. I still haven't actually tried this with FLAME's Fly backend, only with multiple nodes running locally; I am going to try to get it running on Fly this week, and I also want to try out the hardware transcoding.
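Roughly, the resume path looks something like the sketch below (module names, the `:pipelines` scope and the pool name are placeholders; only the `:syn` and FLAME calls are real library APIs):

```elixir
defmodule Algora.Pipeline.Registry do
  # Sketch only: assumes the :pipelines scope was added via
  # :syn.add_node_to_scopes/1 on every node.
  @scope :pipelines

  # The pipeline registers itself under its stream key once it starts on
  # the FLAME node.
  def register(stream_key), do: :syn.register(@scope, stream_key, self())

  # The app server that accepts a (re)connected RTMP stream either finds
  # the already-running pipeline and resumes forwarding RPC messages to
  # it, or boots a new pipeline inside the FLAME pool.
  def find_or_start(stream_key) do
    case :syn.lookup(@scope, stream_key) do
      {pid, _meta} when is_pid(pid) ->
        {:ok, pid}

      :undefined ->
        FLAME.place_child(
          Algora.Pipeline.Pool,
          {Algora.Pipeline, stream_key: stream_key}
        )
    end
  end
end
```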
As for including the source in the manifest, absolutely! I will add an extra configuration option to allow the un-transcoded source rendition to be included in the manifest. It is currently only included if you disable all transcoding by unsetting the `TRANSCODE` environment variable.
My mornings are much more open this week (I'm UTC-5); I will set up a meeting and can walk you through everything.
> This means that the RTMP stream will be connected to a public-facing application server while the pipeline and transcoding happen on a non-public-facing FLAME server (with the Fly backend). If the streamer reconnects to another application server, then `:syn` will find the FLAME node running their pipeline and the stream of RPC messages starts again.
That's perfect, can't wait to try this out! Thanks for all the changes :heart_hands:
Should be good to merge once it's tested on prod. Currently I'm getting the error below; not sure where the `membrane_h26x_plugin` dependency is coming from:
```
 => CACHED [builder 14/17] RUN mix compile                                 0.0s
 => CACHED [builder 15/17] COPY config/runtime.exs config/                 0.0s
 => CACHED [builder 16/17] COPY rel rel                                    0.0s
 => ERROR [builder 17/17] RUN mix release                                  1.9s
------
 > [builder 17/17] RUN mix release:
1.177 * assembling algora-0.1.0 on MIX_ENV=prod
1.177 * using config/runtime.exs to configure the release at runtime
1.881 ** (Mix) Duplicated modules:
1.881   'Elixir.Membrane.H265.Parser' specified in membrane_h265_plugin and membrane_h26x_plugin
```
using a modified Dockerfile with

```dockerfile
ARG BUILDER_IMAGE="hexpm/elixir:1.17.3-erlang-26.2.5.5-debian-bookworm-20241016-slim"
ARG RUNNER_IMAGE="debian:bookworm-20241016-slim"
```
I've removed `membrane_h265_plugin` in favor of the newer `membrane_h26x_plugin`. That should fix the problem. I've also removed my fork of `membrane_rtmp_plugin` because my changes got merged upstream!
I've pushed cdc836c, which changes how low-latency HLS partials are served. Currently partial segments are served from the application server; with this change, partial segments are uploaded to Tigris and deleted when they are no longer needed. When a partial segment is ready, including after waiting on an `X-PRELOAD-HINT`, a 302 redirect to the Tigris bucket is served.
Here is a screencast of the 302 redirect in action
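For reference, the redirect path is roughly the sketch below; `AlgoraWeb.HLSController`, `LLController.await_partial/2` and the bucket URL helper are stand-ins for however this is actually wired up:

```elixir
defmodule AlgoraWeb.HLSController do
  use AlgoraWeb, :controller

  def partial(conn, %{"stream" => stream, "segment" => segment}) do
    # Block until the pipeline reports the partial as ready; this covers
    # requests that arrived early via an X-PRELOAD-HINT.
    :ok = Algora.LLController.await_partial(stream, segment)

    # The bytes now live in the Tigris bucket, so answer with a 302
    # instead of proxying the segment through the app server.
    redirect(conn, external: tigris_url(stream, segment))
  end

  # Placeholder for however the public bucket URL is actually built.
  defp tigris_url(stream, segment),
    do: "https://#{Application.fetch_env!(:algora, :bucket_host)}/#{stream}/#{segment}"
end
```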
Just deployed on staging, works great overall!

I think there's a regression with thumbnail generation, can we fix that?

Btw, is there anything wrong with triggering `toggle_streamer_live` on `:end_of_stream`, `:resume_rtmp` and `init`, instead of only on `:terminate`?
> When a partial segment is ready, including after waiting on an `X-PRELOAD-HINT`, a 302 redirect to the Tigris bucket is served.

That's awesome! Now that we are serving partials from Tigris, can we eliminate the `LLController.broadcast!` calls to save bandwidth?
At the moment we spawn a new `LLController` instance per stream per node and they all cache partials independently, but I think we should be able to get away with a single `LLController` in the same node as the pipeline that blocks playlist requests and redirects clients to Tigris.
How about we create a new branch for the LL-HLS stuff and I'll go ahead and merge this?
> Just deployed on staging, works great overall!

That's great!

> I think there's a regression with thumbnail generation, can we fix that?

I will look into this. Most likely related to the latest commit cdc836c.
> Btw, is there anything wrong with triggering `toggle_streamer_live` on `:end_of_stream`, `:resume_rtmp` and `init`, instead of only on `:terminate`?
The only issue is that when `toggle_streamer_live(false)` is called, the manifest URL gets changed to Tigris. That part could be split into another function that gets called on `:terminate`.
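As a sketch of that split (module, function and helper names below are illustrative, not the actual API):

```elixir
defmodule Algora.LibrarySketch do
  # Flipping the live flag is cheap, so it is safe to call on
  # :end_of_stream, :resume_rtmp and init.
  def toggle_streamer_live(video, live?) do
    update_video(video, is_live: live?)
  end

  # The Tigris repointing moves into its own function that only runs on
  # :terminate, once the stream is really over.
  def finalize_manifest(video) do
    update_video(video, url: tigris_manifest_url(video))
  end

  # Placeholders standing in for the real persistence/URL helpers.
  defp update_video(video, attrs), do: Map.merge(video, Map.new(attrs))
  defp tigris_manifest_url(video), do: "https://<bucket>/#{video.uuid}/index.m3u8"
end
```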
>> When a partial segment is ready, including after waiting on an `X-PRELOAD-HINT`, a 302 redirect to the Tigris bucket is served.
>
> That's awesome! Now that we are serving partials from Tigris, can we eliminate the `LLController.broadcast!` calls to save bandwidth?
We still need to continue broadcasting to every instance, but we no longer broadcast any video or audio over the cluster. I am sure it can be cleaned up, but now it only sends and stores the `:ready` atom in ETS.
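Roughly, the handler now does something like the sketch below (illustrative only; the message shape and table layout are stand-ins):

```elixir
defmodule Algora.LLControllerSketch do
  use GenServer

  def init(_opts) do
    table = :ets.new(:partials, [:set, :public])
    {:ok, %{table: table}}
  end

  # When the pipeline announces a partial, each node just records a :ready
  # marker in its local ETS table so blocked playlist requests can
  # proceed; no media bytes cross the cluster.
  def handle_info({:partial_ready, stream, segment}, state) do
    true = :ets.insert(state.table, {{stream, segment}, :ready})
    {:noreply, state}
  end
end
```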
> At the moment we spawn a new `LLController` instance per stream per node and they all cache partials independently, but I think we should be able to get away with a single `LLController` in the same node as the pipeline that blocks playlist requests and redirects clients to Tigris.
I think we will still want an `LLController` per app server, because web clients waiting for preload hint messages would cause a lot of inter-cluster messages. We could try using https://github.com/discord/manifold, but we would still need a way to distribute and cache manifests on each node.
> How about we create a new branch for the LL-HLS stuff and I'll go ahead and merge this?
Sounds good! I'll drop cdc836c and push it to a new branch.
> The only issue is that when `toggle_streamer_live(false)` is called, the manifest URL gets changed to Tigris. That part could be split into another function that gets called on `:terminate`.
Gotcha, yeah that makes sense
> We still need to continue broadcasting to every instance, but we no longer broadcast any video or audio over the cluster. I am sure it can be cleaned up, but now it only sends and stores the `:ready` atom in ETS.
Oh, I hadn't noticed, that's perfect!
> I think we will still want an `LLController` per app server, because web clients waiting for preload hint messages would cause a lot of inter-cluster messages. We could try using https://github.com/discord/manifold, but we would still need a way to distribute and cache manifests on each node.
Agreed, let's keep it that way
>> I think there's a regression with thumbnail generation, can we fix that?
>
> I will look into this. Most likely related to the latest commit cdc836c.
Looks like the pattern never matches because `segment_sn` is obsolete; we need to match on `sequences: %{}`.
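In other words, something along these lines (a hedged sketch; the surrounding handler and struct layout are assumptions based on this comment, not the exact upstream API):

```elixir
defmodule Algora.ThumbnailerSketch do
  # Old clause: never matches anymore, :segment_sn is gone.
  # def handle_segment(%{segment_sn: sn} = segment), do: maybe_thumbnail(segment, sn)

  # New clause: the sequence numbers now live under the :sequences map.
  def handle_segment(%{sequences: %{} = sequences} = segment),
    do: maybe_thumbnail(segment, sequences)

  # Placeholder for the actual thumbnail generation call.
  defp maybe_thumbnail(_segment, _sequences), do: :ok
end
```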
Hello! This PR includes all of #113 and adds adaptive bitrate transcoding to the live streaming pipeline. The wonderful team over at Membrane have merged the changes I needed to the RTMP plugin and made this PR possible. Although not everything in this PR has been tested, anything untested is only available behind a feature flag. I have added issue #115 to track these features as they are tested.
Three transcoding backends are available: Software (FFmpeg), Nvidia, and Xilinx. I have only tested the Software backend. Transcoding in the pipeline can be configured with the `TRANSCODING` environment variable. The format is `<height>p<framerate>@<bitrate>`, separated by a pipe (`|`). For example: `1440p30@4000000|720p30@2000000`.
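For illustration, a parser for that format could look like the sketch below (the module name and returned map shape are not from this PR):

```elixir
defmodule Algora.TranscodingSketch do
  # Parses a TRANSCODING value such as "1440p30@4000000|720p30@2000000"
  # into a list of rendition specs.
  def parse(nil), do: []

  def parse(value) when is_binary(value) do
    value
    |> String.split("|", trim: true)
    |> Enum.map(fn spec ->
      [_full, height, framerate, bitrate] = Regex.run(~r/^(\d+)p(\d+)@(\d+)$/, spec)

      %{
        height: String.to_integer(height),
        framerate: String.to_integer(framerate),
        bitrate: String.to_integer(bitrate)
      }
    end)
  end
end

# Example:
# Algora.TranscodingSketch.parse(System.get_env("TRANSCODING"))
# => [%{height: 1440, framerate: 30, bitrate: 4_000_000},
#     %{height: 720, framerate: 30, bitrate: 2_000_000}]
```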
To avoid overloading the application servers I have set up the pipelines to run inside FLAME, which will in theory use Fly to boot a new server and start the pipeline there. I have only tested the local backend.
H265 video is supported, but unfortunately Vidstack offers both H264 and H265 tracks to the viewer if H265 is supported by their browser. H265 can be disabled with `SUPPORTS_H265=false`.

The following new environment variables have been added:
The following configuration should allow the pipeline to operate how it does today, except with re-connectable RTMP: