Open muratugureminoglu opened 2 days ago
Adding some more details so folks can identify whether they care about this issue...
summary: You are trying to operate AMS in a two-tier cluster (i.e. origin transcoders and edge streamers) while also using the "Auto Start/Stop Streaming" feature on all your streams.

context: Your origin transcoders are big, powerful, GPU-laden hosts. Your edge streamers are much smaller. You don't want the edge servers performing transcoding; they aren't powered for that. You also don't want to be pulling RTSP streams into the origin servers 24/7.

problem: With no viewers/subscribers, there is no transcoding going on at the origins (you are using Start/Stop Streaming). When a browser viewer or "subscriber" requests a stream from any of the edge servers, that edge server becomes the origin server and starts trying to transcode the (in my case RTSP) stream into all the renditions you configured. If your edge server isn't spec'd for transcoding (which it shouldn't be, since the whole point of an edge node is to be smaller), you end up with an edge node that is overwhelmed and prone to failure.

Specifically, in a Docker context, my AMS 1.9.0 edge node (4 CPUs, 16 GB RAM) was showing a load average of 5-6 and CPU usage of 60-70% while trying to pull and transcode two 1080p streams into 3 renditions each. After a couple of minutes in that state you see stack traces of deadlocks, and eventually the Docker container dies. A non-Docker configuration gives a very similar result.
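To make the problem concrete, here is a toy Python model of the behavior as I observe it. It is not AMS code: the `Node` class, the `handle_play_request` function, the node and stream names, and the origin's CPU count are all made up for illustration. The only thing it encodes is the rule described above, namely that a stopped stream gets started on whichever node receives the play request.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    role: str   # "origin" or "edge", as intended by the operator
    cpus: int   # hypothetical sizing, for illustration only
    transcoding: set = field(default_factory=set)  # streams this node pulls and transcodes


def handle_play_request(stream_id, receiving_node, cluster):
    """Toy model of the observed behavior, not the actual AMS logic."""
    # If some node already pulls and transcodes the stream, it stays the owner.
    for node in cluster:
        if stream_id in node.transcoding:
            return node
    # Otherwise the stream is stopped (auto start/stop), so the node that
    # received the request starts it, regardless of its intended role.
    receiving_node.transcoding.add(stream_id)
    return receiving_node


origin = Node("origin-1", "origin", cpus=32)
edge = Node("edge-1", "edge", cpus=4)
cluster = [origin, edge]

# Two viewers hit the edge node for two different stopped streams.
for stream in ("cam-1", "cam-2"):
    owner = handle_play_request(stream, edge, cluster)
    print(f"{stream} is transcoded on {owner.name} ({owner.role})")
```

In this model both streams end up on the small edge node and the origin stays idle, which matches what I see in practice.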
I'll use the diagram from https://antmedia.io/docs/guides/clustering-and-scaling/manual-configuration/cluster-installation/
The way I'd like it to behave...