ossrs / srs

SRS is a simple, high-efficiency, real-time media server supporting RTMP, WebRTC, HLS, HTTP-FLV, HTTP-TS, SRT, MPEG-DASH, and GB28181.
https://ossrs.io
MIT License

When utilizing SRS Edge to forward traffic to the origin cluster, the traffic distribution is highly uneven. #4187

Closed nighttidesy closed 1 month ago

nighttidesy commented 1 month ago

Describe the bug

With an SRS Edge cluster in front of an SRS Origin cluster, most of the streams pushed through the SRS Edge are forwarded to the first node of the Origin cluster. The other origin nodes receive far less traffic, so the load is unevenly distributed.

Version srs4

To Reproduce

  1. SRS Edge configuration

    [root@k8s-ops srs-cluster]# cat srs-edge.yaml
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: srs-edge-config
    data:
      srs.conf: |-
        listen              1935;
        max_connections     1000;
        daemon              off;
        # Enable smooth exit feature.
        grace_start_wait    700;
        grace_final_wait    800;
        force_grace_quit    on;
        http_api {
            enabled         on;
            listen          1985;
        }
        http_server {
            enabled         on;
            listen          8080;
        }
        vhost __defaultVhost__ {
            cluster {
                mode            remote;
                origin          srs-origin-0.socs srs-origin-1.socs srs-origin-2.socs;
                token_traverse  on;
            }
            http_remux {
                enabled     on;
                mount       [vhost]/[app]/[stream].flv;
                hstrs       on;
            }
        }

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: srs-edge-deploy
      labels:
        app: srs-edge
    spec:
      replicas: 3
      revisionHistoryLimit: 10
      selector:
        matchLabels:
          app: srs-edge
      template:
        metadata:
          annotations:
            version/config: "20231228201317"
          labels:
            app: srs-edge
        spec:
          volumes:

    apiVersion: v1
    kind: Service
    metadata:
      name: srs-edge-service
    spec:
      type: NodePort
      selector:
        app: srs-edge
      ports:

SRS Origin configuration

    [root@k8s-ops srs-cluster]# cat srs-origin.yaml
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: srs-origin-config
    data:
      srs.conf: |-
        listen              1935 {
            idle_timeout    600;
        }
        max_connections     1000;
        daemon              off;
        srs_log_tank        console;
        srs_log_file        ./objs/srs.log;
        http_api {
            enabled         on;
            listen          1985;
        }
        http_server {
            enabled         on;
            listen          8080;
            dir             ./objs/nginx/html;
        }
        vhost __defaultVhost__ {
            cluster {
                origin_cluster  on;
                coworkers       srs-origin-0.socs:1985 srs-origin-1.socs:1985 srs-origin-2.socs:1985;
            }
            play {
                time_jitter     full;
                mix_correct     on;
            }
            # Configure HTTP-FLV settings.
            http_remux {
                enabled     on;
                mount       [vhost]/[app]/[stream].flv;
                hstrs       on;
            }
            hls {
                enabled     off;
            }
            # Configure DVR settings.
            dvr {
                enabled             on;
                dvr_path            /app/nfs/livevideo/srs/[app]/[stream][timestamp].flv;
                dvr_plan            segment;
                # Split video files by time.
                dvr_duration        600;
                dvr_wait_keyframe   on;
            }
            # HTTP callback.
            http_hooks {
                enabled     on;
                # AVS management side token verification interface.
                on_publish  http://avs-admin:9102/v1/token/check;
                # When the client starts streaming, the AVS management side token verification interface.
                on_play     http://avs-admin:9102/v1/token/check;
            }
        }

    apiVersion: v1
    kind: Service
    metadata:
      name: socs
    spec:
      clusterIP: None
      selector:
        app: srs-origin
      ports:

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: srs-origin
      labels:
        app: srs-origin
    spec:
      serviceName: "socs"
      replicas: 3
      selector:
        matchLabels:
          app: srs-origin
      template:
        metadata:
          labels:
            app: srs-origin
        spec:
          volumes:


SRS currently does not support cluster-level APIs. To retrieve streaming live broadcast information from all origin servers, it is necessary to iteratively call the API of each origin server and aggregate the data.

    apiVersion: v1
    kind: Service
    metadata:
      name: srs-api-service-0
    spec:
      type: NodePort
      selector:
        app: srs-origin
        statefulset.kubernetes.io/pod-name: srs-origin-0
      ports:

    apiVersion: v1
    kind: Service
    metadata:
      name: srs-api-service-1
    spec:
      type: NodePort
      selector:
        app: srs-origin
        statefulset.kubernetes.io/pod-name: srs-origin-1
      ports:

    apiVersion: v1
    kind: Service
    metadata:
      name: srs-api-service-2
    spec:
      type: NodePort
      selector:
        app: srs-origin
        statefulset.kubernetes.io/pod-name: srs-origin-2
      ports:
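Because there is no cluster-level API, stream counts have to be gathered by querying each origin separately. A minimal sketch of that loop follows. The base URLs are hypothetical placeholders (substitute the addresses exposed by your own per-pod services), and the `count` query parameter is assumed to raise the default page size of the `/api/v1/streams/` endpoint.

```python
import json
from typing import Callable, Dict, List
from urllib.request import urlopen

# Hypothetical API base URLs; replace with the addresses of your per-pod services.
ORIGIN_APIS = [
    "http://srs-origin-0.socs:1985",
    "http://srs-origin-1.socs:1985",
    "http://srs-origin-2.socs:1985",
]


def fetch_streams(base_url: str) -> List[dict]:
    """Return the stream list reported by one origin's HTTP API."""
    # Assumption: `count` lifts the default page size so all streams are listed.
    url = f"{base_url}/api/v1/streams/?count=1000"
    with urlopen(url, timeout=5) as resp:
        return json.load(resp).get("streams", [])


def streams_per_origin(
    origins: List[str],
    fetch: Callable[[str], List[dict]] = fetch_streams,
) -> Dict[str, int]:
    """Call each origin in turn and report how many streams it holds."""
    return {origin: len(fetch(origin)) for origin in origins}


if __name__ == "__main__":
    for origin, n in streams_per_origin(ORIGIN_APIS).items():
        print(f"{origin}: {n} streams")
```

The `fetch` argument is injectable so the aggregation can be checked without a running cluster.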

  2. Use a stress-testing tool to push 30 streams. Querying the origin API shows that all streams are on the first origin node, srs-origin-0.socs.

Could you please explain the reason for this? Is it because the edge does not support round-robin forwarding of traffic to the origin by design?
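To make the question concrete, here is a small sketch contrasting two hypothetical selection policies for 30 new streams. It does not reproduce SRS's actual code, and both policy functions are illustrative names only. It shows that an "always try the first origin" policy yields exactly the distribution observed, while a round-robin counter shared across streams would spread them evenly.

```python
from collections import Counter
from typing import Callable, List

ORIGINS = ["srs-origin-0.socs", "srs-origin-1.socs", "srs-origin-2.socs"]


def pick_first(origins: List[str], stream_index: int) -> str:
    """Failover-style policy (hypothetical): every new stream starts at the first origin."""
    return origins[0]


def pick_round_robin(origins: List[str], stream_index: int) -> str:
    """Shared round-robin (hypothetical): the n-th new stream goes to origin n mod N."""
    return origins[stream_index % len(origins)]


def distribute(policy: Callable[[List[str], int], str],
               origins: List[str], streams: int = 30) -> Counter:
    """Count how many of `streams` new streams land on each origin under `policy`."""
    return Counter(policy(origins, i) for i in range(streams))


if __name__ == "__main__":
    print("first-only: ", dict(distribute(pick_first, ORIGINS)))
    print("round-robin:", dict(distribute(pick_round_robin, ORIGINS)))
```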


nighttidesy commented 1 month ago

@winlinvip, is this behavior intentional? If the traffic is not balanced, a single origin server can easily reach its performance limit. Also, could I put a load balancer in front of the origin cluster, let it distribute traffic across the origin nodes, and configure the edge cluster to forward traffic to the load balancer's address?
