ant-media / Ant-Media-Server

Ant Media Server is a live streaming engine software that provides adaptive, ultra low latency streaming by using WebRTC technology with ~0.5 seconds latency. Ant Media Server is auto-scalable and it can run on-premise or on-cloud.
https://antmedia.io

Proposing a new method to calculate the HLS viewer count #3738

Open netaviator opened 2 years ago

netaviator commented 2 years ago

Is your feature request related to a problem? Please describe. Currently, cookies are used to identify unique users watching live streams in AntMedia. As the operator of a live-streaming cluster used by many different clients, it is no longer feasible for us to create a subdomain and proxy configuration for each of the different domains our clients use for every new event they come up with. Additionally, with the current security standards of Chrome, Firefox and, most importantly, Safari, it is no longer possible to set and use cookies while fetching data from a third-party domain. This results in a wrong viewer count being shown in the AntMedia back end and exposed via the API.

Describe the solution you'd like Therefore, I'd like to propose some ideas for a new way to count viewers of the web-based live-streaming protocols. I have never actually checked whether the same issue occurs with WebRTC as the underlying protocol, as we only use HLS with our customers because latency is not key.

The idea is to do the tracking in a similar way to the German c3voc, the broadcast team that streams most CCC events in Germany. They use the log files of their web servers to count the requests to the specific playlist files (separated by stream ID and quality, source: https://github.com/voc/cm/blob/1ff191787eea12260c04b7aaf0527e01e35adf3a/ansible/roles/relay/templates/collectd/plugins/NginxHls.py) and use the source IP of the requests as the attribute for counting unique viewers. Additionally, they use a timeout calculated as "segment-count x segment-length x 2" to determine when a viewer is considered to no longer be actively watching the stream. If the source IP alone is not enough as the primary attribute for the viewer count, further HTTP request fields could be added as additional attributes to identify the user (but I don't think this is really needed).

With this counting method, the tracking/session cookie would no longer be needed, and it would be much easier to use AntMedia in a multi-domain environment while getting a more precise viewer count for the web-based live-streaming protocols.
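For illustration only, the counting scheme described here could be sketched roughly as follows. This is not the c3voc plugin and not AntMedia code: the class name `HlsViewerCounter`, its methods, and the example values are all hypothetical. It only shows the idea of remembering the last playlist request per viewer key and expiring viewers after "segment-count x segment-length x 2" seconds.

```javascript
// Hypothetical sketch: count unique HLS viewers from playlist requests.
// None of these names come from AntMedia or c3voc.
class HlsViewerCounter {
  constructor(segmentCount, segmentLengthSec) {
    // Timeout from the proposal: segment-count x segment-length x 2.
    this.timeoutMs = segmentCount * segmentLengthSec * 2 * 1000;
    // streamId -> Map(viewerKey -> time of last playlist request in ms)
    this.lastSeen = new Map();
  }

  // Call once per playlist (.m3u8) request. viewerKey is the source IP here.
  record(streamId, viewerKey, nowMs) {
    if (!this.lastSeen.has(streamId)) {
      this.lastSeen.set(streamId, new Map());
    }
    this.lastSeen.get(streamId).set(viewerKey, nowMs);
  }

  // Drop viewers whose last request is older than the timeout, count the rest.
  count(streamId, nowMs) {
    const viewers = this.lastSeen.get(streamId);
    if (!viewers) return 0;
    for (const [key, seenAt] of viewers) {
      if (nowMs - seenAt > this.timeoutMs) viewers.delete(key);
    }
    return viewers.size;
  }
}

// Example: 5 segments of 2 s each give a 20 s timeout.
const counter = new HlsViewerCounter(5, 2);
counter.record("stream1", "203.0.113.7", 0);
counter.record("stream1", "2001:db8::1", 5000);
counter.record("stream1", "203.0.113.7", 10000); // same viewer polling again
console.log(counter.count("stream1", 12000)); // both viewers are still active
console.log(counter.count("stream1", 28000)); // the viewer last seen at 5 s has expired
```

Keying on the source IP alone undercounts viewers who share an address, so the key could just as well be a combination of several request fields.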

Describe alternatives you've considered None.

Additional context Since most bigger setups include a load balancer or CDN to distribute traffic to the streaming nodes, the real source IP could be determined from the various HTTP headers offered by the CDNs, such as "CF-Connecting-IP" (https://support.cloudflare.com/hc/en-us/articles/200170986), "CloudFront-Viewer-Address" (https://aws.amazon.com/about-aws/whats-new/2021/10/amazon-cloudfront-client-ip-address-connection-port-header/), or "X-Forwarded-For" in the case of nginx and HAProxy.
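A minimal sketch of that header lookup, under a few stated assumptions: the function name `resolveClientIp` is made up, header names are assumed to arrive lower-cased (as Node's http module delivers them), "CloudFront-Viewer-Address" is assumed to have the form "ip:port", and the first entry of "X-Forwarded-For" is taken as the client. These headers can be forged by clients, so a real setup would only trust them on requests that actually come from the CDN or load balancer.

```javascript
// Hypothetical helper: pick the viewer's IP from CDN / proxy headers.
// `headers` is a plain object with lower-cased header names.
function resolveClientIp(headers, socketAddress) {
  const cloudflare = headers["cf-connecting-ip"];
  if (cloudflare) return cloudflare.trim();

  const cloudFront = headers["cloudfront-viewer-address"];
  if (cloudFront) {
    // Assumed format "ip:port"; cut at the LAST colon so IPv6 addresses survive.
    const idx = cloudFront.lastIndexOf(":");
    return idx > 0 ? cloudFront.slice(0, idx) : cloudFront.trim();
  }

  const forwardedFor = headers["x-forwarded-for"];
  if (forwardedFor) {
    // Comma-separated list; the left-most entry is the original client.
    return forwardedFor.split(",")[0].trim();
  }

  // No proxy header present: fall back to the TCP peer address.
  return socketAddress;
}

console.log(resolveClientIp({ "cf-connecting-ip": "203.0.113.7" }, "10.0.0.1"));
console.log(resolveClientIp({ "cloudfront-viewer-address": "198.51.100.10:46532" }, "10.0.0.1"));
console.log(resolveClientIp({ "x-forwarded-for": "203.0.113.7, 10.0.0.2" }, "10.0.0.1"));
console.log(resolveClientIp({}, "10.0.0.1"));
```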

mekya commented 2 years ago

Hi @MrXermon ,

Thank you for the issue and your idea. I really appreciate that.

I've two questions for you.

netaviator commented 2 years ago

Hi @mekya,

Thanks for the feedback! Regarding the issue with NAT: it's correct that as long as only the source IP is used for counting, multiple users behind the same IP are counted as one. That's why I mentioned the additional HTTP fields which could be taken into account (user-agent, …). Regarding the idea with the player: I'd try to keep the feature as basic as possible. In our case we're using a third-party videojs player that is customized to the needs of the customers. Therefore a player-based counting mechanism would not work for us without additional effort. Users who play the stream on their TV via AirPlay or Google Chromecast might not be counted either, because the device is not able to run the JavaScript code.

Hope my ideas suit you well!

mekya commented 2 years ago

Thank you @MrXermon I get the point about player based solution.

Yes, some other HTTP fields should be used to get correct numbers. User-agent is one of them, but it causes problems if several users have the same browser and version. Do you know any other HTTP field that can be used?

netaviator commented 2 years ago

@mekya: I'm actually not that deep into that topic but "Accept-Language" should be good to use, too. One thing to take care of is the integration/compatibility of IPv6. In case tracking is based on the source IPs, both protocols should be considered and supported.

netaviator commented 2 years ago

Any plans on digging into the statistics topic in the near future? We're currently evaluating whether we can fetch the required statistics for our customers from our CDN provider, but having the same datasets for smaller events without a CDN would be key.

petsoukos commented 2 years ago

Similar situation with HLS playing of live streams that are behind CDN

mekya commented 2 years ago

Thank you for follow up @MrXermon @petsoukos

Let me move it up in the backlog so it can be scheduled. I mean we can research it and try to find a way to do that.

For the behind-the-CDN scenario, could you share how you resolved it, @MrXermon? I think it will help @petsoukos.

netaviator commented 2 years ago

Sure! You have to configure the player to send the AntMedia cookies back to the AntMedia server with each request, at least for the playlist files. With CloudFront as the CDN service, we pass the requests for the playlist files through to the AntMedia server without caching while forwarding the cookies to the origin server (for example, by using the CachingDisabled policy). In that setup, AntMedia correctly counts the users watching the stream via HLS. In our case, we use a separate domain for the streaming infrastructure from the one hosting the player website, which causes additional problems (CORS). So far we haven't found a way to work around that issue, except for serving the content from the same TLD domain, which is not possible in some of our cases. Therefore we've extended our own videojs-based player to send back viewer statistics, which works okayish in most cases.

petsoukos commented 2 years ago

@MrXermon In my CDN provider (I use BunnyCDN) I have already set a rule that sets the expire time to 0 seconds for any m3u8 files. For the forwarding stuff, I have to contact them for help. I can't find a way to do that. I also use a separate domain for the AMS instance, but it doesn't seem to interfere anymore (using latest videojs lib)

petsoukos commented 2 years ago

@MrXermon Actually, a follow up to my previous post. If I set the withCredentials: true parameter in videojs, it actually blows up and blocks me due to CORS related issues.

netaviator commented 2 years ago

@petsoukos Yeah, but without withCredentials the cookie is not attached to the requests to the playlist-files.

petsoukos commented 2 years ago

@MrXermon, ok. Then there is the wildcard * limitation. It can't be an asterisk.

<init-param>
    <param-name>cors.allowed.origins</param-name>
    <param-value>*</param-value>
</init-param>

I have 2 subdomains that need to be able to publish: www. and another one, let's say another.mydomain.tld

petsoukos commented 2 years ago

@MrXermon Do I need to add any actual "credentials" when setting the withCredentials: true ? I mean, I only set this parameter, but not doing anything else.

netaviator commented 2 years ago

@petsoukos No, but it does not work with "cors.allowed.origins" set to "*". I am rewriting the header with nginx to the domain requesting the content, as it does not work with the wildcard, at least in the VideoJS player.
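The wildcard fails because browsers reject "Access-Control-Allow-Origin: *" on requests made with credentials; the header has to name the requesting origin exactly. The rewrite could be sketched like this. It is only an illustration of the logic, not the actual nginx configuration used here, and `corsHeadersFor` and the example origins are hypothetical.

```javascript
// Hypothetical sketch of "echo the requesting origin" for credentialed CORS.
function corsHeadersFor(requestOrigin, allowedOrigins) {
  if (!requestOrigin || !allowedOrigins.includes(requestOrigin)) {
    // Unknown origin: send no CORS headers, so the browser blocks the read.
    return {};
  }
  return {
    // Must be the exact origin, never "*", when credentials are involved.
    "Access-Control-Allow-Origin": requestOrigin,
    "Access-Control-Allow-Credentials": "true",
    // Tell caches that the response differs per requesting origin.
    "Vary": "Origin",
  };
}

const allowed = ["https://www.mydomain.tld", "https://another.mydomain.tld"];
console.log(corsHeadersFor("https://another.mydomain.tld", allowed));
console.log(corsHeadersFor("https://evil.example", allowed));
```

The "Vary: Origin" header matters mostly behind a CDN, where a cached response for one origin could otherwise be served to another.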

petsoukos commented 2 years ago

@MrXermon Interesting, the AMS dashboard xhr requests stopped working with 403 responses, so I added the domain of the ams instance to the cors param-value

<init-param>
    <param-name>cors.allowed.origins</param-name>
    <param-value>https://my-ams-sub.domain.tld:5443, https://www.wheremyproblemslie.tld</param-value>
</init-param>

and the UI started working again, while I also see in the response headers the following:

Access-Control-Allow-Credentials: true
Access-Control-Allow-Origin: https://my-ams-sub.domain.tld:5443

So either I have implemented the videojs lib badly, or my CDN provider is the problem. Maybe if I can solve the issue with the CDN, the HLS count will start working normally and not count 7~9 viewers per 1 actual viewer.

petsoukos commented 2 years ago

@mekya @MrXermon I got the CORS header to work with my CDN provider. I had to add custom headers for the file extensions m3u8, ts, etc. Now I don't get blocked when playing back a live session using HLS. The wrong HLS viewer count still persists, though. I see anywhere between 6 and 9 HLS viewers per 1 actual viewer.

netaviator commented 2 years ago

@petsoukos Can you check if the cookie is sent currently from your browser to the CDN and from the CDN to your AntMedia instance?

petsoukos commented 2 years ago

@MrXermon From what I could figure in Firefox console, for the m3u8 request, it appears that it doesn't have any cookies.

The Request headers:

GET /streams/OMG0cABnjfRqi4lPiDiqzVmVG0Vd63ru_720p5000kbps.m3u8?token=undefined&subscriberId=undefined&subscriberCode=undefined HTTP/2
Host: my-cdn-domain.tld
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:98.0) Gecko/20100101 Firefox/98.0
Accept: */*
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Origin: http://www.myorigin.tld
Connection: keep-alive
Referer: http://www.myorigin.tld
Sec-Fetch-Dest: empty
Sec-Fetch-Mode: cors
Sec-Fetch-Site: cross-site
Sec-GPC: 1
Pragma: no-cache
Cache-Control: no-cache
TE: trailers

Also there is a tab called Cookies for each request in Firefox dev tools, which is completely empty for all m3u8/ts requests. I told you I might have botched the Videojs implementation...

initPlayer() {
            this.player = videojs("streamVideo", {
                poster: this.poster,
                posterImage: true,
                bigPlayButton: true,
                controls: true,
                liveui: true,
                liveTracker: {
                    trackingThreshold: 0,
                },
                children: {
                    controlBar: {
                        children: {
                            playToggle: true,
                            progressControl: true,
                        },
                    },
                },
                userActions: {
                    click: false,
                    doubleClick: false,
                },
                fill: true,
                html5: {
                    vhs: {
                        withCredentials: true, // either here, or see below
                    },
                },
            });

            videojs.Vhs.xhr.beforeRequest = function (options) {
                options.uri =
                    options.uri +
                    "?token=undefined&subscriberId=undefined&subscriberCode=undefined";
                return options;
            };
            let src = `https://my-cdn-domain.tld/streams/${this.stream_id}_adaptive.m3u8`;
            this.player.src({
                src: src,
                type: "application/x-mpegURL",
                // withCredentials: true, // can also be set here
            });
},

That's all I'm doing, it is basically a stripped down version of antmedia play.html example, maybe too stripped ?

netaviator commented 2 years ago

@petsoukos Correct, the cookie is not sent with the request. Do you see the set-cookie on the initial request to the playlist? Some CDNs seem to strip cookies due to caching issues. Could that be the case on your end, too?

petsoukos commented 2 years ago

@MrXermon No, I see no set-cookie with the initial request. We set the Caching time to 0s for any m3u8 file extension requested via the CDN, also the CDN provider replied back to me that they forward anything in the request to the origin server if it is an uncached request.

petsoukos commented 2 years ago

@MrXermon I overrode both the Override Cache Time and Override Browser Cache Time in the CDN Rules and set them both to 0 seconds cache. Still works, still over counting. I need to investigate the videojs implementation a bit more and take a look again inside the play.html example.

petsoukos commented 2 years ago

@MrXermon I re-checked all menus and settings in the CDN panel. I found a very obvious setting that was enabled, called "Strip Response Cookies"...don't ask me how I missed it all day.

Turned it off and I now see the set-cookie in the response to the initial request. It did not fix the wrong counting, though, because the subsequent requests don't send the cookie back, and every time the server responds with a new set-cookie: JSESSIONID=string; Path=/WebRTCAppEE; Secure; HttpOnly. Maybe that Path= is wrong because I "masked" it in the CDN? Is there a way to make the server set / as the path? Maybe that could fix it? I have been bothering you all day... but I've been banging my head against this issue all day.

petsoukos commented 2 years ago

@MrXermon I removed the "mask" and... it seems to be fixed. I guess I have to request the stream as my-cdn-domain.tld/WebRTCAppEE/streams/-stream-id_adaptive.m3u8, but I really want to remove the WebRTCAppEE path, maybe even the streams part as well. Is there a way to make the server respond with Path=/ ? I think that would resolve all my issues, which are kind of resolved already.
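This behaviour is consistent with how browsers scope cookies by path: a cookie set with Path=/WebRTCAppEE is only sent back on requests whose path is /WebRTCAppEE or lies below it. A rough sketch of that matching rule, following the path-match definition in RFC 6265 (the function name `cookiePathMatches` and the stream names are made up for this example):

```javascript
// Sketch of the RFC 6265 path-match rule browsers apply before sending a cookie.
function cookiePathMatches(requestPath, cookiePath) {
  // Identical paths always match.
  if (requestPath === cookiePath) return true;
  // Otherwise the cookie path has to be a prefix of the request path...
  if (!requestPath.startsWith(cookiePath)) return false;
  // ...and the prefix has to end at a "/" boundary.
  return cookiePath.endsWith("/") || requestPath[cookiePath.length] === "/";
}

// With the CDN "mask", the browser requests /streams/..., so the cookie is never sent:
console.log(cookiePathMatches("/streams/abc_adaptive.m3u8", "/WebRTCAppEE"));
// Without the mask, the request path sits under the cookie path:
console.log(cookiePathMatches("/WebRTCAppEE/streams/abc_adaptive.m3u8", "/WebRTCAppEE"));
// A cookie with Path=/ would match any request path on that host:
console.log(cookiePathMatches("/streams/abc_adaptive.m3u8", "/"));
```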

petsoukos commented 2 years ago

@MrXermon @mekya Everything seems to be working. I also had to add SameSite=none in the AMS context.xml because, from what I understand, browsers keep blocking cross-origin cookies. It works in Chrome 99/100 and in Firefox (Firefox keeps complaining), but as soon as you have third-party cookies blocked or open an incognito session, the cookies are not set and the counter displays the wrong number of viewers again.

I could reconfigure the CDN to use a subdomain under my own domain (so instead of cdn-provider.tld => ams.mydomain.tld); that might solve the third-party cookie issues.

netaviator commented 2 years ago

@petsoukos Yeah, this is why I've proposed a different way of counting, as there are too many factors we as an operator cannot influence (browser version, cookie settings, domain (in our case)...).

petsoukos commented 2 years ago

@MrXermon Yes, it is much better to go with cookieless tracking, since browser vendors are pushing for this in the near future anyway. The potential solution in AMS should be flexible enough to consider users behind CDNs.

mekya commented 2 years ago

Hi Guys,

Thank you for the discussion. I think as @MrXermon recommends, we can use some custom HTTP headers to count the HLS viewers.

Does it make sense? @MrXermon @petsoukos

netaviator commented 2 years ago

Hi @mekya, a custom header which has to be sent by the client will only partly work. I'd consider generating a hash based on values the client sends anyway and using it to match requests to a viewer (like the options mentioned above). In combination with a CDN, it would make sense to additionally use the values it provides.
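To make the idea concrete, here is a minimal sketch of such a hash. Everything in it is an assumption for illustration: the function names, the choice of fields (client IP, User-Agent, Accept-Language), and the use of a simple 32-bit FNV-1a hash as a stand-in for whatever hash function would really be used. A real implementation would probably prefer a cryptographic hash.

```javascript
// 32-bit FNV-1a over the string's UTF-16 code units (the same as bytes for
// ASCII input). Non-cryptographic; used here only to keep the sketch
// dependency-free.
function fnv1a(text) {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by the FNV prime, keep 32 bits
  }
  return hash.toString(16).padStart(8, "0");
}

// Hypothetical viewer key built from values the client sends anyway.
// `headers` is a plain object with lower-cased header names.
function viewerFingerprint(clientIp, headers) {
  const parts = [
    clientIp,
    headers["user-agent"] || "",
    headers["accept-language"] || "",
  ];
  return fnv1a(parts.join("|"));
}

const firefox = { "user-agent": "Mozilla/5.0 Firefox/98.0", "accept-language": "en-US,en;q=0.5" };
const chrome = { "user-agent": "Mozilla/5.0 Chrome/99.0", "accept-language": "de-DE,de;q=0.9" };

// Two requests from the same browser map to the same viewer key:
console.log(viewerFingerprint("203.0.113.7", firefox) === viewerFingerprint("203.0.113.7", firefox));
// Two different browsers behind the same NAT address get different inputs,
// so their keys are expected to differ:
console.log(viewerFingerprint("203.0.113.7", firefox), viewerFingerprint("203.0.113.7", chrome));
```

Two viewers with identical browser, language settings and shared IP would still collapse into one key, so this narrows the NAT problem rather than removing it.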

petsoukos commented 2 years ago

Hello, has there been any progress on this new feature?

mekya commented 2 years ago

Hi @petsoukos ,

Thank you for follow up.

Unfortunately, there is no progress on this new feature. There are some high-priority tasks on our end. Anyway, I'm moving it up in the backlog.

Thank you

Regards, A. Oguz

patconrey commented 9 months ago

Hi! Is there any progress or an official guide to resolving these issues? I'm seeing this issue with AMS 2.5.3. I'm consuming a stream within a React app via HLS (using just the embedded web player), and the AMS dashboard is reporting 5-8x viewers. I have a pretty simple setup - no CDN, just AMS EE on AWS and SSL.

mekya commented 4 months ago

Hi @patconrey,

Sorry for late response. If you're just using the embedded player, it should work.

There must be an error somewhere. We can investigate this issue if you're interested.

PS: Alternatively, thinking out loud, we may provide a way to calculate the HLS viewers from player events. We have started to support receiving player events on the server side. On the other hand, this is player-specific and does not work in all cases. It will likely work for our embedded player and other players that send play events.

Cheers Oguz