sonroyaalmerol / m3u-stream-merger-proxy

A lightweight HTTP proxy server dockerized for consolidating and streaming content from multiple IPTV M3U playlists, acting as a load balancer between provided sources.
https://hub.docker.com/r/sonroyaalmerol/m3u-stream-merger-proxy

long export time when the db is big #100

Closed aniel300 closed 2 months ago

aniel300 commented 2 months ago

This is in reference to https://github.com/sonroyaalmerol/m3u-stream-merger-proxy/issues/72#issuecomment-2299939900. I was trying to capture an interesting line of the log that day but could not. I think I just ran into it by accident. The line is this: 2024/08/23 22:57:17 [DEBUG] Cache miss. Retrieving streams from Redis... Not sure if this will be helpful or not.

aniel300 commented 2 months ago

Here are some more logs. It looks like it is stuck; however, I just confirmed the two groups are present in the db/m3u, so we are good there. Now the only problem is when the db is big. [screenshot]

aniel300 commented 2 months ago

Please let me know if I am leaking info in this screenshot. Also, if you can, a way to prevent info leaks when using debug mode would be awesome, because manually checking for and removing the URLs is a pain.
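For illustration only, the kind of redaction asked for here could be sketched as a small filter applied to each log line before it is printed. This is a hypothetical Python sketch and not the proxy's code: the function name redact_urls and the placeholder text are made up, and it assumes that masking every http(s) URL in a line is enough.

```python
import re

# Matches an http(s) URL up to the next whitespace character.
URL_PATTERN = re.compile(r"https?://\S+")


def redact_urls(line: str, placeholder: str = "[redacted url]") -> str:
    """Return the log line with every http(s) URL replaced by a placeholder."""
    return URL_PATTERN.sub(placeholder, line)
```

Lines without a URL pass through unchanged, so a filter like this could be applied to every debug line without special cases.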

sonroyaalmerol commented 2 months ago

Can you test and see if #124 results in faster M3U returns? I've reworked the caching behind the scenes. It should still behave similarly, in that the cache needs to be built completely on the first request. Also, CACHE_ON_SYNC will require BASE_URL to be set as well.

BASE_URL sets the base URL for the stream URLs in the M3U file to be generated (e.g. http://192.168.1.10:8080). This data is usually gathered from the HTTP requests done by the client. However, CACHE_ON_SYNC does its thing in the background without any HTTP requests so it needs to be given the base URL manually.
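The distinction can be sketched roughly as follows. This is an illustrative Python sketch under assumptions, not the proxy's actual code: resolve_base_url and its parameters are hypothetical, the precedence of the configured value over the request is an assumption, and only the BASE_URL variable name comes from this discussion.

```python
from typing import Mapping, Optional


def resolve_base_url(request_host: Optional[str],
                     env: Mapping[str, str],
                     scheme: str = "http") -> str:
    """Pick the prefix for stream URLs written into the generated M3U.

    request_host is the Host header of the client's HTTP request, or None
    when the playlist is built by a background job that has no request.
    """
    # Assumption: an explicitly configured BASE_URL wins when present.
    configured = env.get("BASE_URL", "").rstrip("/")
    if configured:
        return configured
    # Otherwise fall back to what the client's request tells us.
    if request_host:
        return f"{scheme}://{request_host}"
    # A background build has neither source of information.
    raise ValueError("no HTTP request to derive the base URL from; set BASE_URL")
```

With a client request, resolve_base_url("192.168.1.10:8080", {}) can answer on its own; a background build passes None for the host and only works if BASE_URL is present in the environment mapping.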

aniel300 commented 2 months ago

base url is where the proxy server lives?

sonroyaalmerol commented 2 months ago

Yep! Not the base URL of the source streams.

aniel300 commented 2 months ago

ok let me test and report back

aniel300 commented 2 months ago

Every time you mention a specific PR, does it mean it is also available in the dev tag?

sonroyaalmerol commented 2 months ago

Only if the PR has been merged. There are specific PRs that I try not to merge immediately, as they change a lot of components.

aniel300 commented 2 months ago

I have tried all possible combinations, such as http://172.18.0.1:8086 and http://172.18.0.1:8080, and I am still getting 2024/08/25 18:55:43 [DEBUG] Cache miss. Retrieving streams from Redis... in the logs. How do I know the new implementation is taking effect? Also, is SAFE_LOGS set as SAFE_LOGS=true?

sonroyaalmerol commented 2 months ago

Well, the fact that you're seeing that log means that you're not using the image of the PR I mentioned. PRs that I mention will have a comment from a bot containing the image URL for that specific PR.

For #124, it should be this comment. To test a specific PR, use that image URL instead of the usual sonroyaalmerol/m3u-stream-merger-proxy:dev.

SAFE_LOGS=true is correct. However, the base URLs you provided don't look like the IP addresses I would expect in a local setup.

To make things simpler for you, the base URL can be derived from the URL you use to access the generated M3U.

For example, if you access the M3U with this URL: http://192.168.2.5:8080/playlist.m3u, then the base URL would be http://192.168.2.5:8080. You use that as the value for BASE_URL when trying out the PR image.
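That derivation can be sketched in a few lines. This is a minimal Python illustration using the standard library; the function name base_url_from_playlist_url is made up for the example, and the input is the example URL from the comment above.

```python
from urllib.parse import urlsplit


def base_url_from_playlist_url(playlist_url: str) -> str:
    """Keep only the scheme and host:port of the URL used to fetch the M3U."""
    parts = urlsplit(playlist_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {playlist_url!r}")
    return f"{parts.scheme}://{parts.netloc}"


print(base_url_from_playlist_url("http://192.168.2.5:8080/playlist.m3u"))
# prints http://192.168.2.5:8080
```

The path (/playlist.m3u) and any query string are dropped; what remains is the value to use for BASE_URL.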

aniel300 commented 2 months ago

Rest assured, I am trying the correct image, and those IPs are from the Docker network on my Debian 12 Linux machine. I will share screenshots of everything once I get back home. Thank you.

sonroyaalmerol commented 2 months ago

Sure. Do double check as it is actually impossible for that PR image to return that log line. It doesn't exist in the code of that PR.

aniel300 commented 2 months ago

Current image being used and Docker IP/network: [3 screenshots]

sonroyaalmerol commented 2 months ago

I guess the workflow is not doing what it's supposed to and is no longer building the right image for PRs for some reason. :facepalm: I'll merge it to dev. You can test it from there instead.

aniel300 commented 2 months ago

ok thanks

aniel300 commented 2 months ago

Unfortunately, the issue has not been fixed. Here are some screenshots of what it has been doing. The db is around 300 MB, by the way. I left it running all night because I had to go to sleep. [2 screenshots]

sonroyaalmerol commented 2 months ago

Can you give me more context: what makes you say it doesn't work? What is the output when you request the M3U URL?

aniel300 commented 2 months ago

It keeps loading in the browser, and if I try to download the M3U via IDM it never downloads. There is a small CPU spike on the db container and the proxy container (I believe), but that is about it. For example, the small channel sample I am using for debugging (~38 channels) works as expected: sync is quick, and so is export. But when you give it multiple URLs/providers that come with VOD, etc., which all together are around 300 MB in size, export doesn't seem to work. I don't mind the sync taking time the first time, but I would like the exports after that to be quick.

sonroyaalmerol commented 2 months ago

I did more work for this issue. I've merged the changes to the dev build if you want to try it out again.

aniel300 commented 2 months ago

Is there any way to make the db use 100 percent of the CPU so that the whole process can be sped up? It is doing a lot of this: 2024/08/26 19:58:13 [DEBUG] Processing stream: xxx map[]} [screenshot]

aniel300 commented 2 months ago

Could the slowness be caused by the sorting? It is still doing a lot of this, and there is no progress bar, so I don't know how much time is left for this to finish. [screenshot]

sonroyaalmerol commented 2 months ago

Please be patient and wait for everything to be processed into the cache. We're talking about ~300 MB worth of strings being processed. How I wish it were as easy as simply setting the "amount of CPU" to be used. That is just not how software works. Most of the processes required for the proxy to work are single-threaded and cannot be parallelized.

Also, do disable debugging mode if you want the maximum performance possible. Logging affects performance more than you might expect especially in cases like these. Logging is a single-threaded process which pretty much forces even the parallelized processes to wait for the log to be printed to the terminal before the next job is executed.

The sorting is index-based and does not add any time complexity at all. Once everything is processed, the M3U will be stored in plain-text, both in memory and as a file. At that point, the only bottleneck would be your RAM speed and disk i/o.
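The build-once, serve-many behavior described here can be sketched as follows. This is a simplified Python illustration under assumptions, not the proxy's implementation: the class PlaylistCache, its build callback, and the cache file path are all hypothetical names.

```python
from pathlib import Path
from typing import Callable, Optional


class PlaylistCache:
    """Build the M3U text once, then serve it from memory or from disk."""

    def __init__(self, cache_file: Path, build: Callable[[], str]) -> None:
        self._cache_file = cache_file
        self._build = build  # expensive step: processes every stream
        self._text: Optional[str] = None
        self.build_count = 0  # how many times the expensive step ran

    def get(self) -> str:
        # Fast path: the playlist is already held in memory.
        if self._text is not None:
            return self._text
        # Second choice: a previous run left the plain-text file on disk.
        if self._cache_file.exists():
            self._text = self._cache_file.read_text(encoding="utf-8")
            return self._text
        # Slow path: build everything once, then keep both copies.
        self._text = self._build()
        self.build_count += 1
        self._cache_file.write_text(self._text, encoding="utf-8")
        return self._text
```

Only the first call to get() pays the full processing cost. Later calls return the stored text, which is why memory and disk speed become the remaining limits once the cache exists.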

Processing time will always be directly proportional to the amount of ingress data.

sonroyaalmerol commented 2 months ago

I've just merged more optimizations to dev. This will probably improve over time as a side effect of fixing other issues. I won't be focusing on this anymore in the near future. Converting this to a discussion instead.