Vladislavik opened this issue 3 years ago
Run free -m, for example - you will see it's shared.

Ok, I will try 4GB. How do you think, which parameter should I check to find out what is wrong with memory consumption when many different files are accessed? I have a server with 24 HDDs and a config like this:
worker_processes auto; # resolves to 48 on this machine
worker_cpu_affinity auto;
thread_pool default_pool threads=16;
events {
worker_connections 4096;
use epoll;
accept_mutex off;
multi_accept on;
worker_aio_requests 2048;
}
http {
tcp_nopush on;
tcp_nodelay on;
vod_mode local;
vod_fallback_upstream_location /fallback;
vod_last_modified 'Sun, 19 Nov 2000 08:52:00 GMT';
vod_last_modified_types *;
vod_segment_duration 20000;
vod_hls_absolute_master_urls off;
vod_hls_absolute_index_urls off;
vod_hls_container_format mpegts;
vod_hls_absolute_iframe_urls off;
vod_force_playlist_type_vod on;
vod_hls_segment_file_name_prefix Frag;
vod_open_file_thread_pool default_pool;
vod_metadata_cache metadata_cache 4098m; #was 30000m
vod_response_cache response_cache 128m;
vod_performance_counters perf_counters;
vod_output_buffer_pool 64k 32;
vod_hls_mpegts_align_frames on;
vod_hls_mpegts_interleave_frames on;
open_file_cache max=10000 inactive=2m;
open_file_cache_valid 3h;
open_file_cache_min_uses 1;
open_file_cache_errors on;
sendfile on;
sendfile_max_chunk 512k;
aio threads=default_pool;
aio_write on;
send_timeout 20s;
reset_timedout_connection on;
server {
output_buffers 1 512k;
location @m3u8 {
root /var/www/$path/;
vod hls;
}
}
}
When traffic reaches about 6Gbps, mostly to different files, with around 10k fairly slow users, nginx grows from its regular memory footprint (around 30GB) to 100% of memory (256GB), the server goes into swap and dies. Before nginx starts doing this, the disks are about 70% busy.
Slow pulls from the module can indeed be a problem, since the module builds the entire response in memory, without waiting for the client to pull it. In general, the recommended approach for large-scale deployments is to put a CDN/caching proxies in front of this module. This way the module is not expected to get slow pulls, and once a segment is pulled, it can be served to additional users from the CDN/proxy cache.
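To illustrate the caching-proxy approach, here is a minimal sketch; the cache path, zone name (vod_cache) and backend address are hypothetical, not part of the original setup:

proxy_cache_path /var/cache/nginx/vod levels=1:2 keys_zone=vod_cache:100m max_size=50g inactive=24h;

server {
    listen 80;
    location /hls/ {
        proxy_pass http://127.0.0.1:8080;   # hypothetical vod backend
        proxy_cache vod_cache;              # repeated segment pulls are served from cache
        proxy_cache_valid 200 206 24h;
    }
}

With something like this in place, slow clients drain the proxy's buffered/cached copy instead of holding a vod request open.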
Maybe there is a way to regulate chunk creation, like a read buffer size - while the buffer is full, do not create the next part of the chunk. We use a CDN for popular content, and it is the unpopular content that causes these problems.
I don't think there's currently an elegant solution for this. You can maybe proxy these requests through another location and have nginx buffer them to disk, or you can proxy the storage device and use nginx's rate limit there.
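A hedged sketch of the first suggestion - the location names, port and paths are made up for illustration:

server {
    listen 8080;
    location /vod_internal/ {
        vod hls;                            # the actual vod handler
        root /var/www;
    }
}

server {
    listen 80;
    location /hls/ {
        proxy_pass http://127.0.0.1:8080/vod_internal/;
        proxy_buffering on;                 # nginx absorbs the response for slow clients
        proxy_max_temp_file_size 1024m;     # spill large responses to disk instead of RAM
    }
}

The idea is that the proxying layer, not the vod module, carries the memory/disk cost of a slow client.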
An update about this problem: when I have many slow requests from players, I see nginx start eating RAM again, and it can eat all the memory on the server. When I stop traffic to the server, nginx does not free the memory - it keeps holding all of it. Why does memory stay full after traffic stops, and why does nginx not free it? Only restarting nginx frees the memory.
It's probably because of the behavior of the heap - I've seen it on another project: even if all malloc'ed blocks are free'd, the process memory does not go back to what it was. On the other project this was problematic for me, so I made sure to allocate the memory in large chunks and used mmap/munmap instead of malloc/free; that solved it.
How can this be done? I think I have another solution: when I disable keepalive from the balancer to kaltura, I don't see the memory growth anymore. Are the buffers cleaned up only when the connection is closed?
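For reference, a sketch of what disabling upstream keepalive on the balancer side could look like; the upstream name and address are hypothetical:

upstream kaltura_vod {
    server 10.0.0.10:80;    # hypothetical vod backend
    # no 'keepalive' directive: nginx closes the upstream connection
    # after each request, which is the default behavior
}

server {
    location / {
        proxy_pass http://kaltura_vod;
        # by default nginx speaks HTTP/1.0 to the upstream with 'Connection: close'
    }
}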
Can you make a patch that uses mmap/munmap for memory, please? Memory goes up again and is never freed, even if I remove all traffic from the server and close all connections. I can only reclaim the memory by restarting nginx. Or maybe there is some bug in the module that sometimes eats memory when there are many concurrent connections to different files (we use a CDN).
Sorry, I have no plans to implement such a patch, it doesn't make sense here... Some things you can check - for a start, the number of connections reported by nginx's stub status; requests blocked on IO can keep memory in use even when traffic looks idle.
Why, when I close all connections and traffic drops to 0 MB/s, does nginx still hold the memory? I think there is a memory leak in the module.
I understand... What I wrote above still applies - start by checking the number of connections reported by nginx's stub status; it can be >0 if requests are blocked on IO.
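In case it helps anyone reading along, stub_status is exposed with a location like this (the path and access list are just an example):

location /nginx_status {
    stub_status;        # requires ngx_http_stub_status_module to be compiled in
    allow 127.0.0.1;    # restrict to local checks
    deny all;
}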
When I block all traffic to the server and check the status page, I see this:

Active connections: 1 (1 = my own request to /nginx_status)
server accepts handled requests
587912664 587912664 967363901
Reading: 0 Writing: 1 Waiting: 0

Memory used by nginx: 70GB, and it does not go down. If I restart nginx, it uses only 5-7GB until huge traffic comes again.
This is a screenshot of the memory usage: https://ibb.co/ThvH81Q
Ok, so it's not stuck requests, that's good...
Looking again at your conf, I see you have aio threads=default_pool - I never tried this setup, maybe try aio on instead?
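That is, roughly this change; note that, per the nginx docs, on Linux 'aio on' only takes effect for reads when directio is also enabled:

# aio threads=default_pool;   # current setting
aio on;
directio 512k;                # without directio, Linux native AIO falls back to blocking reads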
Ok, I will try it, but I don't know when I can tell you the result - only if the huge memory consumption happens again.
No, "aio on" did not help - after 6k online users watching different videos, memory goes up again.
Active connections: 7195
server accepts handled requests
13868982 13868982 24693531
Reading: 0 Writing: 3112 Waiting: 4067
But here you show 7k active connections - or did you mean that in this case too, after the connections dropped back to near zero, memory usage was still high?

Another thing you can try is to configure a server with a setup identical to the production server and test it with valgrind. You'll need to apply the no-pool patch (https://github.com/openresty/no-pool-nginx) and run nginx as a single process. You can pull a list of requests from your prod server and replay them on the test server (you can use this test script - https://github.com/kaltura/nginx-vod-module/blob/master/test/uri_compare.py). When you stop nginx orderly (nginx -s stop), valgrind will report any leaks. I ran this test on my environment a long time ago and no leaks were found... but maybe in your case there is a leak due to a different conf / some external lib, etc.
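For the single-process run, the relevant nginx settings are roughly these (a sketch, assuming nginx is started directly under valgrind):

daemon off;            # stay in the foreground so valgrind tracks the process
master_process off;    # no worker fork - everything runs in one process
worker_processes 1;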
Yes - after the 7k connections, if I block all traffic again, memory stays high and does not go down.
Hi, when I use

vod_metadata_cache metadata_cache 30000m;

I see that every nginx process shows 30G of VIRT memory. I have 48 such processes and 256GB of total server memory, and every one of these processes shows the same ~30G VIRT. When I send too much traffic to this server, nginx eats all the memory and the server goes into swap.
Questions:
1) Is vod_metadata_cache per worker process or shared across all of them? If it is shared, why do I need to give it a zone_name that is never referenced anywhere else in the nginx config (unlike limit_req_zone, which we declare at the server level and then apply in a location)?
2) Why does the 'top' command show 30G of VIRT memory per nginx process instead of 30GB across all processes?
3) If I want all nginx processes together to use 30GB for vod_metadata_cache, should I set it to vod_metadata_cache / nginx process count?