Open innovia opened 6 years ago
Show the eye x output when it is at 18GB.
OK, I had to restart it; it will happen again.
Here's the eye x info now, after the restart:
about: Eye v0.9.2 (c) 2012-2016 @kostya
resources: 16:19, 0%, 115Mb, <18658>
ruby: ruby 2.4.1p111 (2017-03-22 revision 58053) [x86_64-linux]
gems: ["Celluloid=0.17.3", "Celluloid::IO=0.17.3", "StateMachines=0.5.0", "NIO=2.1.0", "Timers=4.1.2", "Sigar=1.7.0.0"]
logger: /opt/redash/logs/eye.log
home: /root
dir: /var/run/eye
pid_path: /var/run/eye/pid
sock_path: /var/run/eye/sock
actors: [["Eye::ChildProcess", 33], ["Eye::Process", 7], ["Eye::Group", 1], ["Eye::Controller", 1], ["Eye::SystemResources::Cache", 1], ["Eye::Server", 1], ["Celluloid::Supervision::Service::Public", 1], ["Celluloid::IncidentReporter", 1], ["Celluloid::Notifications::Fanout", 1], ["Celluloid::Supervision::Service::Root", 1]]
Btw, I don't think you need to monitor children for postgres or nginx; it's quite a strange use case.
I just wanted to see how much memory each of the processes is taking, but now it seems like Eye is the cause of that memory leak.
Once we stabilize it I'll remove these children.
Here it is after a day: 6GB!
root@reDash:~# eye x
about: Eye v0.9.2 (c) 2012-2016 @kostya
resources: Nov29, 0%, 6093Mb, <18658>
ruby: ruby 2.4.1p111 (2017-03-22 revision 58053) [x86_64-linux]
gems: ["Celluloid=0.17.3", "Celluloid::IO=0.17.3", "StateMachines=0.5.0", "NIO=2.1.0", "Timers=4.1.2", "Sigar=1.7.0.0"]
logger: /opt/redash/logs/eye.log
home: /root
dir: /var/run/eye
pid_path: /var/run/eye/pid
sock_path: /var/run/eye/sock
actors: [["Eye::ChildProcess", 32], ["Eye::Process", 7], ["Eye::Group", 1], ["Eye::Controller", 1], ["Eye::SystemResources::Cache", 1], ["Eye::Server", 1], ["Celluloid::Supervision::Service::Public", 1], ["Celluloid::IncidentReporter", 1], ["Celluloid::Notifications::Fanout", 1], ["Celluloid::Supervision::Service::Root", 1]]
This is really strange; I don't see leaking in the actors, so it's hard to say where it can be. Can you show the full config? Maybe you have some global variables which are not cleaned up. You have only 7 processes; I have an Eye process that has been running for half a year with 156 processes. Also there is a difference in Ruby, and also in the gems StateMachines, NIO, and Timers (which can potentially leak). For comparison, here is mine:
about: Eye v0.9.1 (c) 2012-2016 @kostya
resources: Apr26, 0%, 106Mb, <32291>
ruby: ruby 1.9.3p484 (2013-11-22) [x86_64-linux]
gems: ["Celluloid=0.17.3", "Celluloid::IO=0.17.1", "StateMachines=0.4.0", "NIO=1.1.1", "Timers=4.1.1", "Sigar=1.7.0.0"]
logger: /projects/client/log/eye.log
home: /home/deploy
dir: /home/deploy/.eye
pid_path: /home/deploy/.eye/pid
sock_path: /home/deploy/.eye/sock
actors: [["Eye::Process", 156], ["Eye::Group", 29], ["Eye::Server", 1], ["Eye::SystemResources::Cache", 1], ["Eye::Controller", 1], ["Celluloid::Supervision::Service::Public", 1], ["Celluloid::IncidentReporter", 1], ["Celluloid::Notifications::Fanout", 1], ["Celluloid::Supervision::Service::Root", 1]]
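Kostya's suspicion about "global variables which are not cleaned up" can be probed directly. Below is a minimal, Eye-independent sketch: count live Ruby objects per type before and after a suspect operation; a per-type count that keeps climbing across repeats points at retained objects. The $retained global here is purely a stand-in for a hypothetical leak.

```ruby
# A quick leak probe: GC first so only retained objects are counted,
# then compare per-type object counts around a suspect operation.
def live_counts
  GC.start                       # collect garbage; survivors are truly retained
  ObjectSpace.count_objects      # e.g. { TOTAL: ..., T_STRING: ..., T_HASH: ... }
end

before = live_counts
$retained = Array.new(1_000) { 'x' * 50 }  # simulate a leaky global variable
after = live_counts

growth = after[:T_STRING] - before[:T_STRING]
puts "retained strings: ~#{growth}"        # keeps climbing on repeats => leak
```

Running the probe around each monitoring tick (or each config section) narrows down which part of the setup retains objects.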
I'm trying to reinstall the server on 16.04; I'll let you know.
I reran my Eye with:
ruby: ruby 2.2.2p95 (2015-04-13 revision 50295) [x86_64-linux]
gems: ["Celluloid=0.17.3", "Celluloid::IO=0.17.3", "StateMachines=0.5.0", "NIO=2.1.0", "Timers=4.1.2", "Sigar=1.7.0.0"]
So we'll see whether there is any leak or not.
I don't see any leaks with the new gems and Ruby 2.2.2, so maybe the problem is in the config.
Here's my config:
#!/usr/bin/env ruby
Eye.load '/etc/eye/mailer.rb' # mailer set params (like variables)
Eye.load '/etc/eye/cloudwatch.rb'
Eye.load '/etc/eye/config.rb' # config assign params values
Eye.application :reDash do
  working_dir "/opt/redash/current"
  load_env "/root/.env"
  trigger :flapping, times: 3, within: 1.minute, retry_in: 5.minutes
  notify :by_email, :info
  notify :cloudwatch, :info

  process(:postgresql) do
    pid_file "/var/run/postgresql/9.3-main.pid"
    stdall "/opt/redash/logs/postgresql"
    start_command "service postgresql start"
    stop_command "service postgresql stop"
    restart_command "service postgresql restart"
  end

  process(:nginx) do
    depend_on :gunicorn
    pid_file "/var/run/nginx.pid"
    stdall "/opt/redash/logs/nginx.log"
    start_command "/usr/sbin/nginx"
    stop_signals [:QUIT, 30.seconds, :TERM, 15.seconds, :KILL]
    restart_command "kill -HUP {PID}"
    daemonize true
  end

  process(:redis) do
    pid_file "/var/run/redis.pid"
    stdall "/opt/redash/logs/redis.log"
    start_command "/usr/local/bin/redis-server /etc/redis/6379.conf"
    stop_signals [:TERM, 30.seconds, :QUIT]
    restart_command "kill -HUP {{PID}}"
    daemonize true
  end

  process(:gunicorn) do
    uid 'redash'
    gid 'nogroup'
    depend_on :redis
    pid_file "/var/run/gunicorn/gunicorn.pid"
    stdall "/opt/redash/logs/gunicorn.log"
    start_command "gunicorn -b unix:///var/run/gunicorn/gunicorn.sock --name redash -w 9 --max-requests 1000 redash.wsgi:app"
    stop_signals [:TERM, 30.seconds, :QUIT]
    restart_command "kill -HUP {{PID}}"
    daemonize true

    monitor_children do
      stop_command "kill -TERM {PID}"
      check :cpu, :every => 30, :below => 80, :times => 3
      check :memory, :every => 30, :below => 350.megabytes, :times => [3,5]
    end
  end

  process(:flower) do
    uid 'redash'
    gid 'nogroup'
    pid_file "/var/run/celery/flower.pid"
    stdall "/opt/redash/logs/flower.log"
    start_command "celery flower -A redash.worker --logging=debug --broker=redis://127.0.0.1:6379/0 --broker_api=redis://127.0.0.1:6379/0 --address=0.0.0.0 --port=5555 --persistent"
    stop_signals [:TERM, 30.seconds, :QUIT]
    restart_command "kill -HUP {{PID}}"
    check :cpu, :every => 30, :below => 80, :times => 3
    check :memory, :every => 30, :below => 250.megabytes, :times => [3,5]
    daemonize true
  end

  process(:celery_scheduler) do
    uid 'redash'
    gid 'nogroup'
    pid_file "/var/run/celery/celery_schedule_worker.pid"
    stdall "/opt/redash/logs/celery_schedule_worker.log"
    start_command "celery worker --app=redash.worker --beat --concurrency=4 --queues=queries,celery --maxtasksperchild=100 --events -Ofair --autoscale=10,4 -n redash_celery_scheduled@%h"
    stop_signals [:TERM, 30.seconds, :QUIT]
    restart_command "kill -HUP {{PID}}"
    daemonize true

    monitor_children do
      stop_command "kill -TERM {PID}"
      check :cpu, :every => 30, :below => 80, :times => 3
      check :memory, :every => 30, :below => 512.megabytes, :times => [3,5]
    end
  end

  process(:celery_worker) do
    uid 'redash'
    gid 'nogroup'
    pid_file "/var/run/celery/celery_worker.pid"
    stdall "/opt/redash/logs/celery_worker.log"
    start_command "celery worker --app=redash.worker --concurrency=4 --queues=scheduled_queries --maxtasksperchild=100 --events -Ofair --autoscale=10,4 -n redash_celery_worker@%h"
    stop_signals [:TERM, 30.seconds, :QUIT]
    restart_command "kill -HUP {{PID}}"
    daemonize true

    monitor_children do
      stop_command "kill -TERM {PID}"
      check :cpu, :every => 30, :below => 80, :times => 3
      check :memory, :every => 30, :below => 400.megabytes, :times => [3,5]
    end
  end
end
I don't see anything bad in the config. I think we should try 3 things:
1. Remove monitor_children and see if the leak is still here.
2. Remove depend_on and see if the leak is still here.
3. Btw, using restart_command together with daemonize true is quite strange. For example, unicorn is also used with restart_command, but it runs with daemonize false, because daemonize true can potentially create some problems with restarts.
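The bisection steps above can be sketched as a trimmed version of the gunicorn entry from the config (same paths as above; which part to comment out depends on the step):

```ruby
# Sketch of the bisection: one suspect disabled per run,
# watching `eye x` memory for a day after each change.
process(:gunicorn) do
  pid_file "/var/run/gunicorn/gunicorn.pid"
  start_command "gunicorn -b unix:///var/run/gunicorn/gunicorn.sock --name redash -w 9 redash.wsgi:app"
  daemonize true

  # Step 1: comment this whole block out; if memory stays flat,
  # the leak is in children monitoring.
  # monitor_children do
  #   check :memory, :every => 30, :below => 350.megabytes, :times => [3,5]
  # end

  # Step 2 (separate run): comment out the dependency instead.
  # depend_on :redis
end
```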
I'm trying to downgrade to 2.2.2.
I'll do what you suggested.
Downgrading to 2.2.2 and disabling all of these dependencies and children stabilized it at 100MB.
I'll try adding back the children monitoring.
How is it?
It's growing, now at 700MB:
eye x
about: Eye v0.9.2 (c) 2012-2016 @kostya
resources: Dec04, 0%, 776Mb, <1089>
ruby: ruby 2.2.2p95 (2015-04-13 revision 50295) [x86_64-linux]
gems: ["Celluloid=0.17.3", "Celluloid::IO=0.17.3", "StateMachines=0.5.0", "NIO=2.1.0", "Timers=4.1.2", "Sigar=1.7.0.0"]
logger: /opt/redash/logs/eye.log
home: /home/ubuntu
dir: /var/run/eye
pid_path: /var/run/eye/pid
sock_path: /var/run/eye/sock
actors: [["Eye::ChildProcess", 18], ["Eye::Process", 7], ["Eye::Notify::Mail", 1], ["Eye::Group", 1], ["Eye::Controller", 1], ["Eye::SystemResources::Cache", 1], ["Eye::Server", 1], ["Celluloid::Supervision::Service::Public", 1], ["Celluloid::IncidentReporter", 1], ["Celluloid::Notifications::Fanout", 1], ["Celluloid::Supervision::Service::Root", 1]]
OK, can you test monitor_children and depend_on separately, to see where the bug is?
I didn't add depend_on at all.
So the bug is in monitor_children: if I remove all of it, it just works without leaks.
Confirmed, the leak is in monitor_children. I'll try to fix it.
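For intuition on how a children-monitoring loop can leak, here is a hypothetical illustration (this is NOT Eye's actual code): a registry keyed by child pid that adds an entry on every scan but never removes entries for dead children. With workers that restart often (gunicorn with --max-requests, celery with --maxtasksperchild), such a registry grows without bound.

```ruby
# Hypothetical child registry: the leaky path only ever adds entries,
# so every dead pid leaves one stale object behind.
class ChildRegistry
  def initialize(prune: false)
    @prune = prune
    @children = {}                 # pid => per-child state
  end

  # Called on every monitoring tick with the currently live child pids.
  def scan(live_pids)
    live_pids.each { |pid| @children[pid] ||= { since: Time.now } }
    # The fix: drop entries whose pid is no longer alive.
    @children.keep_if { |pid, _| live_pids.include?(pid) } if @prune
  end

  def size
    @children.size
  end
end

leaky = ChildRegistry.new
fixed = ChildRegistry.new(prune: true)
# Simulate 5 generations of workers restarting (3 live children per tick).
5.times do |gen|
  pids = (gen * 10...gen * 10 + 3).to_a
  leaky.scan(pids)
  fixed.scan(pids)
end
puts leaky.size  # 15 entries: one stale entry per dead pid
puts fixed.size  # 3 entries: only the live children
```

The same accumulation pattern applies to any per-child state (timers, check histories, actor references) held in a long-lived supervisor process.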
Hi Konstantin
I have Eye 0.9.2 installed on Ubuntu 14.04.
The monitored processes add up to ~1500MB, but Eye is taking 18GB!
root 1087 0.5 54.3 25626424 17915500 ? Sl Nov26 26:40 eye monitoring v0.9.2 [reDash] (in /home/ubuntu)
You can see in this picture that the memory is leaking.
Any idea how to find this memory leak?
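One general way to hunt a leak like this in a Ruby process (a sketch, not Eye-specific) is to take two heap dumps with the stdlib objspace extension and compare them; objects present only in the second dump are the ones retained in between. The retained array below is just a stand-in for whatever is leaking.

```ruby
# Two-dump heap comparison: whatever survives GC between the dumps
# is retained, and allocation tracing records where it was created.
require 'objspace'

ObjectSpace.trace_object_allocations_start  # record file:line per allocation

def dump_heap
  GC.start                                  # drop garbage before dumping
  ObjectSpace.dump_all(output: :string)     # one JSON line per live object
end

baseline = dump_heap.lines.count
retained = Array.new(100) { 'leak' * 10 }   # stand-in for the leaking objects
delta = dump_heap.lines.count - baseline

# Real analysis would diff the two dumps by address and group survivors
# by their recorded "file"/"line" fields to find the allocation site.
puts "objects retained between dumps: ~#{delta}"
```

Against a live Eye daemon, the dumps would have to be triggered inside the monitoring process itself (e.g. from a console or a signal handler), since they inspect the current Ruby heap.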