datadavev opened this issue 11 months ago
Thanks for this, @datadavev -- Nick is out, so it's hard to expand the filesystem at the moment, but we can do that later. Over the short term, is there anything we can clean up to gain some headroom? I see about 10 GB of log data in /var/log, in three subdirectories (sizes in MB):
```
1999    apache2
4026    elasticsearch
4129    journal
```
Maybe those can be trimmed some? Other ideas?
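One option (a sketch, not from the thread -- the retention size and 30-day cutoff are assumptions, adjust to taste) is to vacuum the systemd journal and delete old rotated logs:

```sh
# See how much journald is holding, then cap it.
journalctl --disk-usage
sudo journalctl --vacuum-size=1G   # keep at most ~1 GB of journal data

# Delete rotated, compressed Apache and Elasticsearch logs older than 30 days.
sudo find /var/log/apache2 /var/log/elasticsearch -name '*.gz' -mtime +30 -delete
```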
Some space has been freed up, and I've slowed the firehose of events into the `apacheperf-1` index by excluding events where the CN is calling itself through the API. That reduces the traffic considerably and buys some time for a more considered solution. The temporary fix was to adjust the Apache config on cn-ucsb-1 like so:
```apache
#Performance logging
# don't log self
SetEnvIf Remote_Addr "128\.111\.85\.180" dontlog
LogFormat "%{%Y-%m-%d}tT%{%T}t.%{msec_frac}t%{%z}t|%m|%>s|%{ms}T|%a|%U|\"%q\"|%{cache-status}e|\"%{User-agent}i\"|%u" performance_log
CustomLog "/var/log/apache2/cn_perf.log" performance_log env=!dontlog
```
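A change like this only takes effect once Apache reloads its configuration; assuming a systemd-managed apache2, something like the following would apply it:

```sh
# Validate the config first, then reload without dropping connections.
sudo apachectl configtest && sudo systemctl reload apache2
```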
This is just a temporary fix to slow the deluge of events. The thing is, I'm not sure this information is needed for the current metrics processing at all -- reviewing the code and Logstash configuration...
Disk usage on `logproc-stage-ucsb-1.test.dataone.org` is running close to 95%. By default, Elasticsearch puts itself into read-only mode when disk usage reaches 95% of capacity, to avoid errors and complications when disks fill completely. The method for recovery is to reduce disk usage and issue the command:
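(The command itself didn't survive here; presumably the standard call that clears Elasticsearch's flood-stage read-only block -- a sketch, assuming Elasticsearch is listening on localhost:9200:)

```sh
# Remove the index-level read-only block that Elasticsearch sets when the
# flood-stage disk watermark (95% by default) is reached.
curl -X PUT 'localhost:9200/_all/_settings' \
  -H 'Content-Type: application/json' \
  -d '{"index.blocks.read_only_allow_delete": null}'
```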
The vast majority of disk use is in the `apacheperf-1` index, currently at around 720 GB, followed by `eventlog-1` at around 144 GB.
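Per-index disk use can be confirmed with the `_cat/indices` API (again assuming the same localhost:9200 endpoint):

```sh
# List indices with their on-disk size, largest first.
curl 'localhost:9200/_cat/indices?v&h=index,store.size&s=store.size:desc'
```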