Open rocketsciencenerd opened 4 years ago
Thanks for reporting this issue, @rocketsciencenerd. I think it could be related to running radar-output as part of the stack, while the specifications were based on an older version of RADAR-Docker where we ran radar-output as a systemctl service with an interval. The current configuration on radar-output
RADAR_HDFS_RESTRUCTURE_OPTS: -Xms250m -Xmx4g
seems to consume (not necessarily continuously) up to 4 GB for this container alone, which may have caused the OOM issue on the VM, since the rest of the platform also requires memory.
That is why lowering -Xmx to 2g seems like a solution to me. Would it be possible for your system admin to check which container caused the OOM?
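If it helps with that check, here is a minimal sketch that scans the kernel log for OOM-killer victims. It assumes the log contains lines of the form "Killed process <pid> (<name>)", which is what recent Linux kernels typically write; the exact wording can vary by kernel version. The function names and the sample line in the usage note are illustrative only.

```python
import re
from collections import Counter

# Assumed kernel message shape: "... Killed process 4242 (java) total-vm:..."
KILLED = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def oom_kills(lines):
    """Return (pid, process_name) for every OOM-killer victim found in `lines`."""
    kills = []
    for line in lines:
        match = KILLED.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills


def summarize(path="/var/log/kern.log"):
    """Count OOM kills per process name in the log file at `path`."""
    with open(path, errors="replace") as handle:
        return Counter(name for _, name in oom_kills(handle))
```

Calling summarize() on the VM (it may need root to read the log) gives a count per process name. Several containers in the stack run Java, so a name like "java" alone does not identify the container; the pid and the timestamp on the log line are the starting points for narrowing it down.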
Hi, nivethika's suggestion sounds good.
Not sure if you already know this, but you can use docker stats
to see how many resources each container is consuming.
@yatharthranjan Is there a way to get a history of docker container usage? Looks like the stats
command only displays current usage.
Hi, I am not aware of a straightforward way to do that. I use netdata for monitoring the cgroups and the VM as a whole; it can be deployed in Docker itself. There are other tools you can use for this too; a quick Google search should reveal some.
cAdvisor is also relatively easy to set up (you can also use Prometheus): https://github.com/google/cadvisor https://prometheus.io/docs/guides/cadvisor/ but there are a bunch of other options too.
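For a quick history without deploying a monitoring tool, one low-tech option is to poll docker stats at an interval and append the snapshots to a file. The sketch below is only that, a sketch: the function names, the CSV file name, and the 60-second interval are arbitrary choices, and it assumes the docker CLI is on the PATH of the user running it.

```python
import csv
import subprocess
import time
from datetime import datetime, timezone

# Go-template format for `docker stats`, with tab-separated fields.
STATS_FORMAT = "{{.Name}}\t{{.MemUsage}}\t{{.MemPerc}}\t{{.CPUPerc}}"


def parse_stats(output):
    """Split tab-separated `docker stats` output into one row per container."""
    rows = []
    for line in output.splitlines():
        fields = line.split("\t")
        if len(fields) == 4:
            rows.append(fields)
    return rows


def sample():
    """Take one snapshot of all running containers."""
    result = subprocess.run(
        ["docker", "stats", "--no-stream", "--format", STATS_FORMAT],
        capture_output=True, text=True, check=True,
    )
    return parse_stats(result.stdout)


def record(path="docker-stats-history.csv", interval_seconds=60):
    """Append a timestamped snapshot to `path` every interval until interrupted."""
    with open(path, "a", newline="") as handle:
        writer = csv.writer(handle)
        while True:
            now = datetime.now(timezone.utc).isoformat()
            for row in sample():
                writer.writerow([now, *row])
            handle.flush()  # keep the file current in case the VM hangs
            time.sleep(interval_seconds)
```

Running record() in the background (for example in a tmux session) leaves a CSV that can be inspected after the next hang to see which container's memory was growing.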
We recently had to increase our memory requirements to 24 GB on the base system. I think it would be wise to make that the base requirement. To avoid OOM while accepting degraded performance, consider enabling swap (see e.g. https://www.digitalocean.com/community/tutorials/how-to-add-swap-space-on-ubuntu-18-04).
Just to let you know that I have a similar issue on the same machine configuration (16 GB RAM): some containers (Kafka and HDFS) are hanging. I resized to 32 GB to test whether that solves the issue.
I have a server running the latest radar-docker that hangs periodically with the latest update to managementportal version 0.5.8 and radar-output:0.6.0. The VM runs out of memory and hangs; after I restart the VM everything runs again, but then it runs out of memory and I have to restart it again. This is confirmed by the logs in the
/var/log/kern.log
file below:
My docker container setup is below:
My VM specs match the recommended specs from https://radar-base.org/index.php/documentation/introduction/: a 4-core CPU; 16 GB memory; an SSD for the operating system and Docker (at least 50 GB); 1 x 1 TB spinning disks for redundancy.
Per @nivemaham's recommendation I am going to try changing this line: https://github.com/RADAR-base/RADAR-Docker/blob/master/dcompose-stack/radar-cp-hadoop-stack/docker-compose.yml#L827
to
RADAR_HDFS_RESTRUCTURE_OPTS: -Xms250m -Xmx2g
Hope this helps others - may also be worth looking into on the master branch.