Open damianhorna opened 3 years ago
This crash (exit code 139) is usually due to a resource shortage, e.g. memory. For numbers on notifications, you can have a look at our load-testing repository: https://github.com/FIWARE/load-tests/ We publish numbers for two different notification scenarios; you can find them, for example, here: https://fiware.github.io/load-tests/testReports/orion-ld/tiny/reports/ld/EntityUpdateWithSubscriptionSimulation/gatling-report.html
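For reference, exit codes above 128 conventionally mean "terminated by signal (code - 128)", so 139 corresponds to signal 11 (SIGSEGV on Linux), whereas an out-of-memory kill usually shows up as 137 (SIGKILL). A minimal sketch of that decoding, assuming a Linux host; the helper name `describe_exit_code` is made up for illustration:

```python
import signal


def describe_exit_code(code: int) -> str:
    """Translate a shell/container exit code into a readable description.

    Hypothetical helper: relies on the common convention that a process
    killed by signal N is reported with exit code 128 + N.
    """
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform
            name = f"unknown signal {signum}"
        return f"killed by {name}"
    return f"exited on its own with status {code}"


if __name__ == "__main__":
    for code in (0, 137, 139):
        print(code, "->", describe_exit_code(code))
```

If the container really was killed for lack of memory, one would normally expect 137 rather than 139, so it is worth checking both possibilities.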
Thanks @wistefan for your response!
I actually had some time today to monitor the resource usage of the various containers with docker stats.
I do not limit RAM or CPU usage in the Docker settings, so in theory all of the host's resources are available for the containers to consume.
From what I can see, memory usage stays pretty consistent regardless of the load (under the conditions I tested): around 0.11% for the mongo container and 0.03% for orion-ld (I have 32 GB of RAM on my machine). The only thing I noticed changing with the load is CPU usage.
The bigger the load, the higher the CPU usage of orion-ld and mongo (obviously), and the sooner the app crashes. However, it still crashes even when the load is relatively low (around 1% CPU for both mongo and orion), only after a longer period of time.
When I simulate load at 0.02 s intervals, the app crashes after 1-3 minutes. When I simulate load at 0.1 s or 0.5 s intervals, it takes longer to crash: anywhere from 5 to 30 minutes, and it is hard to predict.
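To make observations like these easier to compare across runs, the docker stats output can be logged periodically instead of being read off the terminal. The following is only a sketch: it assumes the `docker` CLI is on the PATH, and the container names in the example are placeholders, not taken from the actual setup.

```python
import subprocess
import time

# Go-template format understood by `docker stats --format`
STATS_FORMAT = "{{.Name}}\t{{.CPUPerc}}\t{{.MemPerc}}"


def parse_stats(output: str) -> dict:
    """Parse tab-separated `docker stats` lines into {name: (cpu%, mem%)}."""
    stats = {}
    for line in output.strip().splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, cpu, mem = line.split("\t")
        stats[name] = (float(cpu.rstrip("%")), float(mem.rstrip("%")))
    return stats


def sample_stats() -> dict:
    """Take one snapshot of all running containers."""
    result = subprocess.run(
        ["docker", "stats", "--no-stream", "--format", STATS_FORMAT],
        capture_output=True, text=True, check=True,
    )
    return parse_stats(result.stdout)


if __name__ == "__main__":
    # Log a snapshot every 5 seconds until interrupted with Ctrl+C
    while True:
        print(time.strftime("%H:%M:%S"), sample_stats())
        time.sleep(5)
```

The last snapshots before the crash then show whether memory or CPU was actually climbing when the container exited.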
What is the best way to find the exact cause of the crash (exit code 139)?
As mentioned, this didn't happen for the regular fiware/orion.
Thanks!
If the crash is because of the broker itself and not the container, try starting the broker inside gdb. However, it kind of seems like it's a "container related" crash.
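One way to tell the two cases apart before reaching for gdb is to look at the state Docker recorded for the dead container: `docker inspect` reports both the exit code and whether the kernel's OOM killer was involved. A rough sketch, assuming the container still exists after the crash (i.e. it was not started with `--rm`); the container name `orion-ld` is a placeholder:

```python
import json
import subprocess


def classify_state(inspect_json: str) -> str:
    """Summarise the State section of `docker inspect` output.

    `docker inspect` prints a JSON array with one object per container;
    only the first entry is examined here.
    """
    state = json.loads(inspect_json)[0]["State"]
    code = state.get("ExitCode", 0)
    if state.get("OOMKilled"):
        return f"OOM-killed by the kernel (exit code {code})"
    return f"not OOM-killed (exit code {code})"


def inspect_container(name: str) -> str:
    """Run `docker inspect` for one container and classify its state."""
    result = subprocess.run(
        ["docker", "inspect", name],
        capture_output=True, text=True, check=True,
    )
    return classify_state(result.stdout)


if __name__ == "__main__":
    # "orion-ld" is a placeholder; use the name shown by `docker ps -a`
    print(inspect_container("orion-ld"))
```

If this reports "not OOM-killed" together with exit code 139, the crash more likely originates in the broker process itself, and running it under gdb would be the next step.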
Hi,
I run orion-ld with docker-compose like this:
Then I create entities and subscriptions and start generating data. The load is pretty heavy (several notifications every 0.02 s across multiple processes), and orion-ld crashes after a nondeterministic amount of time with exit code 139:
The problem also happens if I decrease the load to, say, several notifications every 0.1 s (across multiple processes), but only after a longer period of time.
It seems like the issue is mongo-related. This is strange though, because I didn't have similar issues when using the original fiware/orion.
Any advice on how to solve this problem would be much appreciated!