mykaul opened this issue 2 years ago
We have the same amount as before (2G); increasing it would mean reinstalling. I see that the munin graphs are not working, so I would need to fix that first to see if anything is wrong.
But here, isn't "cannot allocate memory" a generic error message? I see no out-of-memory error on that builder.
I did not think it came from Gluster, and 2G seems quite low to me for tests. I wonder if we can add some metrics to make sure we are not hitting the limit from time to time. Is swap enabled on the hosts?
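For what it's worth, a quick way to check on a builder whether swap is on and how memory is being used (this assumes shell access to the host; the commands are standard procps/util-linux tools):

```sh
# Overall memory and swap usage, human-readable
free -h

# List active swap devices/files; prints nothing if swap is disabled
swapon --show

# Ongoing memory/swap/IO activity, sampled every 5 seconds
vmstat 5
```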
I think that since cat is being run on a file backed by a FUSE filesystem, the "cannot allocate memory" message could be a symptom of an underlying error (like FUSE returning an error code where cat does not expect one, or something like that).
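One way to confirm that theory would be to trace the failing cat and see which syscall actually returns ENOMEM; a rough sketch (the path below is only a placeholder, not the real test mount):

```sh
# If openat()/read() on the FUSE mount returns -1 ENOMEM, the error is
# coming back from the filesystem itself, not from cat running out of
# memory. Placeholder path: adjust to the actual test mount.
strace -f -e trace=openat,read cat /mnt/glusterfs/testfile
```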
2G is low, but it was more than enough when we sized the builders back in the day: nothing required more than 1G to run the tests, and we increased to 2G for dnf/yum/etc.
We have no swap on the builders for the moment.
I think we should first look at the graphs (which I need to fix), see what is needed, and then reinstall, add swap, or apply a fix once we have isolated the problem.
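If adding swap does turn out to be the answer, a swap file would avoid reinstalling the builders; a rough sketch (size and path are just examples, and I use dd rather than fallocate to stay safe on older kernels/filesystems like the CentOS 7 builders):

```sh
# Create a 2G swap file, lock down permissions, format and enable it
dd if=/dev/zero of=/swapfile bs=1M count=2048
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile

# Make it persistent across reboots
echo '/swapfile none swap defaults 0 0' >> /etc/fstab
```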
Is the graph fixed now?
So they were fixed, and they broke again:
https://munin.gluster.org/munin/aws.gluster.org/builder-c7-3.aws.gluster.org/index.html
I need to fix them again :/
However, we do have the monthly graph: https://munin.gluster.org/munin/aws.gluster.org/builder-c7-3.aws.gluster.org/memory.html
And while the colors are a bit messy, we can see that the majority of the memory is used by the fs cache.
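That is probably fine in itself, since the kernel reclaims page cache on demand; the "available" column from free is a better signal than "free". Two quick checks (the drop_caches one is diagnostic only and will briefly hurt I/O performance):

```sh
# "available" already discounts reclaimable page cache, so it is the
# number to watch when most memory shows up as cache.
free -h

# Diagnostic only: flush page cache, dentries and inodes to confirm the
# cached memory really is reclaimable (needs root).
sync && echo 3 > /proc/sys/vm/drop_caches
```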
From https://build.gluster.org/job/gh_centos7-regression/2751/consoleFull (which may be a legit failure?):
On the 2nd attempt, the test passed.
Do we have enough memory on our nodes?