majamassarini closed this 7 months ago
Build failed. https://softwarefactory-project.io/zuul/t/packit-service/buildset/f1cc474cece943c4bdfd2ea358448fad
:heavy_check_mark: pre-commit SUCCESS in 1m 44s :x: deployment-tests RETRY_LIMIT in 7m 27s
Build failed. https://softwarefactory-project.io/zuul/t/packit-service/buildset/e8de33af4b3f469e9ea99bc7984e4b27
:heavy_check_mark: pre-commit SUCCESS in 1m 46s :x: deployment-tests RETRY_LIMIT in 7m 29s
I would recommend backing up the stage database and restoring a dump of the production one in its place, because I doubt this issue is reproducible on the “small-scale” contents of the stage database.
Luckily it is really easy to check. I just need to enter the postgres pod with `rsh` and run `df -h /dev/shm`.
If the size is still 64MB, then we still have the problem, the PR is useless, and we should pick one of the other two solutions.
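The same check can be sketched in Python, in case it is easier to script than reading the `df` output by eye. This is only an illustration: the function names are hypothetical, the 64MB threshold is the default mentioned in this thread, and it assumes a Python interpreter is available wherever it runs, which may not be the case inside the postgres pod.

```python
import shutil

# 64MB is the default shm size mentioned in this thread.
DEFAULT_SHM_BYTES = 64 * 1024 * 1024


def shm_is_resized(total_bytes: int, default_bytes: int = DEFAULT_SHM_BYTES) -> bool:
    """Return True when the shm mount is larger than the default size."""
    return total_bytes > default_bytes


def check_shm(path: str = "/dev/shm") -> bool:
    """Report the total size of the shm mount and whether it was resized."""
    total = shutil.disk_usage(path).total
    print(f"{path}: {total / 2**20:.0f} MiB")
    return shm_is_resized(total)
```

Calling `check_shm()` in the pod would return `False` while the mount is still at the 64MB default, which is the case where this PR has no effect.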
I would say that one of the options would be increasing the memory (though we don't have much left… and we're just running 2/2 workers; we have one more worker than usual for `short-running`, but it's still less than what I set up for the redis upgrade a month ago or so).
The problem does not seem to be related to memory in general. My local pods had plenty of memory, and the exception kept occurring as long as the shared memory was 64MB. If we are not able to increase the shared memory, I fear we cannot solve it.
- redesign db queries in usage page
That would probably be ideal, as it is the only API endpoint causing issues; IMO it would probably be better to go with raw SQL queries instead of the ORM, which adds complexity because of the abstractions present…
I remember already working on the raw queries once; at the time we decided to stay with the ORM. But if we have no other solution, this will probably be the easiest way. For the usage pages we can probably also use views
...
This PR was not working because the `deploy` target does not mount the created volume. If you mount the volume manually, the shm is resized.
By default the shm size is 64MB, while the dashboard usage page uses around 100MB.
I can reproduce the error described in packit/packit-service#2385 locally using podman compose, and I can fix it by setting `shm_size: 128Mi` in `docker-compose.yml`.
I tried deploying these OpenShift changes on my local OpenShift cluster, but even though the new volume is created and mapped, the size didn't change: `df -h /dev/shm` is always 64MB.
I would like to test this on stg, to be sure the problem is not related to my local OpenShift cluster.
But, as far as I understand it, there is a `SizeMemoryBackedVolumes` k8s feature gate, which is not enabled in the cluster by default (probably because this is a dangerous feature https://github.com/kubernetes/kubernetes/issues/119611). If it is not enabled in the cluster, resizing cannot occur.
If this PR does not work on stg, I think we have 2 solutions: