Closed myieye closed 3 weeks ago
For a start, you should force use of the main Python interpreter context, because various Python packages that use C extensions do not work properly in Python sub interpreters, which can result in deadlocks or crashes. So set:
WSGIApplicationGroup %{GLOBAL}
See:
Two other reasons why you may see daemon processes crash are:
So check how much memory the daemon processes are using and whether the system is getting low on memory.
Check the main Apache error log file (not the virtual host specific logs) for any messages which suggest daemon processes crashed due to a core dump/segmentation fault.
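As a rough aid for that log check, here is a minimal Python sketch that flags lines which look like a child process dying from a signal. The regular expression is an assumption based on the "exit signal ... (N)" wording Apache commonly uses for crashed children; mod_wsgi's own messages may be worded differently, so adjust the pattern to whatever your logs actually contain. The name `find_crash_lines` is hypothetical.

```python
import re

# Assumption: crashed children are reported with wording like
# "child pid 42 exit signal Segmentation fault (11)".
# Adjust this pattern to match what your own error log contains.
CRASH_PATTERN = re.compile(
    r"exit signal (?P<signal>[A-Za-z ]+?) \((?P<signo>\d+)\)"
)


def find_crash_lines(lines):
    """Return (line_number, signal_name, signal_number) for crash-like lines.

    `lines` can be any iterable of strings, such as an open file or sys.stdin.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        match = CRASH_PATTERN.search(line)
        if match:
            hits.append(
                (number, match.group("signal"), int(match.group("signo")))
            )
    return hits
```

Passing `sys.stdin` to `find_crash_lines` lets you pipe container logs through it when the error log goes to stderr rather than to a file.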
Thanks Graham, I'm on the same team as myieye. Thanks for your help.
I'll go ahead and set the application group. But if that were the case, wouldn't the crash be consistent?
I don't think it's an OOM issue. We're running this in k8s with a memory limit of 400 MB; if usage went over that limit, the pod would be restarted and we would see a warning about it. I don't think the pod has ever been restarted, so I don't think it's running out of memory.
Since this is a Docker container based on httpd:2.4-bookworm, I'm fairly certain logrotate isn't running; it's not even installed.
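One thing worth ruling out before dismissing memory: the kernel OOM killer can terminate a single process inside a container's cgroup, and if that process is not the container's main process, the pod may keep running without a restart. The sketch below, to be run inside the container, reads the cgroup's own counters. It assumes cgroup v2 mounted at `/sys/fs/cgroup` (an assumption about your nodes; cgroup v1 uses different file names), and the function names are hypothetical.

```python
from pathlib import Path

# Assumption: cgroup v2 is mounted at the usual location inside the container.
CGROUP_DIR = Path("/sys/fs/cgroup")


def parse_memory_events(text):
    """Parse 'key value' lines (the memory.events format) into a dict of ints."""
    events = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            events[parts[0]] = int(parts[1])
    return events


def read_memory_state(cgroup_dir=CGROUP_DIR):
    """Return usage, limit and OOM kill count; None where a value is unavailable."""

    def read(name):
        try:
            return (Path(cgroup_dir) / name).read_text().strip()
        except OSError:
            return None

    current = read("memory.current")
    limit = read("memory.max")  # the literal string "max" means no limit
    events = read("memory.events")
    return {
        "current_bytes": int(current) if current and current.isdigit() else None,
        "limit_bytes": int(limit) if limit and limit.isdigit() else None,
        "oom_kill": parse_memory_events(events).get("oom_kill") if events else None,
    }
```

A non-zero `oom_kill` value would suggest that something in the container was killed for exceeding the limit, even if the pod itself never restarted.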
Crashes when using C extension based Python packages in a sub interpreter, when they are not designed for that, need not be consistent or reproducible. It all depends on what the C extension does. So try setting WSGIApplicationGroup as explained in the first step and then see how things go.
It looks like this was an out of memory issue. We increased the memory available on that k8s pod and it seems to be working better.
Closing this issue at this point.
Note that another reason one may see the original error messages is if the request-timeout feature has been triggered. In this case you should also see messages in the logs indicating a request timeout occurred.
Hi there,
I'd be very grateful for any help someone may have to offer us on this!
We've built our own mercurial server around hgweb and are using wsgi. Usually everything works fine, but a couple times a week we run into this error:
As a result, the hgweb process dies and we end up with a corrupted Mercurial repository that takes some work to repair.
Here's the gist of what we're using: Apache/2.4.58 (Unix) mod_wsgi/4.9.4 Python/3.11
And our Apache config:
My impression is that most users who encounter this error hit it consistently or can't use wsgi at all, but for us it's very unpredictable and so far we haven't found a way to reproduce it. So I don't expect it has anything to do with the Python packages we're using, and unfortunately we have no idea how to debug it.
We've now instrumented wsgi as well in the hope that that would help us uncover the root cause, but that hasn't proved to be very meaningful.
Here's a larger log snippet. As you can see we have Apache instrumented with OpenTelemetry:
Lines 3 and 4 are requests to two different repos. And that seems to be a bit of a pattern. Is it at all possible that the header-buffer can be affected by simultaneous requests?
Here are some more occurrences:
But this occurrence doesn't follow that pattern. At least not as strictly: