Closed ivoabx closed 3 months ago
Hey @ivoabx 👋 Could you please share your configuration as well? As far as I understand, you're using Laravel Octane, am I right?
Oh, sorry, I see, you attached it.
Could you please double-check, in that env, that RR has workers by executing `./rr workers`?
@rustatian I don't think I can exec into a Cloud Run container, as it is a fully managed environment based on Knative/k8s.
I think the workers are running fine, as everything was working prior to the stall. In addition, we have logs of workers being spawned successfully:
worker is allocated {"pid": 1674, "max_execs": 0, "internal_event_name": "EventWorkerConstruct"}
The backend is serving traffic until it stops.
If I understand correctly, you hit a super rare condition: your env ran into one of the OS resource limits, such as RLIMIT_NPROC, RLIMIT_NOFILE, or just PID limits. This is likely to happen in a cloud env where the limits are precisely set.

How to resolve that? Avoid aggressive ttl and idle_ttl settings. If you're using Octane, use max_worker_memory to let RR determine when to stop and restart the process. Don't set max_worker_memory to a very low value; it should be the amount of memory your app generally consumes under some projected load, plus ~10% as a buffer.

Thank you very much for the detailed information, I hadn't thought of the PID limits.
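A minimal `.rr.yaml` sketch of this advice might look like the following. The 128 MB figure is purely illustrative, assuming an app that peaks around ~115 MB under projected load:

```yaml
# .rr.yaml (sketch; values are illustrative, not from the attached config)
http:
  pool:
    # Leave ttl/idle_ttl at their defaults (0 = disabled) so workers are
    # not constantly destroyed and re-spawned, which can exhaust PID limits.
    supervisor:
      # Restart a worker once it exceeds this many megabytes.
      # Rule of thumb: typical consumption under load + ~10% buffer.
      max_worker_memory: 128
```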
We are currently running with ttl: 0 and idle_ttl: 0. However, the memory keeps creeping up. Is there a way to deal with that? Is it caused by the application itself or the runtime? The PHP runtime is new to me, I'm used to Go ;)
You may safely use max_worker_memory, and RR will gracefully stop such a memory-heavy worker, so you won't consume a lot of memory.
However, the memory keeps creeping up. Is there a way to deal with that? Is it caused by the application itself or the runtime? The PHP runtime is new to me, I'm used to Go ;)
RR allocates child processes, which are called workers here; your code runs in those processes. RR itself consumes minimal memory, but since you're using Laravel, it is normal to see high memory consumption. To control that, the max_worker_memory parameter exists, which caps memory consumption per process.
Sounds great! Thanks again! Closing this, as we most probably found the solution.
My pleasure 🙂 Please don't hesitate to reopen or just comment here on how you resolved this case. I guess it would be helpful for others who are searching for a similar solution 🙂
No duplicates 🥲.
What happened?
We introduced RoadRunner in our PHP project, which is currently based on Laravel 11. On local machines everything works as expected - much faster than fpm+nginx!
However, when running the workload in Google's Cloud Run, it stalls from time to time for some reason. Once deployed, it starts out working fine, and after some time (at irregular intervals) it stops processing requests; all of them seem to stall. This state is accompanied by the following error messages:
failed to allocate the worker {"internal_event_name": "EventWorkerError", "error": "worker_watcher_allocate_new: WorkerAllocate: failed to spawn a worker, possible reasons: https://docs.roadrunner.dev/error-codes/allocate-timeout"}
allocate retry attempt failed {"internal_event_name": "EventWorkerError", "error": "failed to spawn a worker, possible reasons: https://docs.roadrunner.dev/error-codes/allocate-timeout"}
Locally, I'm able to reproduce it by setting pool.allocate_timeout to less than 3s. This is not our configuration in Cloud Run, though; we are running the default of 60s. We are also running it with ttl: 0 and idle_ttl: 10 so that it does proper memory cleanup. When trying idle_ttl: 0, it obviously does not happen, but memory keeps creeping up.

Do you have any thoughts/guidelines on why this might happen and how to fix it?
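For reference, the settings mentioned above would sit in `.rr.yaml` roughly like this (a sketch of the described setup, not the attached config):

```yaml
# .rr.yaml (sketch reflecting the setup described above)
http:
  pool:
    # Default is 60s; lowering it below ~3s reproduces the
    # "failed to spawn a worker" error locally.
    allocate_timeout: 60s
    supervisor:
      ttl: 0        # never restart workers based on total lifetime
      idle_ttl: 10s # reap workers idle for 10 seconds to free memory
```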
Thank you!
Regards, Ivo
Version (rr --version)
2024.1.2
How to reproduce the issue?
Deploy to Cloud Run. It will happen at some point when serving requests.
rr.txt
Relevant log output