hiimdoublej-swag closed this issue 2 years ago
Thank you for reporting this. This seems to be caused by a conflict in signal handling in the Profiler agent and gevent/greenlet.
Thanks for reporting this. Could you share some code and the setup that we can use to reproduce the error?
Sure, I'll follow up with some basic setups.
Dear @jqll: I've put together a minimal setup to reproduce the problem. Please have a look here and let me know if there's anything I can provide to help you diagnose the issue. Thanks.
@hiimdoublej-swag thank you! We will take a look and get back to you.
gentle ping :)
Hi @hiimdoublej-swag, thanks for pinging. Sorry, I was kept busy by something else and only just got some time to look into this.
To give some context, the profiler starts a daemon thread that continuously collects and uploads profiles. When it collects the CPU profile, it calls into a long running C function which collects the profile with low overhead.
The problem is that the celery app with the gevent option seems unable to handle tasks while another thread is calling a C function. The problem is not profiler specific: I can reproduce it by calling into a C function that just sleeps. I guess it's related to how gevent creates green threads, but I'm not enough of a gevent expert to say whether there is a way to work around this. I may be able to look more into gevent tomorrow. But there is unlikely to be anything that can be fixed on the profiler side, since it has to call into a long running C function.
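The setup described above can be approximated without the profiler. The following is a minimal sketch, not the profiler's actual code: it assumes a POSIX system and uses libc's sleep() through ctypes as a stand-in for the long running C function. The names blocking_c_call and count_ticks_during_c_call are hypothetical.

```python
import ctypes
import ctypes.util
import threading
import time

# Assumption: a POSIX system whose libc exposes sleep(). ctypes releases the
# GIL around foreign calls, similar to a C extension that releases it itself.
libc = ctypes.CDLL(ctypes.util.find_library("c"))


def blocking_c_call(seconds: int) -> None:
    """Hypothetical stand-in for the profiler's long running C function."""
    libc.sleep(seconds)


def count_ticks_during_c_call(seconds: int = 1) -> int:
    """Run the C call in a daemon thread and count main-thread progress."""
    worker = threading.Thread(
        target=blocking_c_call, args=(seconds,), daemon=True
    )
    worker.start()
    ticks = 0
    while worker.is_alive():
        time.sleep(0.05)  # the main thread keeps doing small units of "work"
        ticks += 1
    return ticks


if __name__ == "__main__":
    # With real OS threads the main thread should keep ticking while the
    # C call sleeps. A very low count suggests the C call blocked it.
    print("main-thread ticks during C call:", count_ticks_during_c_call())
```

Running the same function inside a gevent-patched process and comparing the tick counts is one way to see whether the C call stalls everything else in that environment.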
Another gentle ping :)
Seeing the same in eventlet. I suspect the problem is cloud profiler using greened modules (like threading), thus doing its thread work in a greenthread and locking up the event loop.
Hi seizethedave@, cloud profiler does create a new thread using threading.Thread. But I think that creates an OS thread? Could you elaborate a bit if you think the threading module is the problem?
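One way to check the greened-threading hypothesis is to ask gevent whether it has patched the threading module in the running process. This is a small sketch, assuming gevent's monkey.is_module_patched helper. The function name threading_is_green is hypothetical, and the check simply returns False when gevent is not installed.

```python
import threading


def threading_is_green() -> bool:
    """Return True if gevent has monkey-patched the threading module."""
    try:
        from gevent import monkey
    except ImportError:
        return False  # gevent is not installed, so nothing can be patched
    return monkey.is_module_patched("threading")


if __name__ == "__main__":
    print("threading patched by gevent:", threading_is_green())
    print("current thread:", threading.current_thread().name)
```

If this reports True inside the worker at the point where the profiler starts, a thread created with threading.Thread is likely scheduled as a greenlet rather than as a separate OS thread, which would fit the behavior described in this thread.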
I created https://github.com/jqll/celery-c-function to reproduce the problem that celery -P gevent can't dispatch tasks while another thread is calling into a long running C++ extension function, even if that function releases the GIL. I posted a question on https://groups.google.com/g/celery-users, though I don't see it in the discussion group yet. It may need some time to become public. I'll post here when it's visible.
Here is the question I posted on celery-users: https://groups.google.com/g/celery-users/c/_QY3cVd4tp0.
So I guess this won't be supported, right?
@hiimdoublej-swag have a look at issue
import grpc.experimental.gevent as grpc_gevent
grpc_gevent.init_gevent()
I think this may solve your problem if you run the snippet above before spawning the celery worker.
Closing old issue.
Dear maintainers: I have a celery application that processes asynchronous tasks. I ran the worker with the -P gevent option, and with the profiler enabled it would hang for 10 seconds every so often. The profiler was initialized before the celery application comes up. When I set verbose=3 on googlecloudprofiler.start(), I observed that the hiccup happens between the log messages "Successfully created a CPU profile" and "Starting to upload profile".

My configuration works with either of the following changes: setting disable_cpu_profiling=True on googlecloudprofiler.start(), or removing the -P gevent option from the celery worker command.

Would it be related to the GIL? Could someone take a look at it? If there's not enough information, please feel free to ask me for it. Thanks.
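For reference, the profiler settings mentioned in this report can be sketched as follows. This is an illustration rather than the reporter's actual code: the service name and version are hypothetical placeholders, and the helper names profiler_options and start_profiler are made up for this example. The verbose and disable_cpu_profiling arguments are the ones named in the report.

```python
def profiler_options(disable_cpu: bool = False) -> dict:
    """Build keyword arguments for googlecloudprofiler.start()."""
    return {
        "service": "celery-worker",  # hypothetical service name
        "service_version": "1.0.0",  # hypothetical version
        "verbose": 3,  # debug logging, as used in the report
        "disable_cpu_profiling": disable_cpu,  # True avoids the hang
    }


def start_profiler(disable_cpu: bool = False) -> bool:
    """Start the profiler if the agent is installed; report success."""
    try:
        import googlecloudprofiler
    except ImportError:
        return False  # agent not installed in this environment
    try:
        googlecloudprofiler.start(**profiler_options(disable_cpu))
    except (ValueError, NotImplementedError) as exc:
        print("profiler not started:", exc)
        return False
    return True


if __name__ == "__main__":
    # Call this before the celery application comes up, as in the report.
    start_profiler(disable_cpu=True)
```

Keeping the options in a separate function makes it easy to toggle disable_cpu_profiling while comparing worker behavior with and without -P gevent.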