Open dominics opened 11 months ago
I'd strace
it for you, to see what it's doing with all that CPU usage, but I'm on Mac OS right now (SIP makes that sort of thing pretty hard these days). I'll try to repro on a Linux machine soon if you don't get to it first.
All the debug logs are silent about where the usage is going (particularly pubsub-debug.log
offers no clues)
Are there any news on this issue? We are experiencing exactly the same.
In our case this is happening while running test suites. While running a test suite, many topics and subscriptions are created. When the test suite is finished, the topics and subscriptions are removed (after purging them), and the next test suite is started with a similar behaviour. We have many thousand of tests and many hundred of test suites running in parallel (in batches of parallel test suites 5-6 of course).
Then after letting the whole run for a while, the cpu leak happens We stop the run of the tests, and the only thing we can observe in the emulator logs are messages like the following:
com.google.cloud.pubsub.testing.v1.PushLoop push
INFO: Failed to push to endpoint 'http://localhost:62099/projects/aaaaaaa/topics/bbbbbb/function/cccccccc: Connection refused
for each of the subscriptions that were created inside the different test scenarios, but now don't exist anymore because they were destroyed by the teardown of the test suite. It seems that although we purged the subscription before deleting it, there were still some events flying in the emulator, and now they can not be delivered.
After letting many hundred of test suites run, the number of events that can not be delivered because the corresponding test suites are already done and the subscriptions where removed is so high, that the log of the emulator is bloated with this kind of errors, and we wonder if this could be related with the cpu leak.
Regards.
Synopsis: When the PubSub emulator is running, and the "background triggers" (for Functions) are toggled off and on, the PubSub emulator starts to consume more and more CPU, until it is eventually consuming all available CPU.
[REQUIRED] Environment info
firebase-tools: 13.0.0 Also tested with 12.4.7
Platform: MacOS
[REQUIRED] Test case
Initial project setup:
Set up some functions,
functions/src/index.ts
looks like this:This is a minimal setup, but I first noticed this problem on a real, larger project. (And I think it's easier to cause to happen the more functions there are defined).
[REQUIRED] Steps to reproduce
Start the emulators:
We're looking for at least the PubSub and Functions emulators to be started. The setup described above will also start the Firestore emulator.
Now, we want to turn background triggers on and off repeatedly:
Modify if you're using a different Emulator Hub address or port. Let this run for a while. The functions will be reinitializing with each run:
With only these two functions defined, you may need to let this script run for a few minutes (at least a hundred or two hundred toggles) before the actual behavior described below becomes very visible. But with many different PubSub functions defined (e.g. in a real project) the behavior is much more visible, because it happens with far fewer toggles.
[REQUIRED] Expected behavior
Monitor the CPU usage of the PubSub emulator. We'd expect it to be using some small amount of CPU to reinitialize (because each reinitialization of the functions might be creating a topic?)
[REQUIRED] Actual behavior
Monitoring the CPU usage of the PubSub emulator, what we actually see is it using more and more CPU, even if we pause or stop the script that is toggling the background triggers.
If you continue to toggle the background triggers, eventually the PubSub emulator will consume all available CPU on the machine. This can happen quite rapidly with many PubSub triggered functions defined.
This additional CPU usage never goes away e.g. if you completely stop the script that is toggling the background triggers, the CPU usage of the PubSub emulator does not ever go back down to idle levels of CPU usage. (e.g. it will continue to use 200% or 400%)
Notes