Baldinof / roadrunner-bundle

A RoadRunner worker integrated in your Symfony app
MIT License
270 stars 48 forks source link

Kernel terminate event some times doesn't get fired? (web profiler) #74

Closed nocive closed 2 years ago

nocive commented 2 years ago

Hi all :wave:

I have a symfony 5.4 application using the symfony runtime component and this bundle.

The application itself seems to be working (based on some rough testing done so far) but I'm facing a weird problem with the symfony developer tools, more specifically, the web profiler & web developer toolbar.

I'm observing a situation where the kernel terminate event doesn't seem to be fired at all, which then causes a profiler report not to be saved to the profiler storage, consequently resulting in a permanent 404 when attempting to fetch the /_wdt/{token} url via xhr.

I've added some simple debug statements to the Symfony\Component\HttpKernel\EventListener\ProfilerListener class which allowed me to confirm that the onKernelTerminate function in this class is only called some times. The listener for this event has a priority of -1024, but changing the priority didn't seem to make any difference. I have tried different kernel reboot strategies, including always, which also didn't make any difference.

I haven't been able to figure out more about this issue but I'm wondering if anyone in the community has any more insights about what could be going on here. Some tips about where to further look would also be appreciated.

I'm using symfony:5.4.4, roadrunner-bundle:2.2.0 and roadrunner:2.8.2

Thanks!

References:

Baldinof commented 2 years ago

Hi! Thank you for trying out this bundle :)

It work for me, can you provide a reproducer repository?

The code that calls terminate is here, I tried to had some echo here and it's called. Is it on your side?

nocive commented 2 years ago

Hi @Baldinof :wave:

Thanks for the quick reply and thank you for making this bundle ;)

The terminate call that you pointed out also only happens sometimes, and the times that it gets called the ProfilerListener::onKernelTerminate also gets called (as one would expect). So the problem does indeed seem to stem from the fact that in some cases this piece of code doesn't run at all.

I placed 3 debug statements in the handle function and sent a unique uuid header with curl, like so:

file_put_contents('/tmp/debug.log', $request->headers->get('x-uid') . ': before instanceof TerminableInterface' . PHP_EOL, FILE_APPEND);
file_put_contents('/tmp/debug.log', $request->headers->get('x-uid') . ': is instanceof TerminableInterface: ' . var_export($this->kernel instanceof TerminableInterface, true). PHP_EOL, FILE_APPEND);
if ($this->kernel instanceof TerminableInterface) {
        file_put_contents('/tmp/debug.log', $request->headers->get('x-uid') . ': calling kernel->terminate' . PHP_EOL, FILE_APPEND);
        $this->kernel->terminate($symfonyRequest, $symfonyResponse);
}

This is what a normal looking request produces

a5303634-4a9b-49df-8aef-7c6aa4b0d988: before instanceof TerminableInterface
a5303634-4a9b-49df-8aef-7c6aa4b0d988: is instanceof TerminableInterface: true
a5303634-4a9b-49df-8aef-7c6aa4b0d988: calling kernel->terminate

This is what "bad" requests produce:

9be52d23-c70f-4d25-b7cd-32f60846ce5b: before instanceof TerminableInterface

0610eb60-d3d1-439f-aac2-7ed2c1a29af8: before instanceof TerminableInterface

a3c76b0b-aade-47d7-9ec0-86bf79e0a365: before instanceof TerminableInterface

And some times, nothing is produced to the debug log at all. In all cases a response is still received and the app seemingly works because it doesn't really do anything with the kernel terminate event.

I will try to provide a reproduction scenario once I find some time, but I suspect that I won't be able to reproduce it and that the issue we're seeing here is caused by our own app code somehow. In any case, it does seem to indicate at the possibility of a bug, since I would expect the kernel terminate event to happen regardless of "bad" application code.

Baldinof commented 2 years ago

Are you in some way overriding the kernel service?

What's the result of bin/console debug:container kernel?

nocive commented 2 years ago
Information for Service "kernel"
================================

 ---------------- ------------------------------ 
  Option           Value                         
 ---------------- ------------------------------ 
  Service ID       kernel                        
  Class            App\Kernel                    
  Tags             controller.service_arguments  
                   routing.route_loader          
  Public           yes                           
  Synthetic        yes                           
  Lazy             no                            
  Shared           yes                           
  Abstract         no                            
  Autowired        no                            
  Autoconfigured   yes                           
 ---------------- ------------------------------
Baldinof commented 2 years ago

That's strange, do you have the same problem if you reduce RR to only one worker?

With config http.pool.num_workers: 1 in .rr.yaml

nocive commented 2 years ago

Yes, that doesn't seem to make any difference either. For the record, here's my rr configuration:

version: '2.7'

server:
  command: "php public/index.php"
  env:
    - APP_RUNTIME: Baldinof\RoadRunnerBundle\Runtime\Runtime

http:
  address: 0.0.0.0:80
  middleware: ["static", "gzip"]
  pool:
    num_workers: 1
    max_jobs: 1
    supervisor:
      ttl: 0
      max_worker_memory: 16
  uploads:
    forbid: [".php", ".exe", ".bat"]
  static:
    dir: "public"
    forbid: [".php", ".htaccess"]

logs:
  mode: development
  channels:
    http:
      level: debug # Log all http requests, set to info to disable
    server:
      level: info # Everything written to worker stderr is logged
      mode: raw
    metrics:
      level: debug

reload:
  enabled: true
  interval: 1s
  patterns: [".php", ".yaml"]
  services:
    http:
      dirs: ["."]
      recursive: true

PHP version is 8.0.16

nocive commented 2 years ago

Ok... I think I cracked the case... :man_facepalming: It was a memory problem, not enough memory was allocated. Once I bumped the max_worker_memory, the problem went away. The real issue then is that somehow the php fatal error is completely swallowed by roadrunner. Is this a known bug upstream?

Baldinof commented 2 years ago

Wow, I would not have found this one :D

I tried this out and indeed, nothing is logged. If you change the log level logs.channels.server.level: debug you get something like this:

memory_limit    {"reason": "max memory is reached", "pid": 40466, "internal_event_name": "EventMaxMemory"}
worker is allocated {"pid": 40471, "internal_event_name": "EventWorkerConstruct"}
memory_limit    {"reason": "max memory is reached", "pid": 40471, "internal_event_name": "EventMaxMemory"}
worker is allocated {"pid": 40476, "internal_event_name": "EventWorkerConstruct"}
memory_limit    {"reason": "max memory is reached", "pid": 40476, "internal_event_name": "EventMaxMemory"}
worker is allocated {"pid": 40481, "internal_event_name": "EventWorkerConstruct"}
memory_limit    {"reason": "max memory is reached", "pid": 40481, "internal_event_name": "EventMaxMemory"}
worker is allocated {"pid": 40489, "internal_event_name": "EventWorkerConstruct"}
^Cstop signal received, grace timeout is: 30 seconds
worker destroyed    {"pid": 40489, "internal_event_name": "EventWorkerDestruct"}

I think the process got killed by RR and PHP does not encounter a fatal at this point. RR seems to wait the end of the request and immediatly kill the process after (which got killed before calling terminate())

nocive commented 2 years ago

The problematic setting is not actually max_worker_memory but rather max_jobs. Once I removed that, things seem to work normally, apart from some memory leaked which I tend to blame my app for :)

nocive commented 2 years ago

I can no longer reproduce this issue with the latest versions of roadrunner and this bundle. Closing.