FrankenPHP serving stale requests #881

Open krystof018 opened 3 months ago

krystof018 commented 3 months ago

What happened?

Sometimes FrankenPHP is serving stale requests with Symfony. It returns the responses to random previous requests, no matter what they contain (errors included). I'm not sure when this happens, but it seems to happen after multiple 5xx responses (though not every time). The only thing that helps is restarting the Docker container. I don't think it's a Symfony issue because it was working just fine on FastCGI, but it's true that I migrated the project to Symfony 7 around the same time. I understand that this description isn't much to work with, but I hope I'm not the only one this happens to and that someone will have more insight into why it is happening.

This happens regardless of which Symfony environment is running.

This is my Caddyfile

    # Debug
    frankenphp {
        worker /srv/public/index.php
http://localhost {
    # Enable access logging (to the console)
    route {
        root * public/

https://localhost {
    tls internal

    # Enable access logging (to the console)

    route {
        root * public/

I'm starting my caddy server with frankenphp run --config /etc/caddy/Caddyfile --adapter caddyfile

The app is Symfony 7 with API Platform, nothing else is special about it.

Relevant log output

No response

withinboredom commented 3 months ago

I'm pretty sure FrankenPHP isn't keeping old requests in memory anymore (we fixed that memory leak a while ago), but even then it wasn't returning them to the wrong connection, just keeping them around in memory.

The only things I can think of are threading shenanigans between PHP and Go, or a bug in Symfony (the more likely option, TBH). Without a way of reproducing it, I have no idea which one it is. It'd be great if you could figure out which of these is true:

  1. concurrent responses are being sent to the wrong connection
  2. old responses are being sent again
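
One way to tell these two cases apart is to tag every request with a unique ID, have the application echo it back, and compare the two on the client. The sketch below only covers the comparison step. It assumes the client records which other requests were still pending and which had already finished when each response arrived; `Observation` and `classify` are hypothetical names, not part of FrankenPHP or Symfony.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    sent_id: str          # unique ID the client attached to the request
    echoed_id: str        # ID the application echoed back in the response
    in_flight: frozenset  # IDs of other requests still pending at that moment
    finished: frozenset   # IDs of requests that had already completed


def classify(obs: Observation) -> str:
    """Label a response as ok, a concurrent mix-up, or a stale replay."""
    if obs.echoed_id == obs.sent_id:
        return "ok"
    if obs.echoed_id in obs.in_flight:
        # Case 1: a response that belongs to a concurrent request.
        return "concurrent mix-up"
    if obs.echoed_id in obs.finished:
        # Case 2: an old response that is being sent again.
        return "stale replay"
    return "unknown"
```

How the ID travels (a header, a query parameter, a field in the body) is left open here and depends on the application.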

withinboredom commented 3 months ago

TBH, the most likely issue is that the request/response is being stored in the container and not regenerated for every request. This would mean that if something were to take the request/response from the container, it would get a mishmash of the requests/responses from everything else that did so.

dunglas commented 3 months ago

This can also be (and likely is) an application bug, such as global state that isn't reset between requests.

EasyAdmin was suffering from a bug like this, for instance.
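
To make the failure mode concrete, here is a small language-agnostic model of it, written in Python rather than PHP. It is a sketch under assumptions, not how FrankenPHP or Symfony are implemented: `ContextService` is a hypothetical long-lived service with a deliberate bug, and `serve` imitates either a worker that keeps the service alive between requests or a classic setup that rebuilds it every time.

```python
class ContextService:
    """Hypothetical service that remembers data from the current request."""

    def __init__(self):
        self.current_user = None

    def handle(self, request: dict) -> str:
        # Bug: the state is only overwritten when the request carries a user,
        # so a request without one sees whatever the previous request left.
        if "user" in request:
            self.current_user = request["user"]
        return f"hello {self.current_user}"

    def reset(self):
        # Reset hook: clears per-request state.
        self.current_user = None


def serve(requests, worker_mode: bool, call_reset: bool = False):
    """Handle requests in order and return the list of responses."""
    service = ContextService()
    responses = []
    for request in requests:
        if not worker_mode:
            # Classic mode: everything is rebuilt for each request.
            service = ContextService()
        responses.append(service.handle(request))
        if call_reset:
            service.reset()
    return responses


requests = [{"user": "alice"}, {}]
print(serve(requests, worker_mode=True))                   # ['hello alice', 'hello alice']
print(serve(requests, worker_mode=False))                  # ['hello alice', 'hello None']
print(serve(requests, worker_mode=True, call_reset=True))  # ['hello alice', 'hello None']
```

The second request leaks the first user's data only when the service outlives the request and nothing resets it.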

dunglas commented 2 months ago

Closing for now as there is nothing we can do without more information. See this EasyAdmin patch, which fixes a (likely) similar issue. If you think that it's not an issue in your app, feel free to reopen.

aleho commented 2 days ago

I can confirm this issue for a very simple setup here.

The assumption for me was that the reset interface should be enough to reset the context state. For good measure I also tagged the service with KernelEvents::FINISH_REQUEST. Yet after first visiting the page and changing data, some responses contained the correct, new data, some responses didn't.

I then completely eliminated the local variable and cache, always fetching the data from the API. And still, some responses contain the old content, some the new. API responses are never cached; I also confirmed by completely flushing Redis that there's no way stale data could be fetched (even though caching was disabled completely).

EDIT: Oh, and everything, caching and local variable, works perfectly fine if I disable worker mode.

EDIT2: Forgot to mention, the response(s) containing wrong data are predictable. If one response was bad, it was always delivered after 7 good responses. Even after reloading 30 times, the 8th response was bad.

dunglas commented 2 days ago

This is probably a bug in the Symfony Cache. If you have a simple reproducer, could you publish it? I'll take a look.

withinboredom commented 2 days ago

> If one response was bad, it was always delivered after 7 good responses. Even after reloading 30 times, the 8th response was bad.

This is likely due to the workers, i.e., you have eight workers and they deliver responses in round-robin order, so you end up with one "corrupted" worker.

If you are up for digging deeper, search for accesses to static variables (this may even include properties in your objects in your service container, as most services are not 'volatile', meaning they exist beyond a single request). Most likely there is a static variable that needs to be cleared/reset and isn't being reset on some code path.
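
The "every 8th response is bad" pattern can be reproduced with a toy model. The sketch below assumes strict round-robin dispatch over eight workers, exactly one of which holds stale state; the real dispatch order may differ, and `simulate` is a hypothetical helper, not FrankenPHP code.

```python
from itertools import cycle


def simulate(num_workers: int, corrupted: set, num_requests: int):
    """Return one 'good' or 'bad' label per request, in arrival order."""
    workers = cycle(range(num_workers))  # strict round-robin (assumption)
    return [
        "bad" if next(workers) in corrupted else "good"
        for _ in range(num_requests)
    ]


results = simulate(num_workers=8, corrupted={7}, num_requests=32)
bad_positions = [i + 1 for i, label in enumerate(results) if label == "bad"]
print(bad_positions)  # [8, 16, 24, 32]
```

With a single corrupted worker, the bad responses are spaced exactly one worker-pool size apart, which matches seven good responses followed by one bad one.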

aleho commented 2 days ago

> This is probably a bug in the Symfony Cache. If you have a simple reproducer, could you publish it? I'll take a look.

I completely eliminated everything Symfony related, except for a REST API call (not even some obscure Doctrine caching was possible).

> This is likely due to the workers, i.e., you have eight workers and they deliver responses in round-robin order, so you end up with one "corrupted" worker.

That's exactly what I thought and the reason I wanted to mention it.

> If you are up for digging deeper, search for accesses to static variables (this may even include properties in your objects in your service container, as most services are not 'volatile', meaning they exist beyond a single request). Most likely there is a static variable that needs to be cleared/reset and isn't being reset on some code path.

There's not a single static variable anywhere. I was able to reproduce the problem even after changing the code to do an API call with every access to the getter (instead of saving the context locally). The object returned is an instance created from JSON, no static variables, etc. It's a shared Symfony service, but completely stateless.

I'll try to come up with a reproducer as soon as possible.

dunglas commented 1 day ago

Would you be able to share your project? A reproducer will help a lot to find the underlying issue.

It is interesting that it occurs after 500 errors. The Symfony error handling system is quite complex. Maybe some state isn't properly reset when it is triggered?

aleho commented 1 day ago

Unfortunately I can't share that project.

I'll try to write a reproducer in a public repo based on symfony/docker though. This way, either I'll find what's wrong with my project and report back, or I'll have a reproducer ready that hopefully helps to narrow down the problem.