dunglas / frankenphp

🧟 The modern PHP app server
https://frankenphp.dev
MIT License

.env files, OPcache and restarting FrankenPHP #457

Open ili101 opened 8 months ago

ili101 commented 8 months ago

Hi, I updated my api-platform project from v3.1.14 to v3.2.10, which uses the new FrankenPHP instead of Caddy, and I'm experiencing some stability problems after deploying it in production. I'm not 100% sure that it's all FrankenPHP related, as it's also a new PHP and Symfony version, but I suspect it is. Maybe it's OPcache related?

My main problem is that I use symfony/ldap to authenticate. It works OK, but so far roughly every 1 to 2 days it locks up and throws: "message": "Uncaught PHP Exception Symfony\\Component\\Ldap\\Exception\\ConnectionException: \"Can't contact LDAP server\" at Connection.php line 86", on every request until I recreate the PHP container. I know it's not the LDAP server: it worked before, the port is accessible from inside the container, and it goes back to working once I recreate the container. I would appreciate it if you could point me in some direction on how to troubleshoot this further, since I wasn't able to reproduce the problem in my test environment.

While investigating, I encountered two other things regarding OPcache and FrankenPHP. First, when looking online for how to clear the OPcache in a container, the only thing I found that worked (without recreating the container) was running kill -USR2 1. On the old Caddy container it worked nicely: the edits in the PHP files were loaded and the container stayed up. The Caddy service, for example:

/srv # ps aux -o user,group,comm,pid,ppid,pgid,etime,nice,rgroup,ruser,time,tty,vsz,sid,stat,rss,args
USER     GROUP    COMMAND          PID   PPID  PGID  ELAPSED NI    RGROUP   RUSER    TIME  TT     VSZ  SID   STAT RSS  COMMAND
root     root     caddy                1     0     1 24:43       0 root     root      0:05 ?      1.2g     1 S     56m caddy run --config /etc/caddy/Caddyfile --watch

When I do the same in the new PHP container, the frankenphp service ignores the signal and doesn't reload the cache. If I try a more aggressive kill 1, the container fails and is recreated (as expected). The FrankenPHP service, for example:

/srv # ps aux -o user,group,comm,pid,ppid,pgid,etime,nice,rgroup,ruser,time,tty,vsz,sid,stat,rss,args
USER     GROUP    COMMAND          PID   PPID  PGID  ELAPSED NI    RGROUP   RUSER    TIME  TT     VSZ  SID   STAT RSS  COMMAND
root     root     frankenphp           1     0     1  4:40       0 root     root      0:03 136,0  1.7g     1 S    285m frankenphp run --config /etc/caddy/Caddyfile

Can I clear the OPcache in FrankenPHP like in Caddy?

Second, in the old version, changes to ".env" or ".env.local.php" were picked up immediately (I assume the env files are somehow exempt from caching?), but in the new version the changes are not reflected. The more bizarre part is that if I send a burst of requests (like holding F5 for 2 seconds), the changes to the env files are partially loaded. It's as if the system now has multiple versions cached, so each refresh cycles through a pattern like: updated, updated, updated, old config, old config. It keeps cycling like that on every request until I recreate the container (or probably until FrankenPHP/OPcache is restarted, but I don't know how to test that).

Help will be appreciated as I hope to avoid migrating everything back to the old version. Thank you.

withinboredom commented 8 months ago

I noticed a few points in your issue that I'd like to address individually to provide clearer assistance.

LDAP

Your query about connecting to LDAP is important, and several methods exist. It would be helpful to know which method you're using. Are you utilizing an extension, and if so, is it a custom one or the built-in PHP one, or the one that comes with FrankenPHP? It's worth noting that the built-in PHP LDAP extension isn't compatible with ZTS and is reportedly being phased out from PHP. Alternatively, if you're using the LDAP extension built into FrankenPHP, it's designed to be thread-safe, as far as I know. (I don't use it but there was #203)

Let's focus on potential solutions assuming you're using the latter and operating in worker mode. One aspect to consider is whether the LDAP library you're using automatically attempts to reconnect after losing a connection. It's possible that you might need to implement additional code to handle reconnections, or this could be an oversight in the library that needs addressing.

Opcache

Regarding your question on Opcache in worker mode: indeed, in this mode, the code is loaded once and stays in memory. PHP, to my knowledge, doesn't support hot-reloading of scripts in memory. This means that to reflect any changes in your code, you would need to restart the server process when running in worker mode.

Strange Behavior

You're correct in noting that each worker operates with its own memory/environment. This isolation can lead to situations where, if files or the environment are changed before a worker's initial startup, different workers may perceive different 'versions' of the application. It's a critical point to consider when managing workers and deploying changes.
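To make the "different versions" effect concrete, here is a minimal Python sketch (a language-neutral illustration, not FrankenPHP code). It assumes each worker reads its configuration once at first boot and keeps it in memory, and that requests are dispatched round-robin; the Worker class, the serve helper, and the APP_MODE key are all hypothetical names invented for this example.

```python
class Worker:
    """One long-lived worker that loads configuration once and keeps it."""

    def __init__(self, name):
        self.name = name
        self.config = None  # nothing loaded until the worker first boots

    def handle(self, read_config):
        if self.config is None:
            # First request for this worker: load and cache the config.
            self.config = read_config()
        return self.config


def serve(workers, read_config, request_count):
    """Dispatch requests round-robin (an assumption) and collect the results."""
    return [
        workers[i % len(workers)].handle(read_config)
        for i in range(request_count)
    ]


env_file = {"APP_MODE": "old config"}  # stands in for the .env file on disk


def read_config():
    return env_file["APP_MODE"]


workers = [Worker("w0"), Worker("w1"), Worker("w2")]
first = serve(workers[:1], read_config, 1)  # only w0 has booted so far
env_file["APP_MODE"] = "updated"            # the file is edited afterwards
later = serve(workers, read_config, 6)
print(later)
```

In this model the worker that booted before the edit keeps answering with the old value while the others answer with the new one, which gives a repeating mix of old and updated responses until every worker is restarted.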

ili101 commented 8 months ago

Thank you. What I did is basically take the api-platform template. To the Dockerfile I added ldap:

RUN set -eux; \
    install-php-extensions \
        ldap \

To security.yaml I added the Symfony LDAP Provider and HTTP Basic. For authorization, I limited access to authenticated users (including access to "/docs"):

    access_control:
        - { path: ^/, roles: ROLE_USER }

and put roles in the entities as shown in https://api-platform.com/docs/core/security/. The only custom thing I wrote is this code that gets the roles, but I don't see how it would corrupt the connection: https://github.com/symfony/symfony/issues/51225#issuecomment-1715703896

If I understand the PR you linked we should remove all the dependency installation in https://github.com/api-platform/api-platform/blob/main/api/Dockerfile and use this "static-builder" to install them in a different way?

withinboredom commented 8 months ago

In worker mode, it is nearly 100% certain that whatever external resource you connect to will eventually disconnect. Thus, you will need to be able to handle that error and reconnect before continuing.
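As a rough illustration of "handle that error and reconnect", here is a minimal Python sketch of the pattern (language-neutral, not the symfony/ldap API). ConnectionLost, ReconnectingClient, FakeConnection, and the connect factory are hypothetical names; in a real application the wrapper would sit around whatever client object the library provides.

```python
class ConnectionLost(Exception):
    """Stands in for an error such as "Can't contact LDAP server"."""


class ReconnectingClient:
    """Keeps one long-lived connection and retries once if it has gone stale."""

    def __init__(self, connect):
        self._connect = connect  # hypothetical factory returning a fresh connection
        self._conn = None

    def call(self, operation):
        if self._conn is None:
            self._conn = self._connect()
        try:
            return operation(self._conn)
        except ConnectionLost:
            # The cached connection died: replace it and retry exactly once.
            self._conn = self._connect()
            return operation(self._conn)


class FakeConnection:
    """Test double: behaves like a server connection that can be dropped."""

    def __init__(self):
        self.alive = True

    def search(self, query):
        if not self.alive:
            raise ConnectionLost("Can't contact LDAP server")
        return f"result for {query}"


opened = []


def connect():
    conn = FakeConnection()
    opened.append(conn)
    return conn


client = ReconnectingClient(connect)
print(client.call(lambda c: c.search("first")))
opened[0].alive = False  # simulate the server dropping the idle connection
print(client.call(lambda c: c.search("second")))
```

Retrying only once is a deliberate choice in this sketch: if the fresh connection also fails, the error propagates instead of looping forever against a server that is really down.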

> If I understand the PR you linked we should remove all the dependency installation in https://github.com/api-platform/api-platform/blob/main/api/Dockerfile and use this "static-builder" to install them in a different way?

I think I misread the original issue, and the LDAP extension is thread-safe, so it should be fine.

ili101 commented 8 months ago

I added some file_put_contents calls to https://github.com/symfony/ldap/blob/6.4/Adapter/ExtLdap/Connection.php in bind() and __destruct() to log what happens. In the old dev, old prod, and new dev images, I see on every request:

  1. Making a new connection.
  2. Using existing connection.
  3. Using existing connection.
  4. Dispose.

Only on the new prod image, I see on every request:

  1. Making a new connection. EDIT: Correction, this pattern will repeat (for each worker probably) then will just use the existing connection.
  2. Using existing connection.
  3. Using existing connection.

Is __destruct() not supposed to run in this new worker mode, or is something going wrong somewhere?

withinboredom commented 8 months ago

In worker mode (which you can disable, btw, and use regular ole' cgi-mode) an object exists beyond a single request. Every request shares bootstrapped resources so that, for example, your LDAP library only has to connect once per worker, instead of once per request. This has a pretty significant performance impact. However, if your libraries aren't designed to run in workers and expect to exist only for a single request, they'll probably break (which is what I suspect is happening here).
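The difference in object lifetime can be sketched in a few lines of Python (a language-neutral model, not FrankenPHP or symfony/ldap code). The Connection class and both mode functions are hypothetical; the log messages mirror the ones reported earlier in the thread, with bind() called three times per request and close() standing in for __destruct().

```python
class Connection:
    """Hypothetical connection that logs like the instrumented Connection.php."""

    def __init__(self, log):
        self.log = log
        self.bound = False

    def bind(self):
        if self.bound:
            self.log.append("Using existing connection.")
        else:
            self.bound = True
            self.log.append("Making a new connection.")

    def close(self):
        # Plays the role of the destructor.
        self.log.append("Dispose.")


def handle_request(conn):
    for _ in range(3):  # three bind() calls per request, as in the thread's log
        conn.bind()


def classic_mode(request_count):
    """Everything is created and torn down for each request."""
    log = []
    for _ in range(request_count):
        conn = Connection(log)
        handle_request(conn)
        conn.close()
    return log


def worker_mode(request_count):
    """The connection is created once and outlives every request."""
    log = []
    conn = Connection(log)
    for _ in range(request_count):
        handle_request(conn)
    conn.close()  # only runs when the worker itself shuts down
    return log


print(classic_mode(2))
print(worker_mode(2))
```

In this model the classic run logs a new connection and a dispose for every request, while the worker run connects once and only disposes at shutdown, so a missing per-request __destruct() log is consistent with worker mode behaving as described rather than with something going wrong.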