51Degrees / device-detection-php-onpremise

On-premise implementation of the 51Degrees Device Detection engines for the Pipeline API

Unable to persistently start using new data file with php-fpm #8

Closed: thias closed this issue 2 years ago

thias commented 2 years ago

This is somewhat related to #2 and is something I had already reported directly to support.

We are using php-fpm, and the module and data file get loaded at startup. From there I've been told to use the refreshData() function to start using the new data after the data file gets updated.
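
For context, the refresh call itself is a one-liner; only refreshData() comes from this thread, and how the engine instance is obtained is setup-specific, so the helper in this sketch is purely hypothetical:

```php
<?php
// Minimal sketch: only refreshData() comes from the thread; the accessor
// below is hypothetical and depends on how your pipeline is wired up.
$engine = getDeviceDetectionEngine(); // hypothetical helper returning the engine

// Ask the engine to re-read the device data file from disk.
$engine->refreshData();
```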

But this doesn't work as expected with php-fpm, since the refresh is isolated to the php-fpm worker process that runs it (we have up to 1500 per server) and doesn't get applied to the master php-fpm process. So even somehow managing to run refreshData() from within all running worker processes wouldn't be enough.

This makes it virtually impossible to use updated data files with php-fpm without fully restarting the whole service, which is not acceptable in most production environments, ours included.

Having at least the master process periodically check whether the data file has changed could be enough, as all child php-fpm processes would eventually start using the new file once they reach their maximum number of requests and get replaced.

These are my original findings (the concurrency has since been made configurable, and the default of 10 can be lowered to 1):

Hi,

I finally managed to revisit the topic of data reloading in PHP, and my findings are actually a lot worse than what I was expecting.

In our web server environments, we use php-fpm with nginx, which is quite common. In some setups we have multiple different php-fpm process pools on the same server, running as different users. So I had held off on this reloading, since it wasn't going to be as easy as a single call per server.

Today I finished creating something flexible where we have a separate nginx URL to run refreshData() through each existing php-fpm backend, and we automatically call all of them when we get a new data file. This part works. And I thought I was done...
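
A rough sketch of the caller side of that approach (the URLs below are made up for illustration; only refreshData() itself comes from the thread):

```php
<?php
// Hypothetical sketch of the caller side: one refresh URL per php-fpm pool,
// all hit whenever a new data file has been deployed. URLs are made up.
$refreshUrls = [
    'http://127.0.0.1/pool-a/refresh-51degrees.php',
    'http://127.0.0.1/pool-b/refresh-51degrees.php',
];

foreach ($refreshUrls as $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    curl_exec($ch);
    $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    echo "$url -> HTTP $code" . PHP_EOL;
}
```

As the rest of the message explains, even this only refreshes the single worker process that happens to serve each request.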

But while testing end-to-end that things were working, I uncovered an unexpected behavior: when refreshData() is called, it seems to affect only the single php-fpm process that executed it. Not only does it not propagate across the separate pools (I was expecting that limitation), it doesn't even reach all processes of the same pool.

I tested with a php-fpm pool of 8 processes:

* I see that each process has 10 file descriptors open to the
  Enterprise-HashV41.hash file (that's a lot, but whatever).
* I move the Enterprise-HashV41.hash file to /tmp/
* I see that the open files are now from the /tmp/ file as expected.
* I put a new Enterprise-HashV41.hash file in the original location.
* I call refreshData() from a PHP script.

This is when I see that of the 8 processes, only 1 has reopened the new
hash file from the original location. The 7 others still have it open
from /tmp/.

And the worst is yet to come. I then set a very low "pm.max_requests" for the php-fpm backend. This is the number of requests a worker process handles before the master process kills it and forks a new one in its place (typically to mitigate any possible memory leaks). I was sort of expecting to leverage this existing behavior to get new processes to start using the updated file from the original location.
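
For reference, pm.max_requests is a standard php-fpm pool directive; a pool set up for this kind of test could look roughly like the following (values are illustrative):

```ini
; Hypothetical php-fpm pool excerpt: a static pool of 8 workers, each recycled
; after serving pm.max_requests requests (the test used a deliberately low value).
[www]
pm = static
pm.max_children = 8
pm.max_requests = 5
```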

But the behavior is that even if I've executed refreshData() from all of my running php-fpm processes in the pool, essentially having all of them use the new data, once any of them gets replaced, the new process reverts to having the /tmp/ file open!!(??). This is when I checked and realized that the php-fpm master process still has the /tmp/ file open (the original from when it started, since moved). It must be passing its file descriptors on to the newly forked children...

So unless I'm mistaken, all this essentially means that the current
behavior of the refreshData() function is totally useless when using
php-fpm, as it will only manage to partially and temporarily use the
new data.

Could someone please double check my findings and advise on how to
actually have php-fpm properly pick up new files? Fully restarting
php-fpm isn't an option for us (and isn't necessary for any of the
other lookups we use, such as the mmdb ones).

Cheers,
Matthias
tungntpham commented 2 years ago

Hi @thias ,

Many thanks for raising this issue with us. We have identified this as a restriction when using our Detection module under a process manager such as Apache MPM or php-fpm. When the Device Detection engine module is loaded in the main process, a pool of data file handles is created. So when the process manager spawns child processes in response to incoming requests by forking the main process, this pool is copied to the child processes. As you have observed, when calling refreshData() in a child process, only that child process's copy of the file handles is updated, not the parent process's pool.
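
The underlying mechanism is ordinary fork() semantics rather than anything specific to the Detection module; a generic PHP illustration (requires the pcntl extension, CLI only):

```php
<?php
// Generic fork illustration, not the 51Degrees API: handles opened before
// fork() are copied to the child, and whatever the child re-opens afterwards
// has no effect on what the parent still holds open.
file_put_contents('/tmp/data-v1.txt', 'old data');
file_put_contents('/tmp/data-v2.txt', 'new data');

$fh = fopen('/tmp/data-v1.txt', 'r');   // "old" file, opened by the parent

$pid = pcntl_fork();
if ($pid === 0) {
    // Child: drop the inherited handle and open the "new" file instead.
    fclose($fh);
    $fh = fopen('/tmp/data-v2.txt', 'r');
    echo 'child  now reads:  ' . stream_get_meta_data($fh)['uri'] . PHP_EOL;
    exit(0);
}

pcntl_waitpid($pid, $status);
// Parent: unaffected by the child's "refresh"; any worker forked from here
// would again inherit the old handle.
echo 'parent still reads: ' . stream_get_meta_data($fh)['uri'] . PHP_EOL;
```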

For now the only way to load the new data file is to restart the php-fpm service. One approach to restarting your service in a production environment with minimal impact is to maintain a production slot and a deployment slot, so that you can start a new service with the new data file in the deployment slot. When the deployment slot is fully ready, swap it with the production slot. A further remedy is to maintain multiple production nodes and use a traffic manager to direct requests to the other nodes while the swap is being performed.

Also, please make sure to use the MaxPerformance profile. We designed our API to allow different performance profiles, offering the user the ability to trade off performance against memory usage while scaling well in highly concurrent environments. In other profiles such as Balanced, BalancedTemp, LowMemory or Default, only part of the data file is loaded into memory, so from time to time calls to read data from disk are required. These calls use file handles from the file handle pool mentioned above, which was designed to optimize performance. Since the file handle pool is copied to the child processes, the file handles end up being shared. When multiple processes read from the data file through shared file handles, problems can occur, such as the file position being changed unexpectedly by another child process.
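
If the on-premise module is configured through php.ini, selecting the profile might look roughly like the snippet below. The directive names and the path are assumptions, not confirmed in this thread; only "MaxPerformance" and the Enterprise-HashV41.hash file name come from the discussion, so check the module's README for the exact settings:

```ini
; Hypothetical php.ini excerpt: directive names are assumptions and may differ
; between versions; verify them against the device-detection-php-onpremise docs.
FiftyOneDegreesHashEngine.data_file = /etc/51degrees/Enterprise-HashV41.hash
FiftyOneDegreesHashEngine.performance_profile = MaxPerformance
```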

Moving forward, we will revisit the design of our Detection module and investigate changes required to help with these limitations.

Kind regards, Tung.

tungntpham commented 2 years ago

Closed as this has been identified as a restriction in certain working environments. Improvement will be invested in, but not at this point.