davidalger / capistrano-magento2

Magento 2 specific tasks for Capistrano 3
https://rubygems.org/gems/capistrano-magento2
Open Software License 3.0

Question: dealing with consumers #133

Closed valguss closed 4 years ago

valguss commented 5 years ago

Hi @davidalger

I've got a question about dealing with consumers that are started by the Magento cron. Currently, each deployment starts a new set of consumers, but the old ones are not removed, because the PID file in the var directory is valid for both (var is not linked in shared). How might we deal with this so that the old consumers are killed off in favour of the new ones?

Thanks

hostep commented 5 years ago

I can elaborate a little bit on this, since I had to deal with the same problems in my own custom deployment scripts.

What we currently do with Magento 2.3.2 is read the var/*.pid files of the previous release at the end of the deploy process, look up the processes with those IDs, and kill them one by one. The cronjob then automatically spawns new ones. Not very elegant, but it seems to do the job.
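A minimal sketch of that PID-file approach in plain Ruby, not the actual deployment script. The helper names `consumer_pids` and `terminate` are made up for illustration, and the code assumes each var/*.pid file holds a single numeric PID, which the thread does not confirm.

```ruby
# Collect PIDs from the *.pid files in a release's var directory.
# Assumption: each file contains exactly one numeric PID.
def consumer_pids(var_dir)
  Dir.glob(File.join(var_dir, '*.pid')).sort.map do |path|
    content = File.read(path).strip
    # Ignore files that do not look like a plain PID.
    content.match?(/\A\d+\z/) ? Integer(content, 10) : nil
  end.compact
end

# Send a signal (TERM by default) to each PID and return the PIDs that
# accepted it. Processes that already exited or belong to another user
# are skipped instead of aborting the deploy.
def terminate(pids, signal: 'TERM')
  pids.select do |pid|
    begin
      Process.kill(signal, pid)
      true
    rescue Errno::ESRCH, Errno::EPERM
      false
    end
  end
end
```

It would be called with the var directory of the previous release, for example `terminate(consumer_pids(File.join(previous_release, 'var')))`, where `previous_release` is a placeholder for however your deploy script tracks that path.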

Magento 2.3.3 will change this a bit, and comes with a fix where they are no longer using the PID files in the var directory but use some kind of locking (probably using the database) to avoid spawning new consumers when the old ones are still running: https://github.com/magento/magento2/commit/1d9e07b218c7c8ad1f05706828cb2dd47d2d2d58. In my opinion, we should still kill/restart these after a deploy, because consumers will still use old code loaded in memory, which is not ideal. But in Magento 2.3.3 we can no longer use the var/*.pid files to identify these processes, so another way will need to be found to figure out which processes to kill.
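To illustrate the general idea of guarding a consumer with a lock instead of a PID file, here is a small Ruby sketch. It uses an advisory file lock purely as a stand-in; it is not Magento's implementation (which, as noted above, probably locks via the database), and `with_consumer_lock` is a hypothetical helper name.

```ruby
# Run the block only if no other process currently holds the lock.
# Returns true if the block ran, false if the lock was already taken.
def with_consumer_lock(lock_path)
  File.open(lock_path, File::RDWR | File::CREAT, 0o644) do |file|
    # LOCK_NB makes flock return false instead of blocking when the lock is held.
    return false unless file.flock(File::LOCK_EX | File::LOCK_NB)

    yield
    true
  end
  # Closing the file at the end of File.open releases the lock.
end
```

Note what this does and does not solve: a second consumer is refused while the first is alive, but the first one keeps running its old code until something stops it, which is why killing consumers after a deploy is still worth doing.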

Then some day in a future release of Magento, running consumer processes will be able to stop themselves (not always immediately) by executing some new bin/magento command (work on this hasn't started yet as far as I'm aware).

valguss commented 5 years ago

That's really useful, @hostep, thanks for that. I'll have a look into this; there may be a way to do a killall on the consumers then.

hostep commented 5 years ago

Hi @valguss

Played a bit with this today in preparation for Magento 2.3.3, this seems to work in our case:

Of course with a bunch of extra error handling here and there.

Not sure if this is the best way to do it, whether pgrep is available on every operating system, or whether all the flags work everywhere, but it seems to work on a couple of our servers running Debian 8 & 9, Ubuntu 16.04 and CentOS 7 at least.

KevinMace commented 5 years ago

We're having the same issue on a couple of projects. Thanks for sharing your knowledge @hostep

davidalger commented 4 years ago

Thanks for sharing your knowledge @hostep! I've done something similar on a 2.3.3 staging environment I just spun up, and will likely do the same on a relatively large-scale 2.2.x -> 2.3.3 upgrade project I've got starting soon, where this is and will continue to be the deployment tool of choice.

I put the following bit into lib/capistrano/tasks/queue_consumers_kill.rake

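# Kill running queue consumers once a release is published; the Magento
# cron then starts fresh ones from the new release.
after 'deploy:published', 'magento:queue:consumers:kill' do
  on release_roles :all do
    within release_path do
      # pgrep -a -f lists "PID command line" for this user's matching processes.
      # The [q] bracket keeps the pattern from matching the shell command line
      # that contains it. tee echoes the list into the deploy output, awk keeps
      # the PIDs, and xargs -r skips kill when nothing matched.
      execute :pgrep, '-u "$(whoami)" -a -f "[q]ueue:consumers:start"',
        '| tee /dev/stderr | awk \'{print $1}\' | xargs -r kill'
    end
  end
end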

This is running on a CentOS 7 system where deploy is running as www-data (same user which php-fpm and crontab run as) and requires the following line be in the Capfile:

# Load custom tasks from `lib/capistrano/tasks` if you have any defined
Dir.glob('lib/capistrano/tasks/*.rake').each { |r| import r }

Example output from a deployment looks like this:

04:57 cachetool:opcache:reset
      01 cachetool -- opcache:reset
    ✔ 01 www-data@m2stage.exampleproject.com 0.324s
04:58 magento:queue:consumers:kill
      01 pgrep -u "$(whoami)" -a -f "[q]ueue:consumers:start" | tee /dev/stderr | awk '{print $1}' | xargs -r kill
      01 28730 /usr/bin/php /var/www/html/releases/20200114205737/bin/magento queue:consumers:start product_action_attribute.update --single-thread --max-messages=10000
      01 28732 /usr/bin/php /var/www/html/releases/20200114205737/bin/magento queue:consumers:start product_action_attribute.website.update --single-thread --max-messages=10000
      01 28734 /usr/bin/php /var/www/html/releases/20200114205737/bin/magento queue:consumers:start codegeneratorProcessor --single-thread --max-messages=10000
      01 28736 /usr/bin/php /var/www/html/releases/20200114205737/bin/magento queue:consumers:start exportProcessor --single-thread --max-messages=10000
    ✔ 01 www-data@m2stage.exampleproject.com 0.285s
04:59 deploy:log_revision
      01 echo "Branch develop (at 5c6bec6b5dd93145710792620a2c6bb0541c79eb) deployed as release 20200115181334 by davidalger" >> /var/www/html/revisions.log
    ✔ 01 www-data@m2stage.exampleproject.com 0.278s

valguss commented 4 years ago

Sorry, this has been in my backlog to check for ages. @davidalger, your snippet appears to work great and I'll be adding it to my builds.

Thanks for the discussion around this.

JosephLeedy commented 3 years ago

Is it really necessary to pipe the pgrep output through tee? I get a "Permission denied" error, and the command seems to work without it.

davidalger commented 3 years ago

It's necessary if you want to see the list of processes that were terminated in the deploy output (see my previous comment for an example).

The tee /dev/stderr basically splits the output, sending it to stdout (piped into awk) and to stderr (printed to the console, bubbling up through Capistrano). Honestly not sure why or how this could result in a permission error.
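The split described above can be sketched in plain Ruby: every line of the pgrep -a output is echoed to a log stream (the tee part) and its first field is kept as a PID (the awk part). This is only an illustration of the data flow, not a replacement for the task; `pids_from_pgrep` is a hypothetical helper name.

```ruby
# Mimic `tee /dev/stderr | awk '{print $1}'`: echo each non-blank line to
# the log stream and return the first whitespace-separated field as a PID.
def pids_from_pgrep(output, log: $stderr)
  output.each_line.map do |line|
    next if line.strip.empty?

    log.puts(line.chomp)            # the "tee" half: show what will be killed
    Integer(line.split.first, 10)   # the "awk" half: keep only the PID
  end.compact
end
```

Fed the example output from the earlier comment, this would return the four PIDs while printing the full command lines. In a Capistrano task the text could presumably come from SSHKit's `capture` rather than a shell pipe, which would avoid writing to /dev/stderr on the host at all; that variation is untested here.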

JosephLeedy commented 3 years ago

Thank you for the explanation, @davidalger. I think it's something weird with this particular host, as tee /proc/self/fd/2 returns the same error.