Open xmunoz opened 3 years ago
This was not fixed by my code change. We need to manually monitor the number of daemons running upon deploy to prevent issues.
```
Dec 08 21:54:15 taskrunner-staging systemd[1]: Starting SQS Daemon...
Dec 08 21:54:15 taskrunner-staging rm[9731]: /bin/rm: cannot remove '/data/tmp/sqs.pid': No such file or directory
Dec 08 21:54:15 taskrunner-staging systemd[1]: sqs-daemon.service: Found left-over process 704 (php) in control group while starting unit. Ignoring.
Dec 08 21:54:15 taskrunner-staging systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
```
After an EC2 instance is created from an AMI, it should have all of the default daemons running as a single process. When a deploy happens, the deploy script sets the daemon state to "restarted", which has the effect of creating a new process without killing the old one. I have also observed this behavior on the servers after manually running

```
sudo service sqs-daemon restart
```

This is probably an underlying issue with how the daemon is configured. In the absence of an easy fix, simply add another Ansible task to stop the service and then start it.
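A minimal sketch of that workaround, assuming the deploy playbook uses Ansible's `service` module (the `sqs-daemon` unit name comes from the logs above; the task names and playbook structure are illustrative):

```yaml
# Workaround sketch: an explicit stop followed by a start, instead of
# state: restarted, so the old daemon process is fully terminated
# before a new one is launched.
- name: Stop SQS daemon
  service:
    name: sqs-daemon
    state: stopped

- name: Start SQS daemon
  service:
    name: sqs-daemon
    state: started
```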