Alfresco / alfresco-ansible-deployment

Ansible playbooks for deploying ACS
https://alfresco.github.io/alfresco-ansible-deployment/
Apache License 2.0

502 Bad Gateway after installing with ansible and restarting machine #417

Open amberream opened 2 years ago

amberream commented 2 years ago

Bug description

I installed Community Edition 7.2 yesterday with Ansible on CentOS 7. Services seemed to be up and running. I was able to create users, log in to share services, etc. But after restarting the machine I get a 502 Bad Gateway when I visit localhost or localhost/share (the same URLs that were OK yesterday).

Is there a trick to restarting these services properly? Is there something I need to do manually? Would be nice if this was documented in the installation instructions I followed here: https://docs.alfresco.com/content-services/community/install/ansible/

When I list all services with "sudo systemctl list-unit-files" I get:

alfresco-content-monitored-startup.service   static
alfresco-content.service                     disabled
alfresco-search.service                      enabled
alfresco-tengine-aio.service                 enabled

After starting the alfresco-content service manually I can get to myserver/share from the server only, but I can't see it on the rest of my network like I could before restarting the machine. I had to manually stop firewalld to fix this.
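Rather than stopping firewalld altogether, the HTTP port can be opened and firewalld left running. A minimal sketch, assuming the nginx front end listens on the default HTTP port 80 (the thread only mentions plain localhost URLs, so this is a guess) and that firewalld's default zone is the one in use. The `run` and `open_http` helper names and the `DRY_RUN` variable are hypothetical, added only so the commands can be previewed before running them as root.

```shell
#!/bin/sh
# Hypothetical convention: with DRY_RUN=1, print each command instead of running it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Allow inbound HTTP permanently, then reload so the rule applies immediately.
# Assumes port 80; use --add-service=https if nginx serves TLS instead.
open_http() {
  run firewall-cmd --permanent --add-service=http
  run firewall-cmd --reload
}

# Preview only:  DRY_RUN=1 open_http
# Apply (root):  open_http
```

The `--permanent` rule survives reboots, which matters here since the problem only showed up after restarting the machine.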

Would be nice if it was documented if I do need to start services manually. Also I noticed the docs (from the link above) list 6 services that start with "alfresco" whereas I only have 4. Are the docs out of date or is my installation incomplete?

Target OS

CentOS 7

alxgomz commented 2 years ago

Hi Amber,

Thanks for reporting this. The playbook should enable the Alfresco services at startup automatically. However, this is done by an Ansible handler, and handlers can be skipped when a task fails on the host before they run, as explained here: https://docs.ansible.com/ansible/latest/user_guide/playbooks_error_handling.html#handlers-and-failure

Can you confirm whether your playbook failed in the middle of the role which installs the service that's not starting (tomcat judging by your previous message)?
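One way to see whether the enabling step was skipped is to ask systemd directly. A minimal sketch: `report_not_enabled` is a hypothetical helper name, the unit names are the ones mentioned in this thread, and the playbook and inventory paths in the last comment are assumptions about a local install from this repository's checkout.

```shell
#!/bin/sh
# Hypothetical helper: print every given unit that systemd does not report as "enabled".
report_not_enabled() {
  for unit in "$@"; do
    # `systemctl is-enabled` exits non-zero for disabled units, hence `|| true`.
    state=$(systemctl is-enabled "$unit" 2>/dev/null) || true
    [ "$state" = "enabled" ] || echo "$unit: ${state:-unknown}"
  done
}

# Usage:
# report_not_enabled alfresco-content.service alfresco-search.service nginx.service
#
# If a unit shows up, enable and start it by hand:
# sudo systemctl enable alfresco-content.service
# sudo systemctl start alfresco-content.service
#
# Or re-run the playbook so notified handlers run even if a task fails
# (paths are assumptions):
# ansible-playbook playbooks/acs.yml -i inventory_local.yml --force-handlers
```

Units with state `static` (such as alfresco-content-monitored-startup.service) will also be listed; they have no install section and are not meant to be enabled.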

rubisher commented 1 year ago

Hello Amber,

Can you check whether nginx has started correctly when the system comes up:

$[ ~]$ sudo systemctl status nginx
● nginx.service - The nginx HTTP and reverse proxy server
   Loaded: loaded (/usr/lib/systemd/system/nginx.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2022-12-29 23:18:27 GMT; 1min 24s ago
  Process: 1844 ExecStart=/usr/sbin/nginx (code=exited, status=0/SUCCESS)
  Process: 1764 ExecStartPre=/usr/sbin/nginx -t (code=exited, status=0/SUCCESS)
  Process: 1760 ExecStartPre=/usr/bin/rm -f /run/nginx.pid (code=exited, status=0/SUCCESS)
 Main PID: 1845 (nginx)
    Tasks: 5 (limit: 49721)
   Memory: 17.6M
   CGroup: /system.slice/nginx.service
           ├─1845 nginx: master process /usr/sbin/nginx
           ├─1846 nginx: worker process
           ├─1847 nginx: worker process
           ├─1848 nginx: worker process
           └─1849 nginx: worker process

I vaguely remember that this service wasn't enabled?

Hth, Rudy

morgan-patou commented 1 year ago

This issue could be caused by several factors.

It could be a firewall problem (for the localhost stuff - I opened PR #589, which is still under discussion, to have the playbook set up the firewall automatically if requested).

It could also be services not starting up properly because a filesystem is not yet available (cf. PR #638) or because of SELinux (cf. PR #567 and PR #634).

So without more information on the exact status of the host's services and components, I believe this would be quite hard to investigate. However, with the latest versions of the playbooks, this kind of issue might be less likely to happen.

jalvarezferr commented 9 months ago

I've had some deployments where the alfresco-content service was not enabled. That was apparently caused by a change in the ownership of the alfresco-content.service file:

Now owned by root:

https://github.com/Alfresco/alfresco-ansible-deployment/blob/dfdb55583a9a00abafbfb99c7ee033d65ab69bae/roles/tomcat/tasks/main.yml#L122

Previously owned by user alfresco:

https://github.com/Alfresco/alfresco-ansible-deployment/blob/2d0f3c0b395be5df9fe2cdf374339eabc857bd57/roles/tomcat/tasks/main.yml#L172

I have not reported it as an issue because, although changing the ownership back fixed the problem for me, it does not make much sense that this is the cause, or at least the only factor. I rolled it back because it was the only difference, and the playbooks did not report any trouble running the handler that enables the service. Maybe they should have, but I have seen several reports of this Ansible module silently failing to enable services.
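To compare a working host with a failing one, the owner and mode of the unit file can be checked. A minimal sketch, assuming GNU `stat` (as on CentOS). `unit_file_owner` is a hypothetical helper name, and the unit file path in the usage comment is an assumption; `systemctl show -p FragmentPath` reports the actual location.

```shell
#!/bin/sh
# Hypothetical helper: print "owner:group mode" for a unit file (GNU stat syntax).
unit_file_owner() {
  stat -c '%U:%G %a' "$1"
}

# Usage:
# systemctl show -p FragmentPath alfresco-content.service   # where the unit file lives
# unit_file_owner /etc/systemd/system/alfresco-content.service   # path is an assumption
#
# After changing ownership, reload systemd and check the enable step again:
# sudo systemctl daemon-reload
# sudo systemctl enable alfresco-content.service
# systemctl is-enabled alfresco-content.service
```

Running `systemctl is-enabled` afterwards confirms whether the service is actually enabled, regardless of what the playbook run reported.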