Closed nchudleigh closed 5 years ago
Build for 18.04 is stalling on the "Reload all conf files" step
Debugging locally
The Vagrant build works locally. Going to run against the Docker (builded) target next to see if that is the issue.
Local run of the ubuntu:18.04-builded image against https://github.com/ANXS/postgresql/releases/tag/v1.10.1 ran successfully
Testing against master next
Local run of ubuntu:18.04-builded against current master was successful
@gclough At this point I am out of ideas; locally the tests run cleanly on the latest code. Perhaps it is something to do with the Travis environment. I tried pulling the builder image used to build the test environment, travis-ci-garnet-trusty-1512502259-986baf0, but was denied access to it.
If you have any ideas on where to look next I would really appreciate a pointer.
Going to leave off here for the day and come back with fresh eyes tomorrow
@gclough Really need some help on this one! All detail is given in previous comment.
@nchudleigh I've re-triggered the failed build, as sometimes Travis hangs unexpectedly. If it fails again, then I'll have to review it more in depth.
:-/ Same problem. I can't understand why it's hanging during a parameter reload.
TASK [ANXS.postgresql : PostgreSQL | Reload all conf files] ********************
No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.
Check the details on how to adjust your build configuration on: https://docs.travis-ci.com/user/common-build-problems/#Build-times-out-because-no-output-was-received
The build has been terminated
The code it's running is pretty benign:
- name: PostgreSQL | Reload all conf files
  service:
    name: "{{ postgresql_service_name }}"
    state: reloaded
  when: postgresql_configuration_pt1.changed or postgresql_configuration_pt2.changed or postgresql_configuration_pt3.changed or postgresql_systemd_custom_conf.changed
I've never seen a PostgreSQL database hang on a config reload. If someone can provide advice on how to capture the logs before Travis terminates the server, I'd appreciate it.
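One possible way to get the logs out before Travis kills the job, sketched as a shell wrapper: bound the hanging command with coreutils `timeout` and print the tail of a log file when the command fails or times out. `run_with_log_dump` is a hypothetical helper name, and the time limit, log path and wrapped command are all assumptions; nothing here is provided by the role or by Travis.

```shell
#!/bin/sh
# Hypothetical helper (not part of the role): run a command under a time
# limit and, if it fails or times out, print the end of a log file to stderr.
# Usage: run_with_log_dump <limit> <log-file> <command> [args...]
run_with_log_dump() {
    limit="$1"
    log_file="$2"
    shift 2

    # coreutils timeout exits with status 124 when the limit is reached.
    status=0
    timeout "$limit" "$@" || status=$?

    if [ "$status" -ne 0 ]; then
        echo "command exited with status $status; tail of $log_file:" >&2
        tail -n 100 "$log_file" >&2 || echo "could not read $log_file" >&2
    fi
    return "$status"
}
```

In a CI script this could wrap the playbook run with a limit shorter than the 10 minute no-output window, for example `run_with_log_dump 8m /var/log/postgresql/postgresql-11-main.log <playbook command>`. The log path will differ per PostgreSQL version and cluster, and the sketch assumes the log is readable from wherever the wrapper runs, which may not hold if the database lives inside a separate container.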
I think I am also running into this problem (on both Ubuntu 18 and Debian 9). I noticed that I have to manually run pg_ctlcluster to restart.
root@lnx-14:/var# pg_lsclusters
Ver Cluster Port  Status Owner    Data directory              Log file
11  main    14544 online postgres /var/lib/postgresql/11/main /var/log/postgresql/postgresql-11-main.log
root@lnx-14:/var# pg_c
pg_config pg_conftool pg_createcluster pg_ctlcluster
root@lnx-14:/var# pg_ctlcluster 11 main restart
root@lnx-14:/var#
@vladp Thanks for the suggestion, will have to give that a shot
Based on the work in https://github.com/ANXS/postgresql/pull/384 by @irionr
A couple things to call out here:
Vagrant IPs
I have used 192.168.88.21 for the 1804 inventory rather than shifting all the IPs of the other vagrant images. I am not sure how much we care about the order of these things or if 21 is reserved for something else. Happy to change this.
Vagrant Image
The builded image for 18.04 works, I have run it locally with no issue. But I am unsure whether or not it is required. It has just been copied from the 16.04 image and may need updates but we will see as this hits the test suite.
Vars
I noticed there is a vars/xenial.yml file that makes a few changes for Postgis packages, though I could not see it being used anywhere in the role (seems like Debian.yml is used). In the event that this is needed for 18.04 I can create a bionic.yml. Deferring to maintainers for guidance on this.