Closed ZPascal closed 5 months ago
We have created an issue in Pivotal Tracker to manage this:
https://www.pivotaltracker.com/story/show/186569819
The labels on this github issue will be updated when the story is started.
Hi @jpalermo & community, we have implemented most of the new features and now have a few issues with the old features and updating the minor versions in general. I'm just asking myself the following questions:
@jpalermo Could you please have a look?
@jpalermo Could you please have a look again? I've added your recommendations.
Hey @ZPascal , was it your intention to exit 1
(which I think will terminate the entire script) from the function here:
and here:
rather than return 1
(which I think will simply signal the failure of the function)?
NB: My Bash knowledge is not great, but this part of the PR looked incorrect to me. It may be that I am the one who is incorrect.
Context: I'm currently trying to figure out why the CI job that runs the Acceptance Tests for this Release has been failing ever since this Release was released.
I don't know that this is actually the problem, but I was looking around, and this didn't look right, so I thought I'd ask.
Hey @ZPascal , was it your intention to
exit 1
(which I think will terminate the entire script) from the function here:and here:
rather than
return 1
(which I think will simply signal the failure of the function)?NB: My Bash knowledge is not great, but this part of the PR looked incorrect to me. It may be that I am the one who is incorrect.
Context: I'm currently trying to figure out why the CI job that runs the Acceptance Tests for this Release has been failing ever since this Release was released.
I don't know that this is actually the problem, but I was looking around, and this didn't look right, so I thought I'd ask.
Yes, it will terminate the script and I think that's the right thing to do. If don't stop the process, It will downgrade the database and start rewriting the configuration files. The tool used for minor and major version upgrades is pg_upgrade, which does not support version downgrades.
FYI: I discussed the issue with @jpalermo in a Slack discussion and already tested the effects. If you don't stop the downgrade, you end up with a crashed instance and have to do a lot of work to restore the instance.
Can you please share the CI job issue of the acceptance test on your end? I will take a look and maybe we can solve the problem together.
Oh, you work for SAP, so you probably have access to the Cloudfoundry Concourse instance... sorry about that, I should have checked before writing my question.
The most recent failing build (which fails in the same way as all the others) is here: https://bosh.ci.cloudfoundry.org/teams/main/pipelines/postgres-release/jobs/run-acceptance-tests/builds/96
The failure is very strange. The SSH attempts for both the "Validating the postgres-previous is not created" and "Validating the postgres-previous is created" tests in the upgrade_single_node_test.go
file fail. Despite what it looks like, both tests are failing, but the "... is not created" test doesn't check to see that the exit status from ssh
is NOT 255 (which indicates some sort of ssh
failure, rather than failure of the command ssh
ran), it just tests if the exit status was non-zero.
The build immediately before it succeeded. (https://bosh.ci.cloudfoundry.org/teams/main/pipelines/postgres-release/jobs/run-acceptance-tests/builds/95) For that build, I pinned back the postgres-release
resource to 5e330a77ae691e37da4585c5e404d7a8acede23b
, which is the commit just before Final Release 52 was released.
The release of Final Release 52 might be unrelated to the acceptance test failure... however, I do notice that part of the test involves an upgrade from the ~previous~ [most recent] Final Release to the code in the repo... which means that the commit that releases Final Release 52 was (at least for part of the test suite) the first time upgrading from the code in Release 52 was tested.
Oh, you work for SAP, so you probably have access to the Cloudfoundry Concourse instance... sorry about that, I should have checked before writing my question.
The most recent failing build (which fails in the same way as all the others) is here: https://bosh.ci.cloudfoundry.org/teams/main/pipelines/postgres-release/jobs/run-acceptance-tests/builds/96
@klakin-pivotal No, problem. I have access to the corresponding concourse instance, but not to the specific test pipeline itself. Do you know which profile or authorization is required?
The failure is very strange. The SSH attempts for both the "Validating the postgres-previous is not created" and "Validating the postgres-previous is created" tests in the upgrade_single_node_test.go file fail. Despite what it looks like, both tests are failing, but the "... is not created" test doesn't check to see that the exit status from ssh is NOT 255 (which indicates some sort of ssh failure, rather than failure of the command ssh ran), it just tests if the exit status was non-zero.
The build immediately before it succeeded. (https://bosh.ci.cloudfoundry.org/teams/main/pipelines/postgres-release/jobs/run-acceptance-tests/builds/95) For that build, I pinned back the postgres-release resource to 5e330a77ae691e37da4585c5e404d7a8acede23b, which is the commit just before Final Release 52 was released.
The release of Final Release 52 might be unrelated to the acceptance test failure... however, I do notice that part of the test involves an upgrade from the previous [most recent] Final Release to the code in the repo... which means that the commit that releases Final Release 52 was (at least for part of the test suite) the first time upgrading from the code in Release 52 was tested.
That sounds interesting and I would like to try it out. The structure of the test is particularly interesting for me.
fix: #74
TODO:
I'm currently working with @Jobsby together on the topic.