arobirosa / areco-deployment-script-manager

Simple but powerful Patch system for SAP Commerce Cloud (hybris)
Apache License 2.0
12 stars 5 forks source link

Deployment with migrate data fails after migration to the latest version #33

Closed miwoe closed 10 months ago

miwoe commented 1 year ago

Describe the bug If we try to migrate to the latest version, we face issues with the deployment to the SCCv2. Every deployment with "migrate data" fails. According to the logs, everything seems to be fine, but migrations ends with type system update failed. The last message in the log is

"Is the Areco Deployment Manager extension turned on in the tenant <>:true" Then next message: «shutting down hybris registry..

If I deploy without migration and then update the typesystem via HAC, it works, but also every further deployment with migate data fails.

I can also confirm that deployment with "Migrate data" works again, if we would fall back to the older version. So, there is definitely an issue with the new areco code.

To Reproduce Steps to reproduce the behavior:

  1. Update Areco extensions to the latest version.
  2. Deploy to SCCv2 with migrate data

Expected behavior Deployments with Migrate Data work

Log entries "Is the Areco Deployment Manager extension turned on in the tenant <>:true" Then next message: «shutting down hybris registry..

Screenshots

image
arobirosa commented 1 year ago

Hi @miwoe ,

thanks for reporting this bug! Unfortunately I analyzed the code and I can only guess that something with the configuration import of the Areco Deployment Manager is failing. But we need to activate the logging level debug for the class org.areco.ecommerce.deploymentscripts.core.impl.ImpexArecoInitialConfigurationImporter to be sure.

Let's have a call together where we see the behavior of the deployment manager on the SAP cloud.

Cheers!

miwoe commented 1 year ago

HI Antonio,

in the meantime I have realized the existence of the buildcallbacks.xml in the arecodeploymentscriptsbackoffice which is shutting down Hybris if the last script was successful.

That is also the problem. The deployment in the cloud is performing a "Migrate data" without performing any project data => without areco scripts, but of course it is executing the ant buildcallbacks. Therfore, although the deployment hasn't triggered the areco script executition, the buildcallback is checking if the last execution was successful. This could be an execution which was executed hours or days ago.

For us, we will prefer to remove the validation in the buildcallbacks, because we execute the areco scripts only via HAC. In a EnPremise environment, it might be useful, but in the CCv2 the validation ever validates the previous deployment and not the current deployment.

BTW. the new features will be very helpful.

BR, Michael

arobirosa commented 1 year ago

Hi @miwoe ,

your last comment is very valuable! Yes, I did some changes regarding the start of the deployment scripts inside the junit tenant and this might be the cause of the error on the SAP cloud. And I don't like that the buildcallbacks are shutting down the tomcat server. I will try during the weekend to understand where is the error.

I also need to see how to check if the last deployment script failed without using callbacks.

I will comment here, as soon as I have news.

Thanks a lot!

arobirosa commented 1 year ago

Hi @miwoe ,

is the junit tenant configured on your project? When you go to HAC, are there one or two tenants on DEV on the SAP cloud?

Thanks!

miwoe commented 1 year ago

Hi Antonio,

sry for the late answer. Only master tenant is listed in the HAC.

BR, Michael

miwoe commented 1 year ago

Hey Antonio,

we had also some issue about inital-configuration.impex. The data there should be tackled by essential-data. Then migration would be possible without any manual steps in between.

arobirosa commented 1 year ago

@miwoe informed that they are not using the junit tenant. And I know the project, there are no working integration tests.

I will start today comparing the build call backs before and after my changes.

arobirosa commented 1 year ago

There wasn't any changes on the buildcallback on version 1.5.4 but the handling of the code recognizing if the latest areco script has failed was improved.

The callbacks are currently required by the ant standalone commands.

I need to further check what happens if the inital-configuration.impex fails. Because in my current project we aren't using the areco deployment manager, I can't test with the SAP cloud.

trilokkatni commented 1 year ago

Logic in checkIfTheDeploymentScriptsWereSuccessful trying to call shutdown() if there is any error in script result which doesn't make sense to me. I think we should make a configurable option if someone wants to fail deployment or not. In bigger projects chances of failing script due to data are high and because of that stopping the whole deployment seems to be a issue. Please add to the repo and I will try to provide a fix for it. boolean wasLastScriptSuccessful = starter.wasLastScriptSuccessful(); de.hybris.platform.util.RedeployUtilities.shutdown(); System.exit(wasLastScriptSuccessful ? 0 : 1);

arobirosa commented 1 year ago

Hi @miwoe,

I reduced the priority of this bug because I got a confirmation that the latest version is working correctly in another project using SAP cloud. Mo, head of your devop team, also confirmed that he don't know of any issue with the deployment manager.

To find out fast, why the deployment is failing, I would like to see the tables after the failure. Please send me an email to arrange a meeting next week if you are available. This week I am on holidays and I don't have time.

Anyway, I would like to replace the System.exit() with a softer way to send a signal that the deployment has fails as it happens when a test fails.

arobirosa commented 1 year ago

Logic in checkIfTheDeploymentScriptsWereSuccessful trying to call shutdown() if there is any error in script result which doesn't make sense to me. I think we should make a configurable option if someone wants to fail deployment or not. In bigger projects chances of failing script due to data are high and because of that stopping the whole deployment seems to be a issue. Please add to the repo and I will try to provide a fix for it. boolean wasLastScriptSuccessful = starter.wasLastScriptSuccessful(); de.hybris.platform.util.RedeployUtilities.shutdown(); System.exit(wasLastScriptSuccessful ? 0 : 1);

Hi @trilokkatni ,

thanks for your contribution!

In those projects, I don't recommend the use of the Areco Deployment Manager: "The customer and the project manager must see the correction of these issues as an investment in the stability and the prevention of bugs in production. If this condition is not met, the Areco Deployment Manager extension can't be used in your project.".

The above code stops the ant builds after the update running system and it is required to make the developers aware that some script failed when they use "ant updatesystem".

If you want to ignore errors in the areco scripts during the deployment use the deployment manager, please create a feature request explaining how it would improve the development process. Feel free to clone this repository and prepare a pull request if you want.

Cheers!

arobirosa commented 1 year ago

Now that I am not working in any hybris project, I have time to work on this ticket.

I just reviewed my changes from the last release and I found that any failure to import the states of the scripts won't be detected and the code will always fail later when checking if the areco deployment scripts were ran. This could be the original cause of the error reported on this ticket.

I would like to replace the call to System.exit() with something softer or make configurable if the deployment is stopped when running on the cloud.

Finally I see that a second log is created for the execution of deployment scripts. This won't work on the SAP cloud.

I will do these changes.

arobirosa commented 1 year ago

Hi @miwoe ,

I implemented your proposal and just did the following changes:

  1. Now if the configuration property "deploymentscripts.stopantonerror" is false, the ant build process won't fail if there was an error running a deployment manager
  2. Removal of all calls to System.exit() on the build callbacks. Now exceptions are thrown which is the softer method which I found
  3. If there is an error importing the initial configuration required by the Areco Deployment Manager, a clear error message will be displayed on HAC and the console. I didn't move this configuration to "create essential data" because this won't simplify my code
  4. Removal of the log ${HYBRIS_LOG_DIR}/deploymentScripts.log because it is not accessible in the SAP cloud

Now I will finish testing these changes, implement 2 other issues and prepare a release for you.

Cheers!

arobirosa commented 1 year ago

Hi @miwoe ,

if you agree with these changes, could you please close this ticket. We can also wait for you to test the new release 1.5.5.

Cheers!

arobirosa commented 10 months ago

Hi @miwoe ,

I will close this ticket which was solved and released in November.

If you prefer to keep it open, please let me know.

Cheers!