Add Q2 switch to exit on failed deploy

ar commented 1 year ago

A valid suggestion came from a jPOS user to enable Q2 to terminate upon encountering an error during deployment.

There are scenarios where it could be beneficial for Q2 to halt if a QBean fails to initialize. Nevertheless, we may want to distinguish between terminating during the startup process and ceasing operation if, for some reason, the system is running and a QBean becomes misconfigured.

I suggest adding a new startup option:

--exit-on-failed-deploy=startup|always
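To make the two proposed values concrete, here is a minimal sketch of how such a switch could be interpreted. It is not jPOS code: the enum `ExitOnFailedDeploy`, its `fromArgs` and `shouldExit` methods, and the `NO` value used when the switch is absent are all hypothetical names chosen for illustration.

```java
import java.util.Locale;

/** Hypothetical policy values for the proposed --exit-on-failed-deploy switch. */
enum ExitOnFailedDeploy {
    NO,       // assumed behavior when the switch is absent: never exit
    STARTUP,  // exit only if a deploy fails during the initial startup scan
    ALWAYS;   // exit on any failed deploy, including later hot deploys

    static final String FLAG = "--exit-on-failed-deploy=";

    /** Scans raw command-line arguments for the switch; returns NO when it is absent. */
    static ExitOnFailedDeploy fromArgs(String[] args) {
        for (String arg : args) {
            if (arg.startsWith(FLAG)) {
                String value = arg.substring(FLAG.length()).toUpperCase(Locale.ROOT);
                if (value.equals("STARTUP")) return STARTUP;
                if (value.equals("ALWAYS")) return ALWAYS;
                throw new IllegalArgumentException("Expected startup|always but got: " + arg);
            }
        }
        return NO;
    }

    /** True if a failed deploy should terminate the process in the given phase. */
    boolean shouldExit(boolean duringStartup) {
        return this == ALWAYS || (this == STARTUP && duringStartup);
    }
}
```

With this shape, the deploy loop would only need to know whether it is still in the startup scan and ask `shouldExit(duringStartup)` after a failure.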

alcarraz commented 1 year ago

Maybe not all the QBeans are critical. Another option (not saying better) would be to flag individual QBeans as critical, perhaps through a QBean attribute, e.g.:

<qbean name="xxx" exit-on-failed-deploy="[startup|always]">

alcarraz commented 1 year ago

The command-line option could set the default value, and when it is not set, the default could be no.
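A rough sketch of that precedence, assuming the per-QBean attribute wins over the command-line default and that "no" applies when neither is given. The class `DeployExitResolver`, its `Policy` enum and the `resolve` method are hypothetical and not part of jPOS.

```java
import java.util.Locale;

/** Hypothetical helper that resolves the effective policy for one QBean. */
final class DeployExitResolver {
    enum Policy { NO, STARTUP, ALWAYS }

    private DeployExitResolver() { }

    /**
     * @param attributeValue     value of the proposed exit-on-failed-deploy
     *                           attribute, or null if the descriptor omits it
     * @param commandLineDefault default taken from the command line, or null
     *                           if the switch was not given
     */
    static Policy resolve(String attributeValue, Policy commandLineDefault) {
        if (attributeValue != null && !attributeValue.trim().isEmpty()) {
            // The per-bean attribute overrides whatever the command line says.
            return Policy.valueOf(attributeValue.trim().toUpperCase(Locale.ROOT));
        }
        // No attribute: fall back to the command line, then to "no".
        return commandLineDefault != null ? commandLineDefault : Policy.NO;
    }
}
```

For example, `resolve("always", Policy.STARTUP)` yields `ALWAYS`, while `resolve(null, null)` yields `NO`.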

aVolpe commented 1 year ago

I was working on a feature similar to this (https://github.com/fintechworks/jPOS/commit/8aa75f75672a231a563bd3c5d5f80b62965f2994). I can continue that work if we can define some behaviors:

Should the server stop if we deploy an invalid descriptor while it is running?
Should the server stop if the QBean throws on init or start?
How should async QBeans be handled?

Kubernetes has a related pair of concepts, readiness and liveness; see https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/.

Some QBeans could be required for the server to be 'alive' and others for it to be 'ready'. For example, a server is ready if the txnmgr and the qserver beans are running, but it's alive only if the db connection is active.

We can use those concepts to model when to stop Q2 after startup.
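As a rough illustration of that model, the sketch below derives 'ready' and 'alive' from the set of beans currently running. Everything in it is hypothetical: the class `ProbeSketch`, the bean names `txnmgr`, `qserver` and `db`, and the idea of passing the running state as a map are assumptions made for the example, not existing Q2 behavior.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical readiness/liveness model based on which beans are running. */
final class ProbeSketch {
    // Assumed grouping, following the example in the comment above.
    static final Set<String> READY_BEANS = Set.of("txnmgr", "qserver");
    static final Set<String> ALIVE_BEANS = Set.of("db");

    private ProbeSketch() { }

    /** True only if every required bean is present in the map and marked running. */
    static boolean allRunning(Set<String> required, Map<String, Boolean> running) {
        for (String name : required) {
            if (!running.getOrDefault(name, false)) {
                return false;
            }
        }
        return true;
    }

    static boolean isReady(Map<String, Boolean> running) {
        return allRunning(READY_BEANS, running);
    }

    static boolean isAlive(Map<String, Boolean> running) {
        return allRunning(ALIVE_BEANS, running);
    }
}
```

Under this sketch, Q2 could decide to stop after startup when `isAlive` turns false, while a false `isReady` would only signal that traffic should not be routed yet.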

alcarraz commented 1 year ago

Maybe not all the QBeans are critical. Another option (not saying better) would be to flag individual QBeans as critical, perhaps through a QBean attribute, e.g.:
<qbean name="xxx" exit-on-failed-deploy="[startup|always]">

Oops, my bad. I just noticed that this wouldn't work if the deploy fails because of bad XML. We could still handle cases where the failure has another cause, but it probably isn't worth it.

ar commented 1 year ago

I understand your --ff option is very similar to what we are suggesting here, and I like it. There's still a difference between failing at startup versus always. I also liked @alcarraz's suggestion and was disappointed, the same as you, Andres, when we realized that we could fail at the XML level.

I believe the main concern of the user who raised this issue is a situation where you start the system and the TM fails, so you have servers up and no TM running.

I think the liveness/readiness status has to be addressed at another level, and may need cooperation from the QBeans, with clear indications to the system about their status. It is probably something to start with a JEP discussion before implementing it.

aVolpe commented 1 year ago

The transaction manager loads all participants in the init method and extends QBeanSupport, so no exception is thrown. But the init method doesn't mark the deployment as failed; it only logs a warning. So we need to mark this bean as failed, or use the status -1 that QBeanSupport uses to mark the bean as 'invalid'.

We can add these checks in the Q2 to determine if we need to fail:

In the catch block, as in the ff PR, so that invalid XML is caught: any exception is marked as a failure.
If, after the deploy but before the start, the QBean is in an invalid state, fail.
After the start, if the bean is a QBean and its status is not STARTED, fail.
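The three checks above can be sketched as a single deploy routine. This is an illustration only, not Q2's deploy code: the `Bean` interface is a hypothetical stand-in for a QBean, `FakeBean` exists just to exercise the checks, `INVALID = -1` follows the status mentioned above, and `STARTED = 3` is an assumed placeholder for the started state.

```java
import java.util.Optional;

/** Hypothetical sketch of the three failure checks around a deploy. */
final class DeployChecks {
    static final int INVALID = -1; // status described above as 'invalid'
    static final int STARTED = 3;  // assumption: numeric value of the started state

    /** Minimal stand-in for a QBean; not the jPOS interface. */
    interface Bean {
        void init() throws Exception;
        void start() throws Exception;
        int getState();
    }

    private DeployChecks() { }

    /** Returns a failure reason, or empty if the bean deployed and started. */
    static Optional<String> deploy(Bean bean) {
        try {
            bean.init();
            // Check 2: after the deploy but before the start, the state must be valid.
            if (bean.getState() == INVALID) {
                return Optional.of("invalid state after init");
            }
            bean.start();
        } catch (Exception e) {
            // Check 1: any exception (bad descriptor, configuration error) is a failure.
            return Optional.of("exception: " + e.getMessage());
        }
        // Check 3: after the start, the bean must report STARTED.
        if (bean.getState() != STARTED) {
            return Optional.of("state after start is " + bean.getState());
        }
        return Optional.empty();
    }

    /** Test double whose behavior is fixed by its constructor arguments. */
    static final class FakeBean implements Bean {
        private final boolean throwOnInit;
        private final int stateAfterInit;
        private final int stateAfterStart;
        private int state = INVALID;

        FakeBean(boolean throwOnInit, int stateAfterInit, int stateAfterStart) {
            this.throwOnInit = throwOnInit;
            this.stateAfterInit = stateAfterInit;
            this.stateAfterStart = stateAfterStart;
        }

        @Override public void init() throws Exception {
            if (throwOnInit) throw new Exception("init failed");
            state = stateAfterInit;
        }

        @Override public void start() { state = stateAfterStart; }

        @Override public int getState() { return state; }
    }
}
```

A non-empty result would then be combined with the startup or always policy to decide whether Q2 should exit.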

ar commented 1 year ago

I think what we are addressing here are situations where the TM would drop the QBean descriptor altogether, renaming it as .BAD or .DUP, or situations where the QBean raises a ConfigurationException.

jpos / jPOS

Add Q2 switch to exit on failed deploy #553