Closed dradX closed 3 years ago
Thanks @dradX for this enhancement description with all the details. First, I would try to break down this issue into a set of smaller issues which will be easier to grasp and will allow developers to contribute frequently and speed up the code reviews.
First, we need to pay attention to the complexity of the comparison. Probably we all assume that both blueprints are very similar and both have the same root node, in which case this can be quite easy. Otherwise, xOpera must be able to abandon the job and issue a warning that this diff is too complex or that it cannot guarantee that there exists only one unique transform.
Second, we need to support "targeted undeploy" -> an action that undeploys only specific nodes and relationships. Note that this functionality might also be useful for manual intervention on the deployed project (e.g. the user would undeploy one node without a change in the corresponding blueprint (CSAR)). This means that the actual deployment would not reflect the terms stated in the CSAR. This is something that could be a bit strange as DI is not the same as the DB anymore. Should we handle this and how should we handle this?
Afterwards, I would suggest proceeding with small steps and constantly checking that we support the user stories and don't over-complicate.
What do you think, @anzoman , @sstanovnik ?
Hi @cankarm - thanks for your inputs and suggestions. I agree to proceed and split this into two separate issues targeting the compare and redeploy commands separately. As you correctly noticed, most of the preparation for redeployment will actually be implemented in the compare command, therefore it is also good to have a separate issue for it. Nevertheless, we would like to point these two issues to this overarching issue to bring more context for the user.
With regards to the second part of the comments - regarding the abandoning of the comparison if the blueprints differ too much - calculating when to quit the comparison is as complex as calculating the diffs, since it can be done only after the complete traversal of the two DAGs and calculating the diffs. Defining any rule about how much diff is too much could be equally hard or even harder. Therefore we plan to keep it really simple for now and just try to produce the diffs.
At this point we should also provide a checkpoint on redeploy to detect whether any changes in the state of the deployed instance happened between issuing the compare and redeploy commands - such as changes created by adhering to TOSCA policy rules that might introduce state changes, for instance. In this case we would need to warn the user that the state changed and the calculation is invalid.
As for the second part of the suggestion regarding partial changes - targeted undeploy - we would like to keep the workflow and commands (state) as consistent as possible with the current opera deploy/undeploy workflows, and therefore ask the user to produce a full new definition of the blueprint. We feel that this is also more coherent with the declarative TOSCA CSAR self-encapsulated blueprint approach.
We might address the partial undeploy option in a later separate issue having in mind the full consistency of this approach.
Anyhow I completely agree we should not over-complicate things and try to implement this variant and then introduce more complexity, if needed. Also any inputs from @matejart, @anzoman and @sstanovnik are more than welcome.
Let's go through the points
compare
and redeploy
I'm not sure that I get it. It seems that the redeploy should be able to detect if a diff is not valid anymore?
@cankarm Since we agree on the first point I will start with the second point.
Yes, you understood the issue, but I will try to explain a possible case again. When a user issues the compare command, opera executes diffs between the reconfigured set of inputs/blueprint and the current state of the model stored in .opera. Since this state can change because of a policy-triggered event or any other reason before the user calls redeploy, we should check if the state of the model is the same as when the user executed compare. If the state of the model has changed we should warn the user that a new diff calculation is needed since the state has changed.
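The check described above could be sketched roughly as follows. This is only an illustration under assumptions: it supposes the state is kept as plain files under a directory such as .opera, and the helpers `fingerprint_state` and `state_changed_since_compare` are hypothetical names, not part of opera. The `exclude` parameter is likewise an assumption, meant for skipping a comparison directory if it is stored inside the state directory.

```python
import hashlib
from pathlib import Path
from typing import Iterable


def fingerprint_state(state_dir: str, exclude: Iterable[str] = ()) -> str:
    """Return a SHA-256 digest over the relative paths and contents of all files."""
    root = Path(state_dir)
    skipped = set(exclude)
    digest = hashlib.sha256()
    # Sort the paths so the digest does not depend on directory listing order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        relative = path.relative_to(root)
        # Skip files located under any excluded directory name.
        if skipped.intersection(relative.parts):
            continue
        digest.update(relative.as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def state_changed_since_compare(
    state_dir: str, saved_fingerprint: str, exclude: Iterable[str] = ()
) -> bool:
    """Return True if the stored state no longer matches the saved fingerprint."""
    return fingerprint_state(state_dir, exclude) != saved_fingerprint
```

In this sketch, compare would save the fingerprint next to its results, and redeploy would call `state_changed_since_compare` first and warn the user if it returns True.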
What should happen on node delete is completely and entirely in the domain of the blueprint definition of the deployed instance, and the same goes for nodes added in the reconfigured blueprint, if any. The orchestrator is not the one that should amend or invent the process of reconfiguration. We can pose the same question about "what happens with this node on undeploy". If the specific node in the blueprint has any configure interface, this should be called as when executing delete. I will take the example you provided:
It's not simply delete the instance but it might include re-route the traffic or push all your data in a global database before you delete yourself
If a node in the deployed blueprint needs to store its data somewhere before deletion, this should be part of the node configuration (interface operation of the node) in the blueprint - no matter if this blueprint will ever be reconfigured or not. The same goes for the re-routing, for instance, when adding nodes to a load-balancer. It should be able to configure itself and account for the nodes added/deleted if they are properly added and configured in the blueprint. In this case we are also counting on the immutability of the executed playbooks, since there is nothing else opera can do. In any case, if the user is not satisfied with the results of compare, he/she can always abort redeployment by never calling redeploy. We might also add a command to remove the .compare directory in this case, if needed.
To be completely clear, we want to start with a simple redeployment and then check what can be done and to what degree, if we want to implement a smarter compare.
To me, the original description makes sense. Personally, I'd have an opera diff command, which outputs the difference between the state model and my updated service template + inputs. And then I'd have opera update, which computes the diff (regardless of whether I issued opera diff first or not) and applies it. I'm not sure if any additional files being stored in the deployment state are needed (sure, the internal representation of what it needs to do will be a derived thing, not a direct representation of the incoming service template), I would just treat the update as an in-line partial undeploy and partial deploy. But admittedly, this is all an implementation detail.
And in general, if an instance of a node template needs to change due to changed properties, the easier thing to do is to delete it and to create a new one. But perhaps there'll be a capability of doing an in-place change for particular node types that are capable of doing that.
Just a nitpick, but a name "redeploy" suggests to me I either deploy everything from scratch, throwing away the existing deployment, or create a new instance. But to "update" it, it suggests a more targeted change.
@dradX probably we will need a call.
This change triggered by the policy is something that I would leave for the future. I would not focus on that now as it prematurely complicates things. First a diff and then an update. The reason is that you might be unable to make a change if your application would be in live-lock with constant scaling up and down or just moving some nodes somewhere.
I totally agree with you about the cover story. My issue here is how to tackle the lifecycle operations:
Where will you put your redeploy (and I agree it should be update)? Node undeploy can be delete, but in the case of update, it could have a different workflow. We need to tackle this. I'm perfectly aware that it is on the user's side to create this and put it in the delete operation, but we need to find a standard way to do that. You will need to pass some parameter around, so that delete will know if it is a simple destroy-everything or just a scale-down.
@matejart thanks for these suggestions and expressed views. I would agree with you on almost all accounts - especially regarding the use of update - it perfectly fits what we are trying to do instead of redeploy. I would suggest we keep the compare command as diff is not a commonly used verb - although it is used often for this purpose. As you correctly noticed we would need to recalculate the changes once the user issues update. The reason why I would really stick to the workflow diff -> update is because doing it this way we show the user what will be done if/when we apply update. In this way the user explicitly (through the workflow) confirms the changes he/she wants to be applied to the deployed instance - and since there is no defined redeploy workflow in TOSCA that we can follow, in this case we eliminate any questions with regards to what will be done when executing update.
@cankarm we may have a call
--force/-f switch on update. I prefer diff over compare as it seems more used - but I might be wrong. At least from my side we always had diff and update mentioned in issues. In the end, it might not be so important.
@dradX implement proposed things and please don't do it in bulk. That's all that I'm trying to achieve by not paying attention to policy triggers at making diffs and updates. And use a realistic example.
Thank you @cankarm and @matejart, we will proceed with the implementation as agreed, first by adding two separate issues covering the commands diff and update and trying to take into account the maximum usability from the user perspective.
@dradX thanks for the detailed explanation regarding this feature. I agree on almost everything that was discussed before. So, just to sum up my thoughts:
opera diff and opera update seem cool to me, and as Matej stated, the update should do diff as well (we also wanted to have a -v/--verbose switch on every CLI command, so with update this could print out the calculated diff);
diff command should handle cases when the user has not deployed anything, searching for the deployed CSAR in the opera root_file;
-o/--output flag to output the calculated difference to a specified file (instead of printing it out to the console). How would this diff content be formatted? Probably JSON or YAML, right? If so, then we could also add a --format flag with output format options (json, yaml, etc.) so that the user will be able to get a suitable format (this was already implemented in the opera outputs CLI command so it can be easily reused);
.opera/comparison would be ok and you need to make sure that it gets overwritten if the user calls opera diff multiple times, always showing the latest diff content;
comparator or diff_checker) so that this part of the code will be easy to reuse and maintain;
opera undeploy - something like --targeted-instances/--targets-list/--targets/-t so the users can then specify just a list of nodes/relationships that will be targeted by the undeploy procedure. The cases when the user would just want to stop some of his instances (instead of deleting them right away) are a little more complicated and linked to this issue https://github.com/xlab-si/xopera-opera/issues/62. But we can also skip this special "calling single interface operations" feature for now as I believe that this is not the most wanted thing.
So, to conclude, I am looking forward to your comments and of course the upcoming issues (and PRs).
Thanks @anzoman
diff and update for nodes (and maybe relationships)
diff for empty deployment should probably return the contents of the new blueprint as added nodes. The update command in this case should be equal to the deploy command.
diff command output should be JSON or YAML, the same way it is done for the outputs command, yes. The exact structure of the output is not clear yet, I think it would be clarified later in the PRs.
diff command should not store anything after its execution and update would basically run diff one more time.
@cankarm for the sake of having custom workflows for node updates we may later introduce a new TOSCA interface derived from tosca.interfaces.node.lifecycle.Standard with a new update operation. If this operation is present in the node, it is executed for updating, otherwise xOpera goes with stop -> delete -> create -> configure -> start as @dradX mentioned.
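The fallback described above could look roughly like the sketch below. It is an assumption-laden illustration, not xOpera code: a node is modelled as a plain dict that maps operation names to callables, the names `plan_node_update` and `run_node_update` are hypothetical, and skipping fallback operations that the node does not define is a choice made for this sketch.

```python
from typing import Callable, Dict, List

# Fallback order taken from the discussion above.
FALLBACK_ORDER = ["stop", "delete", "create", "configure", "start"]

Operations = Dict[str, Callable[[], None]]


def plan_node_update(operations: Operations) -> List[str]:
    """Return the names of the operations to run in order to update one node."""
    # A dedicated update operation, when defined, replaces the whole fallback.
    if "update" in operations:
        return ["update"]
    # Otherwise walk the fallback order, skipping operations the node lacks.
    return [name for name in FALLBACK_ORDER if name in operations]


def run_node_update(operations: Operations) -> List[str]:
    """Execute the planned operations and return the names that were run."""
    executed = []
    for name in plan_node_update(operations):
        operations[name]()
        executed.append(name)
    return executed
```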
One thing that would be nice to clarify is what the input for opera update will be. Probably it would be ok if it is a blueprint or, in some cases, only a diff. At least intrinsically, opera will need only a diff to be instantiated.
All my fuss about the targeted undeploy is focusing on two steps:
delete operation. But as the application would still live after you delete a particular node, you might want to have a different delete than in the case where you completely undeploy/terminate the application - will this be done by a new operation or inside the existing delete but with a special playbook? In the latter case, we need to have an exact example of how that playbook knows which delete this is.
delete operation - at least for some cases this would be needed, in others it is already covered by the provider or the technology (e.g. for FaaS you don't need that). The orchestrator could make a random choice, but this might be unacceptable for some cases. As a default behaviour it would probably be best to take down the ones that were created the latest.
Why am I mentioning this? Because we already faced this update problem in the case of scaling. For scale-up, the update is executed as a set of create operations, and currently you can achieve this with clean state + deploy and you are done. The scale-down is more problematic as it cannot be done by opera undeploy with some tricks. This is an opera update operation that will execute some delete operations and also some create operations and occasionally also some reconfigures to fix what might have been ruined with the delete.
The reason why we used the term targeted undeploy is that we needed to name something that has similar undeploy capabilities as clean state + deploy has for deploy.
@cankarm I think the input for update would always be blueprint + inputs, as the diff alone would not be enough in most cases. Opera would need to instantiate not only the nodes that are changed but also the nodes that stay intact but serve as hosts for changed nodes. Basically the idea is to instantiate the whole blueprint graph, traverse it and skip the nodes that do not require any action.
@alexmaslenn, but if the node has to be changed, then it is also described in diff. Am I right?
@cankarm imagine the scenario when we have a VM with Docker engine and one Docker container as DI1 blueprint. DI2 blueprint simply introduces another Docker container to the existing instance model. So the diff result would be something like:
added:
- container2
But in order to properly deploy the new model, opera would need information about the VM and Docker engine, hence instantiate them in the update process.
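To illustrate why the diff alone is not enough, the sketch below collects every node on the host chain of a changed node. It is only an illustration: the `hosts` mapping (node name to host name, or None for a root node) and the function `nodes_to_instantiate` are hypothetical and do not reflect xOpera's internal model.

```python
from typing import Dict, Iterable, List, Optional


def nodes_to_instantiate(
    hosts: Dict[str, Optional[str]], changed: Iterable[str]
) -> List[str]:
    """Return the changed nodes plus all nodes on their host chains, hosts first."""
    needed: List[str] = []
    for node in changed:
        chain: List[str] = []
        current: Optional[str] = node
        # Walk up the host chain until a root, a known node or a cycle is reached.
        while current is not None and current not in needed and current not in chain:
            chain.append(current)
            current = hosts.get(current)
        # Reverse so that hosts come before the nodes they host.
        needed.extend(reversed(chain))
    return needed


# The scenario from the comment above: container2 is added to an existing model.
HOSTS = {
    "vm1": None,
    "docker_engine1": "vm1",
    "container1": "docker_engine1",
    "container2": "docker_engine1",
}
```

For the diff that lists only container2 as added, `nodes_to_instantiate(HOSTS, ["container2"])` also returns vm1 and docker_engine1, which is the information opera would be missing if it received the diff alone.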
@alexmaslenn are we talking about the text diff or the diff of two blueprints? I assume that the diff would actually include all the nodes that have been touched, not just the lines. I understand your point, but I started from a different angle. So I thought that in the diff you would omit only the parts of the "tree" in which there is no change at all, as this would be an input for an orchestrator to do something.
@cankarm then it is not clear to me what the diff would look like, so that it can both serve as an input for the opera update command and look like a meaningful diff.
@alexmaslenn - me neither. I understand that it is very convenient to see only the changes, and this must be done for visual presentation, but I'm not sure if it is enough for execution. For example, in the case of removing nodes, you will provide a new blueprint with less content. How will you determine which nodes will be removed? You will need to create an "executable diff" and then execute it. Or is there another way?
With an update process of removing nodes, it might also be the case that you would need (or be able) to add a custom delete implementation (remove operation) that might resolve some of our delete problems above, but this should then be done by adding some additional input, not only a new blueprint.
@cankarm well the idea is that the diff command would produce 2 separate results:
update operation
When the opera diff command is executed, the first result would be the output and the second one basically goes nowhere. When opera update is executed, the second result serves as the model for applying the desired changes, and the first one can be logged to the console in --verbose mode.
For the example discussed above the first would be something like:
diff:
  added:
    - container2
  changed:
  deleted:
and the second one would consist of 2 graphs (obviously with more info, types, properties, attributes, etc):
nodes_undeploy:
  vm1:
    state: initial #would not be undeployed
  docker_engine1:
    state: initial #would not be undeployed
    host: vm1
  container1:
    state: initial #would not be undeployed
    host: docker_engine1
nodes_deploy:
  vm1:
    state: started #would not be deployed
  docker_engine1:
    state: started #would not be deployed
    host: vm1
  container1:
    state: started #would not be deployed
    host: docker_engine1
  container2:
    state: initial #would be deployed
    host: docker_engine1
'started' and 'initial' are internal xOpera node instance states that indicate whether a node is already deployed or undeployed, respectively. Opera would use the second representation to instantiate graphs and proceed as follows:
opera undeploy for the first graph (in the example above would do nothing)
opera deploy for the second graph (in the example above would deploy container2)
This is definitely not the final design but rather the concept to start with.
UPD: Made clarifications for second representation because it was confusing as @cankarm rightly noted
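The two passes described above can be sketched with plain dictionaries. This is a simplified illustration of the concept only: the dict layout mirrors the example graphs, and `select_by_state`, `nodes_to_undeploy` and `nodes_to_deploy` are hypothetical helpers rather than xOpera functions.

```python
from typing import Dict, List

NodeGraph = Dict[str, Dict[str, str]]

# The two graphs from the example above, reduced to state and host.
NODES_UNDEPLOY: NodeGraph = {
    "vm1": {"state": "initial"},
    "docker_engine1": {"state": "initial", "host": "vm1"},
    "container1": {"state": "initial", "host": "docker_engine1"},
}
NODES_DEPLOY: NodeGraph = {
    "vm1": {"state": "started"},
    "docker_engine1": {"state": "started", "host": "vm1"},
    "container1": {"state": "started", "host": "docker_engine1"},
    "container2": {"state": "initial", "host": "docker_engine1"},
}


def select_by_state(graph: NodeGraph, state: str) -> List[str]:
    """Return the names of all nodes that are in the given state."""
    return [name for name, node in graph.items() if node["state"] == state]


def nodes_to_undeploy(graph: NodeGraph) -> List[str]:
    """Undeploy acts only on nodes still marked as started."""
    return select_by_state(graph, "started")


def nodes_to_deploy(graph: NodeGraph) -> List[str]:
    """Deploy acts only on nodes still marked as initial."""
    return select_by_state(graph, "initial")
```

With the example data, the undeploy pass selects no nodes and the deploy pass selects only container2, matching the behaviour described in the comment.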
@alexmaslenn thanks for that explanation. So the result is that you will not touch anything from the first graph (as the initial would be ignored for undeploy) and in the second one you would ignore all started and deploy only initial as this is the way the deploy works.
It seems promising and I like it except the confusion that can result from the state names.
@alexmaslenn I put together a diagram, that might help to understand all steps. Please review it and comment. We can update with your ideas.
@cankarm the implementation of this enhancement would be based on functionality that xOpera currently possesses. As far as I'm aware there is no part of xOpera that allows monitoring or getting feedback from the infrastructure once it is deployed. So the answer to Q2 is - diffs are made against the internal representation of the state, as it is the only info currently available to xOpera.
As for Q1, if a Policy Trigger makes changes in the internal representation as shown on the diagram, then these changes would be reflected in the diff operation with the Day 1 blueprint.
Now we have the first version of opera diff (introduced with #147). The dev opera release that includes this new CLI command is already available on the Test PyPI instance.
Good news - both opera diff and opera update commands are now available within the latest opera pre-release on Test PyPI here: https://test.pypi.org/project/opera/0.6.4.dev8/.
Great!
If anyone can provide the documentation and example, that will be nice. A separate branch and then we merge it to main-docs branch.
The documentation for both commands will be added with #172. I think that we realized all the plans concerning this long issue so I'm closing it now.
Description
In this issue we describe a proposal for enabling opera to compare the existing Deployed Instance model (DI1) with a changed/reconfigured Deployable Instance model (DI2) defined with a new version of the blueprint (DB2) and supplied set of inputs I2 where DI2=calcdeploy(DB2,I2).
Assumptions - A deployed instance is running and the instance model is stored in the .opera directory, enabling the user to undeploy the currently running deployed instance with opera undeploy.
User story - However, the user does not want to undeploy the whole deployed instance but rather patch/reconfigure the existing Deployed Instance (DI1) with a changed set of blueprint and inputs. Before applying changes to the Deployed Instance the user wants to calculate differences between the existing Deployed Instance (DI1) and the target Deployable Instance (DI2).
Introduction of compare command:
opera compare templ-v2.csar -i input-v2.yaml
Opera calculates the internal set of topology changes needed to satisfy the desired reconfiguration Diff=compare(DI1,DI2) and outputs a list of changes between the existing deployed instance (DI1) and the supplied changed deployable instance (DI2). If the user is satisfied with the calculated differences he/she will be able to confirm and execute the calculated changes by issuing a new command for executing reconfiguration DI2=redeploy(Diff).
Introduction of redeploy command:
opera redeploy
Opera executes the internal set of calculated changes of the topology and saves the results of execution in the .opera folder.
Steps
Implementation of compare
The user provides a new desired definition of the deployed instance through a set of changed blueprint and inputs and issues the compare command:
opera compare templ-v2.csar -i input-v2.yaml
Opera creates a desired Deployment Execution Graph (DEG2) from templ-v2.csar and input-v2.yaml. Opera instantiates the existing Deployment Execution Graph (DEG1 - as done in undeploy) and starts executing the comparison of the two graphs' nodes and edges. The nodes/edges in DEG2 can be either unchanged/changed/added/deleted with respect to DEG1. For every node in DEG2, compare tries to find the node with the same node name in DEG1:
For the sake of understanding we introduce the concept of a node in the changed group/tag changed_deleted/changed_added which represents the node version:
In the process of grouping/tagging all nodes from DEG1/DEG2 we can set up the preparation of the two graphs, Deploy Execution Graph Delete (DEGD) and Deploy Execution Graph Add (DEGA):
Both DEGD and DEGA will be created and stored in the .opera/.comparison directory as done during the deploy command. The DEGD will hold the deployment graph based on DEG1 (Day1 execution), so unchanged nodes/edges will be marked with the initial state, thus eliminating the need to undeploy them again, while the changed-deleted/deleted ones will be marked with the started state so opera should delete them on undeploy. The same inverse concept of desired state will be used for nodes/edges in the DEGA deployment graph (Day2 execution) - unchanged nodes/edges will be marked with the started state so opera should skip their deployment, while the changed-added/added nodes/edges will be marked as initial so opera should add them on deploy.
Compare Output - output can be written to a file using the -f or --file-output switch - a copy of the output will be stored in the .comparison directory as output.txt. It will produce a "diff"-like output of the topology with marked deletions and additions.
opera compare templ-v2.csar -i input-v2.yaml -f ./outputs/myoutput.txt
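The grouping of nodes by name and the state marking for DEGD and DEGA could be sketched as below. This is a rough illustration under assumptions: each graph is reduced to a dict from node name to a definition dict, a node counts as changed when the two definitions differ, and `classify_nodes` and `desired_states` are hypothetical names. Edges are left out, and so is propagating a change from a host to the nodes it hosts.

```python
from typing import Any, Dict, List, Tuple

Graph = Dict[str, Dict[str, Any]]


def classify_nodes(deg1: Graph, deg2: Graph) -> Dict[str, List[str]]:
    """Group node names into unchanged/changed/added/deleted, matching by name."""
    groups: Dict[str, List[str]] = {
        "unchanged": [], "changed": [], "added": [], "deleted": [],
    }
    for name, definition in deg2.items():
        if name not in deg1:
            groups["added"].append(name)
        elif deg1[name] == definition:
            groups["unchanged"].append(name)
        else:
            groups["changed"].append(name)
    # Nodes that exist only in the deployed graph have been deleted.
    groups["deleted"] = [name for name in deg1 if name not in deg2]
    return groups


def desired_states(deg1: Graph, deg2: Graph) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return the node states for DEGD (undeploy pass) and DEGA (deploy pass)."""
    unchanged = set(classify_nodes(deg1, deg2)["unchanged"])
    # DEGD: unchanged nodes look already undeployed; the rest will be deleted.
    degd = {name: "initial" if name in unchanged else "started" for name in deg1}
    # DEGA: unchanged nodes look already deployed; the rest will be created.
    dega = {name: "started" if name in unchanged else "initial" for name in deg2}
    return degd, dega
```

In this sketch a changed node appears in both results: as started in DEGD (its changed_deleted version) and as initial in DEGA (its changed_added version).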
We would need to implement the new command in the /src/opera/commands/ folder.
Implementation of redeploy
The user might agree with the suggested comparison by inspecting outputs or not. The user can approve the suggested reconfiguration of the existing deployment by inspecting the "output.txt" and confirm the redeployment with a separate redeploy command. Since the comparison data and execution graphs are already stored in the .comparison directory the user should be able to redeploy the wanted changes by issuing just:
opera redeploy
Opera would take the data stored in .comparison, the Deployment Execution Graph Delete (DEGD) and the Deployment Execution Graph Add (practically built from the DEG2 with state of the nodes applied), and execute this order of commands
After this the state of the redeployed instance would be stored in .opera as done after the execution of the deploy command.
Current behaviour
Currently, opera does not provide a way to patch/reconfigure an existing deployed instance without undeploying the topology as a whole. Using opera deploy with -f or --force a partial redeployment could be done, but no nodes can be removed from the existing deployment instance and the update of the nodes relies only on the immutability of the executor script (Ansible playbook) implementation.
Expected results
The user is able to patch/reconfigure an existing deployed instance without having to undeploy the whole running instance by:
mycsar-v2.csar, input-v2.yaml),
opera compare mycsar-v2.csar -i input-v2.yaml for the calculation of Diffs stored in .opera/.compare and presented to the user as output,
opera redeploy if the Diffs presented in output satisfy the expected topology changes to execute the deployed instance reconfiguration.
The results of the executed reconfiguration are stored in .opera