Opera init CLI command may be unnecessary

anzoman commented 4 years ago

Description

This issue is meant for discussing the opera init command, which might be unneeded.

The latest version 0.6.1 of opera TOSCA orchestrator provides the following CLI commands:

usage: opera [-h] {deploy,init,outputs,undeploy,validate} ...

opera orchestrator

positional arguments:
  {deploy,init,outputs,undeploy,validate}
    deploy              Deploy service template from CSAR
    init                Initialize the deployment environment for the service template or CSAR
    outputs             Retrieve service template outputs
    undeploy            Undeploy service template
    validate            Validate service template from CSAR

optional arguments:
  -h, --help            show this help message and exit

When a user wants to deploy a (compressed) CSAR, the logical way to do it with opera is to use the commands below.

# initialize the CSAR (from the prepared compressed CSAR file, also provide inputs)
opera init --inputs inputs.yaml test.csar

# deploy the initialized CSAR
opera deploy

This is all ok. Opera init checks the CSAR structure and prepares root_file and input files in opera storage and deploy initiates the deployment.

However, when you deploy a service template, you can either use init first or deploy directly from the service template with opera deploy service.yaml, so init is a totally unnecessary step here. The same could apply to zipped CSARs. Basically, init doesn't do anything special: it just checks the CSAR structure, which is done by a separate CSAR class that can be imported anywhere. So I thought, why would we complicate our lives? We could move the CSAR structure check to the deploy CLI command and have opera deploy for everything. That way the user wouldn't need the additional init step and would just run opera deploy my.csar to deploy a compressed CSAR. It has also become evident that many users were confused by init: they completely forgot it when deploying their pre-packed CSARs, because they were used to running just opera deploy with all their extracted CSARs and TOSCA service templates.
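To make the "opera deploy for everything" idea concrete, here is a minimal sketch of how a single command could tell a compressed CSAR from a plain service template. The function name classify_deploy_target is hypothetical and not part of opera, and the sketch assumes that compressed CSARs are zip archives and that service templates use a .yaml or .yml extension.

```python
import zipfile
from pathlib import Path


def classify_deploy_target(path):
    """Return "csar" for a zip archive or "service-template" for a YAML file.

    Hypothetical helper for illustration only; not part of opera.
    """
    target = Path(path)
    if not target.is_file():
        raise ValueError(f"{path} is not an existing file")
    # is_zipfile inspects the file contents, so the extension does not matter.
    if zipfile.is_zipfile(str(target)):
        return "csar"
    if target.suffix.lower() in (".yaml", ".yml"):
        return "service-template"
    raise ValueError(f"{path} is neither a zipped CSAR nor a YAML service template")
```

With a check like this at the start of deploy, the CSAR structure validation would only need to run on the "csar" branch, and no separate init step would be required.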

I can sense that someone might think that opera init is good when we want to distinguish between compressed and extracted CSARs. I believe that init could be useful if we had some additional options for CSARs that cannot be done before or at the start of the deployment and that would consume a lot of time, but currently I don't see that coming. One more good reason to throw init away is the opera validate command, which could and will be redesigned to validate not just TOSCA service templates but also compressed CSARs. That will be another place where we use our CSAR validator class, and it would then be completely unnecessary to have this in init.

Steps

If we remove opera init, we would have to move CSAR validation to the opera deploy command.

Current behaviour

Right now opera has the opera init command to initialize and prepare a compressed CSAR for deployment.

Expected results

To be able to use opera deploy to deploy both extracted and compressed TOSCA CSARs.

anzoman commented 4 years ago

@cankarm, @sstanovnik, @dradx, @alexmaslenn, @matejart, @tadeboro it would be great if you could share your thoughts on this.

cankarm commented 4 years ago

Just two thoughts, so that we do not rush into this change prematurely:

Wouldn't it be better to have a separate validate command, instead of putting validation together with init or deploy? I see the point of having it together with deploy anyway, but using it separately would make some sense too.
With this change, will you always need to add inputs (if they exist) inline? What I liked about init was that I set up the running parameters (e.g. which input file is needed) and afterwards only deploy was really needed. So I do like switches, but I also don't mind the convenience of "init". Sure, I don't need the init command for this. A config file that defines the "preferred switches" for my deploy would be enough.

anzoman commented 4 years ago

@cankarm thanks for your thoughts.

We already have a completely separate validate command, which currently only works for TOSCA service templates and extracted CSARs. So now you can do opera validate -i inputs.yaml service.yaml. With some minor changes to the opera commands that I'm preparing right now, it will also be possible to validate a compressed CSAR with opera validate -i inputs.yaml test.csar.
Yes. Right now you can supply inputs when you run opera init or later when you run opera deploy; without init you would supply them only once, when you run opera deploy.

dradX commented 4 years ago

@anzoman Thank you for this initiative regarding the usage of init command.

I tried to point out the two issues I had with the init command before its implementation:

taking inputs before deploy/undeploy - for standard usage this is not needed, as the inputs should be supplied in a JIT manner when needed by the orchestrator, i.e. on deploy/undeploy
making the init command a "standard" entry point for CSAR extraction, even though the name init does not imply unpacking a CSAR

See the previous comments: implementation suggestion and init questions.

I agree that a CSAR is just another way of representing the service template (zipped format), and I support a separate validate command that takes either a CSAR or service-template.yaml (with its directory structure). The same goes for deploying a CSAR - I guess that in this case we would call deploy with:

opera deploy --inputs inputs.yaml --instance-path ./test-dir test.csar

This implementation would still be in line with the simple deploy/undeploy workflow and would make the usage of inputs clear.

anzoman commented 4 years ago

Thanks for explaining your view @dradX. I agree and IMHO init can be kept only if we have some clear plans to use it and if we would require a special step before the deployment. But as things are now, it seems that we don't need it at all.

sstanovnik commented 4 years ago

I also think init is unnecessary, because it doesn't really do anything useful. Having multiple ways of doing the same thing is confusing to the user and difficult to implement correctly - see the nested .opera/ issue we had.

My vote goes to:

removing init
also making sure the whole application behaves consistently and isn't using the current working directory anywhere implicitly
unzipping the CSAR into a temporary directory, e.g. in /tmp/, but not in .opera/
- optionally delete the thing after execution (only on success maybe?)
not using implicit inputs stored in .opera/inputs - I don't really even like storing them, but there might be something I'm not seeing
as far as I can tell we need to change deploy slightly: from opera deploy <csar|service-template> to both
- opera deploy service-template
- opera deploy <csar> [service-template]
- this is because you can now deploy a csar which may or may not have a default entrypoint that you may or may not want to override

As a final result of what user interaction should look like, from my pov - the following should all be equivalent and should result in files generated in the same locations, except for the extracted csar:

# assumption: the csar contains a default entrypoint

# 1) doing things manually and implicitly
unzip test.csar
opera deploy -i my-inputs.yml

# 2) doing things manually and explicitly (note that cwd is never used)
unzip test.csar -d path/to/extracted/csar/
cp /home/me/my-inputs.yml /srv/my-inputs.yml
cd /usr/share/dnsmasq-base/
opera deploy -i /srv/my-inputs.yml --instance-path /srv/.opera/ path/to/extracted/csar/my-st.yml

# 3) doing everything automatically, autoextract to somewhere in /tmp/
opera deploy -i my-inputs.yml test.csar

cankarm commented 4 years ago

A few comments to @sstanovnik :

Unzipping to /tmp/ and deleting:
- I do like the idea. I also think that you might need to delete after execution, so that you don't have problems deploying the same CSAR as a different project that might exist in parallel on your machine. But here is also a question: can you delete this if the application is still running? It seems that it might be beneficial for Opera to keep a copy of the initial CSAR, but this does not mean it should be unpacked, sure.
opera deploy example.csar example_service_template.yaml would do what? Override entry point?

sstanovnik commented 4 years ago

There wouldn't be a problem with deploying the same CSAR, as a new temporary directory would be generated that is not based on the CSAR's name or contents at all. It can be deleted at any point because the packed CSAR still exists, so the extracted copy is redundant.
Yes.

cankarm commented 4 years ago

... and you would need to store this state (temp dir location) somewhere in .opera. That's OK.

sstanovnik commented 4 years ago

No, you wouldn't need to, it's not state. The extracted directory is ephemeral because it only needs to exist during a single command, so no need to memorise the location. If you don't want to extract the CSAR every time for optimisation purposes (but how often do you even do that? twice?), you could store the location. However, if you're that conscious about extracting something into tmpfs multiple times, that's where manually unzipping comes into play I think.
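As a rough illustration of the ephemeral extraction described above, the following sketch unpacks a zipped CSAR into a freshly generated temporary directory that only lives for the duration of one command. The name extracted_csar and the directory prefix are hypothetical and not taken from opera; the sketch assumes the CSAR is a zip archive.

```python
import tempfile
import zipfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def extracted_csar(csar_path):
    """Yield a temporary directory holding the unpacked CSAR, then delete it.

    Hypothetical helper for illustration only; not part of opera.
    """
    # The directory name is random, so it does not depend on the CSAR's
    # name or contents and parallel runs do not collide.
    with tempfile.TemporaryDirectory(prefix="opera-csar-") as tmp:
        with zipfile.ZipFile(csar_path) as archive:
            archive.extractall(tmp)
        yield Path(tmp)
    # Leaving the with block removes the directory, on success or failure.
```

A command would wrap its work in `with extracted_csar("test.csar") as csar_dir:` and never store the location. Note that this variant always deletes the directory; keeping it around on failure, as suggested earlier in the thread, would need tempfile.mkdtemp and an explicit cleanup instead.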

cankarm commented 4 years ago

Then the whole debate about where to unpack is pointless if you clean it up each time after use. I still think that at some point we will have application update rollouts, updates to small pieces of software, versioned CSARs, etc., where it would be nice to have the CSAR that produced the resulting deployment set in stone. Packed or unpacked, it does not matter at all.

matejart commented 4 years ago

I was one of the initial proponents of the init command, and I believe I provided ample arguments in favour of using it. Considering that we've already gone back and forth on this topic ad nauseam, your use cases must be different from the ones I proposed, so please feel free to adjust as needed.

dradX commented 4 years ago

@sstanovnik I agree with the suggested approach, and I sense that @cankarm is addressing another important issue that, in my opinion, has not been addressed yet:

1.a CSAR identification and blueprint snapshots (versioning) of the deployed instances
1.b Deployment instance identification (CSAR+inputs).

Problem

Consider using the same CSAR with a different set of inputs (my_inputs.yaml and my_inputs2.yaml with different content) to provide a new deployment instance.
Consider having multiple users (or the same user doing the deployment in two terminals) deploying the same (CSAR+inputs).
Consider updating the deployed application (CSAR - blueprint) with a new version of blueprint (CSAR) and inputs (@cankarm - CSAR "set in stone") (reconfiguration)

Instance identification is needed in all of these cases.

Assumptions - inputs are never stored in zipped CSAR.

Solution proposal

Maybe we can solve all of these issues by using MD5 file hashes to uniquely identify a zipped/unzipped CSAR and by exposing the internal .root directory for instances deployed through opera deploy - currently .opera. By exposing this directory as an ENV var, ROOT_INSTANCES for example, we can probably provide this kind of functionality even for (API/SaaS) calls by setting the same root directory. Additionally, opera can save the deployed CSAR snapshot (in zip format) along with the other files/directories currently created in .opera, thus giving the user a complete retrospective on the deployed instance.

Process

user runs: opera deploy -i my-inputs.yaml test.csar
- opera extracts the CSAR test.csar to an internal temporary directory under .tmp, named with a calculated UUID - let's call it .tmp/UUID
- opera copies my-inputs.yaml to .tmp/UUID
- opera calculates the hash of the directory contents -> DirHash (sample here) - we identify DirHash as a unique deployment instance ID (CSAR+inputs)
- opera uses DirHash in .root to set up a unique directory name and uses this internally as --instance-path
- opera checks if the deployment .root/DirHash already exists - returning an error on directory creation if it exists (unique deployment instance)
- opera creates .root/DirHash successfully and then deploys the blueprint stored in .tmp/UUID using my-inputs.yaml
- opera creates another file deployment-id (along with the current set of files/directories created on deploy, e.g. instances, root, inputs) - deployment-id stores DirHash so the user can retrieve the id of the instance after deployment.
- opera creates another file CSAR (along with the current set of files and directories created on deploy, e.g. instances, root, inputs) - CSAR is the snapshot of the deployed blueprint in zipped format using .tmp/UUID as root zip directory
- opera removes the temporary directory .tmp/UUID
- opera creates .opera symlink in the $PWD to link .root/DirHash directory
- user can access the data in .opera as in the current implementation through the symlink.
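The DirHash step in the process above could be sketched roughly as follows. This is only an illustration: the function name dir_hash is hypothetical, the way paths and contents are combined is one possible choice, and MD5 is the default solely because the proposal mentions it (any algorithm known to hashlib can be passed instead).

```python
import hashlib
from pathlib import Path


def dir_hash(root, algorithm="md5"):
    """Hash the relative paths and contents of all files under root.

    Hypothetical helper for illustration only; not part of opera.
    """
    root = Path(root)
    digest = hashlib.new(algorithm)
    # Sort by relative POSIX path so the result does not depend on the
    # order in which the file system lists entries.
    files = sorted(
        (p.relative_to(root).as_posix(), p)
        for p in root.rglob("*")
        if p.is_file()
    )
    for rel, path in files:
        data = path.read_bytes()
        name = rel.encode("utf-8")
        # Length prefixes keep "name + content" boundaries unambiguous.
        digest.update(len(name).to_bytes(8, "big"))
        digest.update(name)
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()
```

Because the inputs file is copied into the same directory before hashing, the same CSAR with different inputs yields a different DirHash, while identical CSAR+inputs pairs map to the same ID. Empty directories and file permissions are ignored in this sketch.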

doing things manually and implicitly - the process is practically the same when we deploy from unzipped CSAR

unzip test.csar
opera deploy -i my-inputs.yml

since the CSAR file is not provided we use current directory $PWD for calculating DirHash
opera uses DirHash in .root to set up a unique directory name and uses this internally as --instance-path
opera checks if the deployment .root/DirHash already exists - returning an error on directory creation if it exists (unique deployment instance)
opera creates .root/DirHash successfully and then deploys the blueprint stored in $PWD using my-inputs.yaml
opera creates another file deployment-id (along with the current set of files and directories created on deploy, e.g. instances, root, inputs) - deployment-id stores DirHash so the user can retrieve the id of the instance after deployment.
opera creates another file CSAR (along with the current set of files and directories created on deploy, e.g. instances, root, inputs) - CSAR is the snapshot of the deployed blueprint in zipped format using $PWD as root zip directory.
opera creates .opera symlink in the $PWD to link .root/DirHash directory
user can access the data in .opera as in the current implementation through the symlink.
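The bookkeeping steps shared by both variants (unique instance directory, deployment-id file, .opera symlink) might look like the sketch below. All names are hypothetical and not part of opera; the sketch assumes a POSIX system where creating symlinks needs no special privileges, and it takes the instances root as a parameter rather than reading the proposed ROOT_INSTANCES variable.

```python
from pathlib import Path


def reserve_instance(root_instances, deployment_id, workdir):
    """Create <root_instances>/<deployment_id> and link <workdir>/.opera to it.

    Hypothetical helper for illustration only; not part of opera.
    """
    instance_dir = Path(root_instances) / deployment_id
    # exist_ok=False raises FileExistsError if this CSAR+inputs pair
    # already has an instance directory, i.e. it was deployed before.
    instance_dir.mkdir(parents=True, exist_ok=False)
    # Store the ID so the user can look it up after deployment.
    (instance_dir / "deployment-id").write_text(deployment_id + "\n")
    link = Path(workdir) / ".opera"
    if link.is_symlink():
        link.unlink()  # replace a link left by an earlier deployment
    link.symlink_to(instance_dir.resolve(), target_is_directory=True)
    return instance_dir
```

Relying on the failure of directory creation, rather than checking for existence first, avoids a race when two terminals deploy the same CSAR+inputs at once. If .opera already exists as a real directory, as in the current implementation, symlink_to fails, so a migration path would still have to be decided.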

PS: I am aware that using file hashes can be a bit problematic (empty spaces, special characters (CR/LF/..), etc.) but I still think that the pros outweigh the cons of such a solution.

With this approach we:

add CSAR snapshot to the deployed instance folder
can check if the concrete instance has been already deployed (DirHash being unique ID)
can always reference the deployed instance by its unique ID for possible subsequent operations (for instance reconfiguration)

anzoman commented 4 years ago

@dradX many thanks for your looong comment. I think that we will go step-by-step in the direction you proposed.

I will dare to rekindle this issue with the #123 PR, which will mark opera init as deprecated (this way we won't have to remove it too soon, in case some use case occurs that proves opera init useful) and will allow the deployment of compressed CSARs with opera deploy.

sstanovnik commented 3 years ago

Closed with #123.

xlab-si / xopera-opera