crim-ca / weaver

Weaver: Workflow Execution Management Service (EMS); Application, Deployment and Execution Service (ADES); OGC API - Processes; WPS; CWL Application Package
https://pavics-weaver.readthedocs.io
Apache License 2.0

Weaver incompatible with PAVICS Autodeploy mechanism #508

Closed tlvu closed 1 year ago

tlvu commented 1 year ago

Describe the bug

The Autodeploy runs in its own Docker image. The runtime environment is meant to be kept to the bare minimum.

The Weaver component's post-docker-compose-up script needs curl, which is not available in the Autodeploy environment.

An easy solution is to separate the various curl steps out to another script and volume mount that script inside the Weaver container, then docker exec into the Weaver container to execute the various curl commands.

Basically, reproduce exactly what has already been done for celery-healthcheck.

Here celery-healthcheck is volume-mounted into the Weaver container: https://github.com/bird-house/birdhouse-deploy/blob/6bdb8bff9075415f86f8c9ab4e847808dcad0f07/birdhouse/components/weaver/docker-compose-extra.yml#L68

Then docker exec is used in post-docker-compose-up to run celery-healthcheck inside the Weaver container: https://github.com/bird-house/birdhouse-deploy/blob/6bdb8bff9075415f86f8c9ab4e847808dcad0f07/birdhouse/components/weaver/post-docker-compose-up#L294-L297
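The mount-then-exec pattern described above can be sketched roughly as follows. The container name `weaver`, the in-container script path `/weaver-post-requests`, and the `DOCKER_BIN` override are hypothetical names chosen for illustration, not values taken from birdhouse-deploy. The sketch also assumes the container image provides every tool the mounted script calls.

```shell
# Sketch only: run a script that was volume-mounted into a running container.
# WEAVER_CONTAINER and WEAVER_POST_SCRIPT are hypothetical names.
WEAVER_CONTAINER="${WEAVER_CONTAINER:-weaver}"
WEAVER_POST_SCRIPT="${WEAVER_POST_SCRIPT:-/weaver-post-requests}"

run_in_weaver() {
    # DOCKER_BIN can be overridden (e.g. with "echo") for a dry run.
    "${DOCKER_BIN:-docker}" exec "${WEAVER_CONTAINER}" "$@"
}

# Usage (requires the container to be running):
#   run_in_weaver "${WEAVER_POST_SCRIPT}"
```

Setting `DOCKER_BIN=echo` prints the command instead of running it, which makes it possible to check the call without a Docker daemon.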

FYI @huard

github-actions[bot] commented 1 year ago

Thanks for submitting an issue. Make sure you have checked through existing/resolved issues to avoid duplicates. Also, make sure you provide enough details for us to be able to replicate and understand the problem.

fmigneault commented 1 year ago

Where is the autodeploy defined? Is there any limitation to installing curl to fulfill the missing dependency? It seems like curl is used in other deployment strategies (e.g., https://github.com/bird-house/birdhouse-deploy/blob/6bdb8bff9075415f86f8c9ab4e847808dcad0f07/birdhouse/vagrant-utils/install-docker.sh#L2). Why should the environment be different in autodeploy's case?

I don't think it is the role of Weaver to install curl to fulfill the requirement of the platform.

tlvu commented 1 year ago

I understand your thinking: curl is a simple dependency and should have been available.

However, PAVICS is meant to be pluggable and extensible. Any external components in any external repos can hook into the platform.

So the platform cannot just keep adding new dependencies for any pre/post scripts for any possible extra components.

Each component has to bring along all the runtime it needs.

It's like Jenkins. Each job brings along its own runtime environment, so Jenkins can run any job without needing to be prepared in advance.

tlvu commented 1 year ago

Thinking about this, maybe you don't need to add curl to Weaver. You can just use any docker image providing curl and run the portion needing curl in there.

In the future if you need any new tools, you do not need to touch the Autodeploy env and you do not even need to install it on the docker host running PAVICS. You bring along your own docker env for all the tools you need.

It's like how Weaver needs MongoDB, but you do not need the docker host to have MongoDB physically installed. You bring your own MongoDB and you even have control over its version.

fmigneault commented 1 year ago

> Each component has to bring along all the runtime it needs.

That's the thing though: curl is not a requirement of Weaver. It is only used by the post script part of birdhouse. It makes even less sense for Weaver to install the dependencies of every possible parent stack that wants to make use of it. Even if we mounted post-docker-compose-up in Weaver's Docker container, you would get the same error, as curl is not installed there (and in this case specifically, on purpose, to avoid tampering with the image contents).

❯ docker run -ti pavics/weaver:latest-manager curl
docker: Error response from daemon: failed to create shim: OCI runtime create failed: runc create failed: unable to start container process: exec: "curl": executable file not found in $PATH: unknown.

> Thinking about this, maybe you don't need to add curl to Weaver. You can just use any docker image providing curl and run the portion needing curl in there. In the future if you need any new tools, you do not need to touch the Autodeploy env and you do not even need to install it on the docker host running PAVICS. You bring along your own docker env for all the tools you need.

I like the idea. Maybe simply adding docker run --pull missing curlimages/curl curl [...] instead of each curl [...] call would be enough to patch the problem. Maybe some additional mappings will be needed because I'm not sure if the targeted endpoints will be visible from within that container.
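A minimal sketch of that substitution, written as a shell function that could stand in for each host-side `curl [...]` call. The function name `curl_docker` and the `CURL_DOCKER_NETWORK` and `DOCKER_BIN` variables are hypothetical. Whether a network option is actually needed depends on how the targeted endpoints are exposed, which this sketch does not try to decide.

```shell
# Sketch only: run curl from the curlimages/curl image instead of the host.
# CURL_DOCKER_NETWORK is a hypothetical variable: set it to a Docker network
# name if the target endpoints are only reachable from inside that network.
curl_docker() {
    if [ -n "${CURL_DOCKER_NETWORK:-}" ]; then
        "${DOCKER_BIN:-docker}" run --rm --pull missing \
            --network "${CURL_DOCKER_NETWORK}" curlimages/curl curl "$@"
    else
        "${DOCKER_BIN:-docker}" run --rm --pull missing curlimages/curl curl "$@"
    fi
}

# Usage: replace `curl [...]` with `curl_docker [...]`, keeping the arguments.
```

Setting `DOCKER_BIN=echo` turns the function into a dry run that only prints the `docker run` arguments it would use.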

I am honestly surprised that the machine running the autodeploy does not have curl. What does it use instead to fetch remote changes to know what to deploy? Like you mentioned, I employed curl because I assumed it was common enough, but we could replace it with Python using requests, urllib, or any other "ping" utility that can do basic HTTP requests.