CLARIAH / clariah-plus

This is the project planning repository for the CLARIAH-PLUS project. It groups all technical documents and discussions pertaining to CLARIAH-PLUS in a central place and should facilitate findability, transparency and project planning, for the project as a whole.

Comments on CLaaS architecture #3

Closed ddeboer closed 2 years ago

ddeboer commented 3 years ago

@proycon Thanks for fec78b62668caa24c97aadd0b25832ef686afece. Some things that come to mind when looking at your architecture overview:

Deployed docker containers; or container registries storing the container images. (blue boxes)

In this stage, the containers haven’t yet been deployed. Do you mean built Docker containers instead? Let’s make this a bit clearer: software components (light green boxes) must push their build output as one or more Docker containers into a container registry. Merely adding a Dockerfile to software component source code is not sufficient, because CLaaS will not build the containers.
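
As a rough illustration of this rule, here is a minimal Python sketch. It is not part of CLaaS: the `Component` class, its fields, the `is_deployable` helper and the registry address are all hypothetical, made up only to show that a Dockerfile on its own would not count, while an image pushed to a registry would.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """Hypothetical descriptor of a software component (a light green box)."""

    name: str
    # Whether the source repository ships a Dockerfile.
    has_dockerfile: bool = False
    # References to images that were already built and pushed to a registry,
    # e.g. "registry.example.org/team/tool:1.0" (a made-up address).
    published_images: list[str] = field(default_factory=list)


def is_deployable(component: Component) -> bool:
    """Return True only if at least one built image has been published.

    A Dockerfile alone is deliberately ignored here, mirroring the point
    that the platform would not build containers itself.
    """
    return len(component.published_images) > 0


if __name__ == "__main__":
    only_dockerfile = Component("tool-a", has_dockerfile=True)
    pushed = Component(
        "tool-b",
        has_dockerfile=True,
        published_images=["registry.example.org/team/tool-b:1.0"],
    )
    print(only_dockerfile.name, is_deployable(only_dockerfile))
    print(pushed.name, is_deployable(pushed))
```

The check looks only at `published_images`, so a component's build pipeline stays entirely its own responsibility.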

An application deployment configuration (Infrastructure as Code)

This may be stored in the same repository that holds the software component itself.

Data Store (yellow cylinder)

Do you mean a data store for managing the infrastructure or a data store that is used by the applications, such as a database?

App Interfaces

They now only run containers. Is that sufficient for all use cases, particularly HPC? Using a proper infrastructure orchestration tool, CLaaS could both deploy containers (preferred) and provision VMs (if necessary).

proycon commented 3 years ago

Thanks for the feedback!

In this stage, the containers haven’t yet been deployed. Do you mean built Docker containers instead? Let’s make this a bit clearer: software components (light green boxes) must push their build output as one or more Docker containers into a container registry. Merely adding a Dockerfile to software component source code is not sufficient, because CLaaS will not build the containers.

Yes, this indeed refers to built containers, as obtained from the registries. The actual Dockerfile is provided in the green layer at the bottom.

This may be stored in the same repository that holds the software component itself.

Agreed

Do you mean a data store for managing the infrastructure or a data store that is used by the applications, such as a database?

The latter: data stores such as databases, or whatever mounted data volume the containers have at their disposal.

They now only run containers. Is that sufficient for all use cases, particularly HPC? Using a proper infrastructure orchestration tool, CLaaS could both deploy containers (preferred) and provision VMs (if necessary).

I can't really judge that well. For certain high-performance cluster use cases I guess you'd want to stick to the lower levels of running the actual software in some distributed fashion over the cluster, without unnecessary service overhead, but even then containers will provide a good solution (although there you'd want Singularity rather than Docker, I'd say). If we can limit ourselves to at least the simplest common solution (containerisation using Docker), then that might make things easier for developers to adopt than if we try to target too much (like what I did in LaMachine).

proycon commented 3 years ago

Some further comments that were raised during the meeting yesterday:

(please add anything I forgot to mention or might have misunderstood)

proycon commented 2 years ago

I'm closing this now, since the discussion of this particular schema belongs to an earlier phase and is less relevant now.