Best practice proposal: A CNF's containers should have one process category (or type)

taylor commented 1 year ago

A microservice should have only one process (or set of parent/child processes) that is managed by a non-home-grown supervisor or orchestrator. The microservice should not spawn other process types (e.g. executables) as a way to contribute to the workload, but rather should interact with other processes through a microservice API. (From the CNF Test Suite.)
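A minimal sketch of what such a check could look like, assuming that a process "type" is approximated by the command name in `/proc/<pid>/comm`. This is not the CNF Test Suite's implementation; the function names and the definition of "type" are illustrative assumptions.

```python
from collections import Counter
from pathlib import Path


def read_process_names(proc_root: str = "/proc") -> list[str]:
    """Return the command name of every process visible under proc_root."""
    names = []
    for entry in Path(proc_root).iterdir():
        if entry.name.isdigit():  # numeric directories are PIDs
            try:
                names.append((entry / "comm").read_text().strip())
            except OSError:
                continue  # the process exited while we were scanning
    return names


def process_types(names: list[str]) -> Counter:
    """Group processes by command name (assumption: one name == one type)."""
    return Counter(names)


def has_single_process_type(names: list[str]) -> bool:
    """True when every process in the container shares one command name."""
    return len(process_types(names)) == 1
```

Under this definition, several workers of the same executable count as one type, while a supervisor plus the program it runs count as two. That ambiguity is part of what the thread below argues about.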

WIP / draft proposal

Reference(s):

taylor commented 1 year ago

Feedback requested on this as a best practice @iawells @electrocucaracha @wavell @denverwilliams @fkautz @michaelspedersen @CsatariGergely @ildikov

taylor commented 1 year ago

@tomkivlin

tomkivlin commented 1 year ago

Link to rationale in cnf-test-suite: https://github.com/cncf/cnf-testsuite/blob/main/RATIONALE.md#to-check-if-the-cnf-has-multiple-process-types-within-one-container-single_process_type

CsatariGergely commented 1 year ago

I think this should not be a best practice, and it should not be an essential test in CNF Conformance. Complex applications, such as telco ones, have several situations where a microservice needs more than one process, because a lot of networking and storage work happens in a time-constrained environment. Organizing each process type into a separate container leads to scaling and dimensioning issues and wastes resources. Scaling processes is different from scaling containers: processes are lifecycle-managed and monitored by the Linux kernel, while containers are handled by the container runtime, including some feedback to the controller via the kubelet, which makes containers less scalable than processes. When each process type is in a separate container, we need to dimension every one of them separately, which eventually leads to small resource requests for containers and variability in container performance. To overcome this, resource requests are increased, and the utilization and performance of the whole CNF will be low. Containers are simply heavier than processes, so this practice wastes resources. All in all, I do not see this best practice as justified, and I think we should not have it. I propose not making this an essential check in CNCF CNF Certification (https://github.com/cncf/cnf-testsuite/pull/1715).

electrocucaracha commented 1 year ago

@CsatariGergely You made some valid points about this proposal, and some of them are also captured in the s6 README doc. As you mentioned, a microservice can be composed of one or multiple processes/containers; there is no architectural enforcement of using only one container or one process per container.

Regarding running containers, Docker monitors the application process running as PID 1 and uses its signals to report events and to know when a container has stopped. Even running a single process properly can be non-trivial in some cases, especially when it comes to graceful termination.
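To make the graceful-termination point concrete, here is a minimal sketch of a single-process service that reacts to the SIGTERM a container runtime sends on stop (followed by SIGKILL once the grace period expires). The class and function names are illustrative assumptions, not taken from any project mentioned here.

```python
import signal
import threading


class GracefulShutdown:
    """Records a termination request so the main loop can exit cleanly."""

    def __init__(self) -> None:
        self.requested = threading.Event()

    def handle(self, signum, frame) -> None:
        # Only set a flag here; the main loop performs the actual cleanup.
        self.requested.set()

    def install(self) -> None:
        # Must be called from the main thread of the process.
        for sig in (signal.SIGTERM, signal.SIGINT):
            signal.signal(sig, self.handle)


def serve(shutdown: GracefulShutdown, do_work, poll_seconds: float = 1.0) -> int:
    """Call do_work until shutdown is requested; return the number of calls."""
    iterations = 0
    while not shutdown.requested.is_set():
        do_work()
        iterations += 1
        # Sleep between iterations, but wake up at once on a shutdown request.
        shutdown.requested.wait(poll_seconds)
    return iterations
```

A real service would call `install()` once at startup and flush or close its resources after `serve()` returns. If the application runs as PID 1 and installs no handler, SIGTERM is ignored and the container is only stopped by the later SIGKILL.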

So maybe the recommendation here is to run as many processes as you need (preferably one), as long as the ENTRYPOINT is a properly managed init process. This can also be done with a process manager such as supervisord, runit, monit, tini/dumb-init, or s6.
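A rough sketch of the two duties such an init process takes on, forwarding signals to the main child and reaping exited children. This is an illustration under simplifying assumptions (Unix only, Python 3.9+), not how tini, dumb-init, or s6 are implemented; `run_as_init` and `main` are hypothetical names.

```python
import os
import signal
import sys
import threading


def run_as_init(argv: list[str]) -> int:
    """Start argv as the main child, forward signals, and reap children."""
    main_pid = os.spawnvp(os.P_NOWAIT, argv[0], argv)

    def forward(signum, frame):
        try:
            os.kill(main_pid, signum)
        except ProcessLookupError:
            pass  # the main child has already exited

    previous = {}
    # Signal handlers can only be installed from the main thread.
    if threading.current_thread() is threading.main_thread():
        for sig in (signal.SIGTERM, signal.SIGINT):
            previous[sig] = signal.signal(sig, forward)
    try:
        while True:
            # Reaps any child, including orphans re-parented to PID 1.
            pid, status = os.waitpid(-1, 0)
            if pid == main_pid:
                return os.waitstatus_to_exitcode(status)
    finally:
        for sig, handler in previous.items():
            if handler is not None:
                signal.signal(sig, handler)


def main() -> None:
    # Intended use: ENTRYPOINT runs this script with the real command as args.
    sys.exit(run_as_init(sys.argv[1:]))
```

The sketch exits as soon as the main child exits and returns that child's exit code, so the container's exit status still reflects the application rather than the wrapper.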

AFAIK, the default Linux kernel scheduler used by containers is the Completely Fair Scheduler (CFS), so cgroup CPU limits under CFS are expressed as time slices rather than processor counts.
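A small worked example of the time-slice arithmetic, assuming the commonly used CFS period of 100 ms. The function name is illustrative; the quota it computes applies to the container's cgroup as a whole, so it is shared by every process running inside that container.

```python
CFS_PERIOD_US = 100_000  # assumed default CFS period: 100 ms, in microseconds


def cfs_quota_us(cpu_limit_millicores: int, period_us: int = CFS_PERIOD_US) -> int:
    """CPU time (in microseconds) the cgroup may consume per CFS period."""
    return cpu_limit_millicores * period_us // 1000
```

For instance, a limit of 500 millicores gives `cfs_quota_us(500) == 50_000`: the container's processes together may run for 50 ms in every 100 ms window before being throttled, regardless of how many cores the node has.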

Note: I found some places that refer to this as the Single Concern principle.

PS: One advantage not mentioned yet of running a single process per container is that its logs are easier to identify.

tomkivlin commented 1 year ago

@CsatariGergely do you have any thoughts on the above?

lixuna commented 1 year ago

FYI - [Documentation] Single process per container (microservice) rational paper #1721 https://github.com/cncf/cnf-testsuite/issues/1721

tomkivlin commented 1 year ago

Focus on it being a recommendation, about splitting up process types - i.e. multiple processes of a type in the same pod/container might be fine.

sparso commented 1 year ago

Matrixx developers agree with the proposal to focus on this being a recommendation about splitting up process types, as mentioned above, and looking for a single ENTRYPOINT with a properly managed init process. We agree that the preference is to run a single process if possible, but in some cases we do need supporting processes that run together in a container.

The proposed change would ensure this test is checking a container can be managed properly in a Kubernetes environment, rather than enforcing an architectural principle, which we think is more useful.

lixuna commented 1 year ago

FYI - More documentation available at https://www.infoq.com/articles/cloud-native-network-functions-concern/?

taylor commented 1 year ago

A related set of new tests https://github.com/cncf/cnf-testsuite/issues/1795 will reduce the weight of the single process type test after they are introduced.

iawells commented 1 year ago

Can someone please explain the purpose of this to me?

Let's start with: can you define 'process type' for me? Because I'm confused here - you agreed that running supervisord to run processes was sensible, and I struggle to see how supervisord and the processes it runs can be considered the same type. And why is it not best practice - that is, why is it a worse practice - to run processes of a different 'type'? What is the role of the person that it's helping?

You've also ruled out spawning and orphaning processes per the testsuite (which is a pretty natural thing to do), spawning subprocesses that are of a different 'type' (so I can't run a script that runs 'ip', then?) and the list goes on.

As best practices go, this one (as described in this discussion; I may need to do more background reading) comes across as a bit limited. It seems like this looks like a good idea, but I'm not sure anyone has articulated why it's a good idea, in what circumstances it's a good idea, etc. To then add a test for conformance with it without getting to the bottom of this detail seems like you're racing to the finish line without being certain you're even running in the right direction. Where should I be looking to find out why you want it?

taylor commented 1 year ago

@iawells please read https://www.infoq.com/articles/cloud-native-network-functions-concern/

TODO: Pull more content directly into this proposal (can go to an external draft document).

iawells commented 1 year ago

@taylor Indeed, if that is the basis for the argument, it needs to be in here, but I don't think it makes a strong argument based on the counterpoints I've made here (and I saw it had its own comments on the article, too).

Seems like the proposal's hypothesis is there should be a rule something like 'one process per container', and accepts that that is not quite the rule we're looking for; but there are lots of processes (including within k8s itself, given golang favours running external processes for CNIs and such) that don't hold up to a standard like that, and the language used ('process type') did not clear up the distinction Watson was trying to make.

You'd also have to argue more clearly why running multiple processes (of whatever sort) was a 'worse' practice, because a 'best' practice needs to be better than the alternatives.

taylor commented 1 year ago

This is not about the number of processes or threads in a container. This is about having one concern per container. If the "application" with a single concern uses multiple processes, multiple threads, or a single process, it would all be compatible with this best practice. This is a good summary:

“It’s best practice to separate areas of concern by using one service per container. That service may fork into multiple processes (for example, Apache web server starts multiple worker processes). It’s ok to have multiple processes, but to get the most benefit out of Docker, avoid one container being responsible for multiple aspects of your overall application.” https://docs.docker.com/config/containers/multi-service_container/

taylor commented 1 year ago

@iawells let's look at your example:

spawning subprocesses that are of a different 'type' (so I can't run a script that runs 'ip', then?)

If the command "ip" is related to the single concern of the parent process, this seems fine. I would consider the command ip to be more similar to a library call or something. Maybe a VPP-based IPsec container runs ip to gather information as part of its auto-configuration.

Again, the issue is not multiple processes; it is about the container having a single concern.

We are talking about microservice best practices, and easier orchestration is one primary benefit of a single concern, so it is implicit that the concern is limited in scope, as opposed to a vague, big concern such as "network service endpoint" with lots of component parts. An AMF, for instance, has many components that can be broken into multiple containers and Pods.

For that IPsec container using "ip" to gather some information: if it were to start up a MySQL database on the container itself to store data (possibly related to connections), the MySQL database would be a separate concern, and it is recommended to run it elsewhere and connect to it.
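As a minimal sketch of "run it elsewhere and connect to it", the container can resolve the database endpoint from its environment instead of starting a local MySQL process. The variable names `DB_HOST` and `DB_PORT` and the helper below are hypothetical; 3306 is MySQL's default port.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class DatabaseEndpoint:
    host: str
    port: int


def database_endpoint(env=None) -> DatabaseEndpoint:
    """Resolve the external database from environment variables (assumed names)."""
    env = os.environ if env is None else env
    host = env.get("DB_HOST")
    if not host:
        # Fail fast instead of falling back to a database inside this container.
        raise RuntimeError("DB_HOST is not set; the database runs as a separate service")
    return DatabaseEndpoint(host, int(env.get("DB_PORT", "3306")))
```

With this shape, the database can be scaled, upgraded, and restarted by the orchestrator independently of the IPsec container.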

An AMF could be broken into its concerns, and those could be implemented as microservices that leverage system orchestration and scaling, expose shared functionality to other components, and can be maintained independently as smaller components.

taylor commented 1 year ago

This proposal is about the Single concern principle, SoC (separation of concerns), and Single-responsibility principle as they relate to microservices.

Single Concern principle:

In many ways, the single concern principle is like the single responsibility principle from SOLID, which advises that a class must have only one responsibility. The motivation behind the single responsibility principle is that each responsibility is an axis of change, and a class has only one reason to change.

https://www.ibm.com/cloud/architecture/architecture/practices/cloud-native-principles

Single concern microservice

Having a single concern means that a microservice should do one thing and one thing only. For example, if the microservice is intended to support authentication, it should do authentication only. This means that its interface should expose only access points that are relevant to authentication. And internally, the microservice should have authentication behavior only. For example, there should be no side behavior such as providing employee contact information in the authentication response.

Having a single concern makes the microservice easier to maintain and scale.

https://developers.redhat.com/articles/2022/01/11/5-design-principles-microservices#five_design_principles_for_microservices

TODO:

consider changing the wording for this best practice proposal to focus on one concern for each container represented in its processes.

process type -> process category -> processes and/or threads should be related to a single concern

taylor commented 1 year ago

As a step towards a fully autonomous network and achieving an intent-based management of a network, its architecture must be prepared by raising the level of abstraction in management with e.g., strong separation of concerns.

https://www.ericsson.com/en/future-technologies/architecture/network-capabilities

Programmable networks are a big topic for the networking domain and the telecom sector right now, and automation and distributed management of those networks are a core part of that. Separation of concerns is critical to this.

lixuna commented 1 year ago

Can this issue be closed?

tomkivlin commented 1 year ago

Yes - completed by #267

lfn-cnti / bestpractices

Best practice proposal: A CNF's containers should have one process category (or type) #242