spring-cloud / spring-cloud-dataflow

A microservices-based Streaming and Batch data processing in Cloud Foundry and Kubernetes
https://dataflow.spring.io
Apache License 2.0
1.11k stars 580 forks

change task namespace deployment dynamically in Kubernetes without restarting scdf #4880

Open mcisnerosb57 opened 2 years ago

mcisnerosb57 commented 2 years ago

We are using SCDF to deploy our batch services (tasks). As far as we can tell, the only way to add a new OpenShift namespace to Spring Cloud Data Flow is to declare it under the SCDF platform accounts in application.yml, as follows:

spring:
  cloud:
    dataflow:
      task:
        platform:
          kubernetes:
            accounts:
              default:
                namespace: sanes-nhb-serverless-dev
                entryPointStyle: exec
                limits:
                  memory: 1024Mi
                deploymentServiceAccountName: scdf-sa

The problem for us is that we have to restart the pod every time we add a new namespace to SCDF in which we want to deploy a task. This is not feasible for our model.

We propose an enhancement in which, in addition to the default account values, the namespace can be overridden dynamically when launching a task, using the existing property "deployer.test.kubernetes.namespace". (We do not know whether this property actually does anything: we have not seen any change when using it, and the task is always deployed with the default account.) Is it possible to do this? Thanks in advance.
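To make the proposal concrete, here is a minimal sketch of the requested fallback behavior: use the launch-time property when it is set, otherwise use the namespace from the account. `NamespaceResolver` and its methods are hypothetical names used only for illustration. This is not SCDF code and does not describe what SCDF currently does with this property.

```java
import java.util.Map;

// Hypothetical helper, not part of SCDF: picks the namespace for one task launch.
final class NamespaceResolver {

    // Builds the key quoted above, e.g. deployer.test.kubernetes.namespace for app "test".
    static String propertyKey(String appName) {
        return "deployer." + appName + ".kubernetes.namespace";
    }

    // Returns the launch-time override if present and non-blank, else the account default.
    static String resolve(String appName, Map<String, String> launchProperties, String accountDefault) {
        String override = launchProperties.get(propertyKey(appName));
        if (override == null || override.isBlank()) {
            return accountDefault;
        }
        return override.trim();
    }

    public static void main(String[] args) {
        String accountDefault = "sanes-nhb-serverless-dev";
        Map<String, String> props = Map.of("deployer.test.kubernetes.namespace", "another-namespace");
        System.out.println(resolve("test", props, accountDefault));    // override wins
        System.out.println(resolve("test", Map.of(), accountDefault)); // falls back to the account
    }
}
```

The point of the sketch is only the precedence rule. With it, a task launched without the property keeps using the default account, so existing launches would not change.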

mcisnerosb57 commented 2 years ago

@onobc @markpollack What we have verified is that, in large projects with multiple OpenShift namespaces, adding an account for a new namespace puts at risk the launch of tasks in the other namespaces, which can number in the hundreds. So we need to be able to add accounts without a pod restart.

markpollack commented 2 years ago

We will look into this, thanks for reporting.

cppwfs commented 2 years ago

Currently each platform has a TaskLauncher, and this TaskLauncher has an assigned KubernetesClient, which in turn is assigned a namespace. These are all configured at bean creation time. The solution would be to create a new KubernetesClient for each task launch instead of reusing the one created along with the TaskLauncher. That KubernetesClient could then be assigned the namespace that the user sets via deployment properties. This will require a change to the Kubernetes deployer.
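The difference between the two designs can be sketched as follows. This is an illustration under stated assumptions, not the deployer's actual code: `NamespacedClient`, `StaticTaskLauncher`, and `PerLaunchTaskLauncher` are hypothetical stand-ins for the real KubernetesClient and TaskLauncher types, and launching is reduced to returning a string.

```java
import java.util.function.Function;

// Hypothetical stand-in for a Kubernetes client that is bound to one namespace.
final class NamespacedClient {
    final String namespace;

    NamespacedClient(String namespace) {
        this.namespace = namespace;
    }

    // Pretends to launch a task and reports where it would run.
    String launch(String taskName) {
        return taskName + "@" + namespace;
    }
}

// Current shape: the client (and so the namespace) is fixed when the launcher is created.
final class StaticTaskLauncher {
    private final NamespacedClient client;

    StaticTaskLauncher(NamespacedClient client) {
        this.client = client;
    }

    String launch(String taskName) {
        return client.launch(taskName);
    }
}

// Proposed shape: a client is created for each launch, so the namespace can vary per request.
final class PerLaunchTaskLauncher {
    private final Function<String, NamespacedClient> clientFactory;
    private final String defaultNamespace;

    PerLaunchTaskLauncher(Function<String, NamespacedClient> clientFactory, String defaultNamespace) {
        this.clientFactory = clientFactory;
        this.defaultNamespace = defaultNamespace;
    }

    // requestedNamespace may be null, in which case the account default is used.
    String launch(String taskName, String requestedNamespace) {
        String namespace = requestedNamespace != null ? requestedNamespace : defaultNamespace;
        return clientFactory.apply(namespace).launch(taskName);
    }
}
```

For example, `new PerLaunchTaskLauncher(NamespacedClient::new, "ns-a").launch("task", "ns-b")` would target `ns-b`, while passing `null` keeps the default `ns-a`. A real implementation would also need to consider the cost of creating clients and whether to cache them per namespace.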

cmiquelg commented 2 years ago

Hi @cppwfs, is this feature being built? I'm also interested in this SCDF behavior and would like to work on it if you are not already doing so. Thanks!

markpollack commented 2 years ago

Hi, we are currently triaging the priority backlog for the next release and haven't started this. I believe I understand the issue as described, but I am not yet sure what the best design would be. We would probably have to move away from statically configured accounts and create entries in the database, which would also require the usual CRUD access (with protections for delete) and a UI screen to manage the accounts. If we do it for tasks, we will also need to do it for streams. Open to your suggestions and also your contributions!
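As a rough sketch of what runtime-managed accounts could look like, the registry below supports create, read, update, and delete, and refuses to delete one protected account. Everything here is an assumption made for illustration: the class name `PlatformAccountRegistry`, the in-memory map standing in for a database table, and the specific rule that only the default account is protected.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry of platform accounts that can change without a restart.
// The map stands in for a database table keyed by account name.
final class PlatformAccountRegistry {
    private final Map<String, String> namespaceByAccount = new ConcurrentHashMap<>();
    private final String protectedAccount;

    PlatformAccountRegistry(String protectedAccount, String protectedNamespace) {
        this.protectedAccount = protectedAccount;
        namespaceByAccount.put(protectedAccount, protectedNamespace);
    }

    // Create or update an account.
    void save(String account, String namespace) {
        namespaceByAccount.put(account, namespace);
    }

    // Read an account's namespace, if the account exists.
    Optional<String> find(String account) {
        return Optional.ofNullable(namespaceByAccount.get(account));
    }

    // Delete an account; the protected account can never be removed (assumed rule).
    boolean delete(String account) {
        if (protectedAccount.equals(account)) {
            throw new IllegalStateException("Account '" + account + "' is protected");
        }
        return namespaceByAccount.remove(account) != null;
    }
}
```

Other parts of the task API that validate a namespace against the known accounts could consult such a registry instead of the static configuration. A real delete protection would probably also need to check for tasks still associated with the account.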

cmiquelg commented 2 years ago

Hi,

I totally agree with your proposed solution and I will start working on it, at least the backend part. I lack frontend knowledge and feel I would not do a great job there.

Thanks!

markpollack commented 2 years ago

Can you clarify this comment:

we have the problem that adding an account from a new namespace puts at risk the launch of the tasks of the other namespaces

How does adding a namespace to the account put the launch of other tasks at risk?

While we could simply add this property to the deployer, I believe there may be other areas where acceptable choices are compared against the list of namespaces across the accounts, so adding it 'on the fly' would not be a good solution when taking into account the whole lifecycle of the task and how other parts of the task API work.

cmiquelg commented 1 year ago

Hi @markpollack

With the current SCDF behavior, when we want to add a new account to deploy a batch in a new namespace, we have to add it to the configuration and redeploy the pod, which means the SCDF service is unavailable during the restart. During that period, no batches can be launched. In a production environment, with a large number of batches being launched simultaneously, this can be a problem.

The solution we are working on right now, which we understood to be what you proposed, is:

I don't think I completely understand your point, since I think this solution meets all the requirements.

Thank you!

khaeghar commented 1 year ago

Hi @markpollack ,

We've created a new branch with the solution @cmiquelg proposed, following each bullet.

Here's the PR: https://github.com/spring-cloud/spring-cloud-dataflow/pull/5142

Anything else that needs to be done, please tell me :)