spring-cloud / spring-cloud-dataflow

A microservices-based Streaming and Batch data processing in Cloud Foundry and Kubernetes
https://dataflow.spring.io
Apache License 2.0

Create a separate dataflow database migration task application #5956

Open corneil opened 13 hours ago

corneil commented 13 hours ago

Problem description: When a release contains a large set of changes, the migrations may run longer than 1 minute, and Kubernetes may kill the application and relaunch it.

Solution description: Create a task application that can apply the migrations and, where possible, output the SQL for the migrations that need to be applied. For migrations that perform logic that cannot be output as SQL, the output will stop with a message indicating the need to run the task afterwards.
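A minimal sketch of the "output SQL until it is no longer possible" behaviour, in plain Java. The `SqlDryRun` class, the `Migration` descriptor, and the convention that a `null` SQL string marks a logic-only migration are all hypothetical names chosen for illustration; they are not part of Data Flow or any migration library.

```java
import java.util.List;

public class SqlDryRun {

    /** Hypothetical descriptor: sql is null when the migration runs logic that cannot be output as SQL. */
    public static final class Migration {
        final String name;
        final String sql;

        public Migration(String name, String sql) {
            this.name = name;
            this.sql = sql;
        }
    }

    /** Renders SQL for the outstanding migrations, stopping at the first one that has no SQL form. */
    public static String render(List<Migration> outstanding) {
        StringBuilder out = new StringBuilder();
        for (Migration m : outstanding) {
            if (m.sql == null) {
                // Later migrations may depend on this one, so nothing after it is emitted.
                out.append("-- STOP: ").append(m.name)
                   .append(" cannot be output as SQL; run the migration task to continue\n");
                break;
            }
            out.append("-- ").append(m.name).append('\n').append(m.sql).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Illustrative migration names and statements only.
        List<Migration> outstanding = List.of(
                new Migration("add_column", "ALTER TABLE example ADD COLUMN note VARCHAR(255);"),
                new Migration("backfill_with_logic", null),
                new Migration("add_index", "CREATE INDEX example_note_idx ON example (note);"));
        System.out.print(render(outstanding));
    }
}
```

Stopping at the first logic-only migration, rather than skipping it, keeps the emitted script safe to apply in order.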

We can also have the task list all needed migrations by name and let the user specify the target migration at which to stop, which will result in executing all the outstanding migrations up to and including the target.
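The target selection could look roughly like the following plain-Java sketch. `MigrationPlanner`, `planUpTo`, and the migration names are hypothetical; the sketch assumes the outstanding migrations are already available as an ordered list of names.

```java
import java.util.ArrayList;
import java.util.List;

public class MigrationPlanner {

    /** Returns the outstanding migrations up to and including the target, in their original order. */
    public static List<String> planUpTo(List<String> outstanding, String target) {
        int idx = outstanding.indexOf(target);
        if (idx < 0) {
            // The target is either unknown or not outstanding; refuse rather than guess.
            throw new IllegalArgumentException("Not an outstanding migration: " + target);
        }
        return new ArrayList<>(outstanding.subList(0, idx + 1));
    }

    public static void main(String[] args) {
        // Illustrative names only.
        List<String> outstanding = List.of("add_column", "split_table", "backfill");
        outstanding.forEach(name -> System.out.println("outstanding: " + name));
        System.out.println("plan: " + planUpTo(outstanding, "split_table"));
    }
}
```

Rejecting an unknown target, instead of falling back to applying everything, avoids running more migrations than the user asked for.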

This task application can be packaged as a container that can be launched as a job before a deployment of Spring Cloud Data Flow and Skipper.

We can include the migrations of both dataflow-server and skipper-server in the application or keep them as separate task applications.

cppwfs commented 11 hours ago

We also need to add instructions to the migration guide that mention the potential for a longer than normal startup because of schema migration. This would include the use of the task or a set of migration scripts for the supported databases.