Open functicons opened 5 years ago
I think it's good to separate FlinkCluster's spec.job into separate CRDs. If you create separate CRDs such as FlinkJob and BeamJob, and add controllers for each, it seems to be a much more extensible structure.
In particular, Beam support - #64, #174 - will increase complexity if the current FlinkCluster CRD & controller supports it. I hope this issue is addressed to keep the operator simple and extensible.
Previously, I thought about job CRD even for job clusters, but that means users have to deal with 2 CRDs, but I want to keep things simple, that's why I embedded jobSpec in the FlinkCluster CRD.
I agree to keep it simple. In the current implementation, handling flink job with the FlinkCluster spec seems simple. However, as time goes by, I'm a little concerned that the current structure will be more complex as the community wants more functionality like Beam support. The seamless job update I suggested would have been less expensive to add functionality if there was a separate FlinkJob. I don't think it's urgent to seperate FlinkJob from the FlinkCluster now.
However, I don't think two CRDs will be burdensome for users. Just like the k8s apps/deployment spec includes a pod template. If the FlinkJob spec includes the FlinkCluster spec, the user who wants the job cluster will only need to handle the FlinkJob CR when creating the job cluster. In this case, you will not have to link FlinkJob to FlinkCluster by creating FlinkJob and FlinkCluster respectively. For example:
apiVersion: flinkoperator.k8s.io/v1beta1
kind: FlinkJob
metadata:
name: flink-job
spec:
jarFile: ./examples/streaming/WordCount.jar
className: org.apache.flink.streaming.examples.wordcount.WordCount
flinkClusterTemplate:
image:
name: flink:1.8.1
jobManager:
taskManager:
replicas: 2
flinkProperties:
With job CRD, how do you model Beam job?
When there was a branch in the reconcile logic between the Flink job and the Beam job, I was concerned about the complexity of supporting one CRD. If a beam job is supported via FlinkRunner without a job server, it seem there is no need to separate the BeamJob CRD at this time. In the future, it may be necessary to separate FlinkJob CRDs from FlinkCluster if they become feature rich, but I don't think it is urgent as mentioned in the previous comments.
Currently after a session cluster is up and running, we have to submit jobs to the cluster through Flink's API endpoint, which usually means the user has to install Flink CLI on their local machine.
We could consider adding a new CRD and controller for job submission, after a job CR is created the controller gets notified and submit the job to the cluster automatically, it also polls the job status and update the CR. We might also migrate the current job submission to job cluster to the approach as well.