GoogleCloudPlatform / flink-on-k8s-operator

[DEPRECATED] Kubernetes operator for managing the lifecycle of Apache Flink and Beam applications.
Apache License 2.0

Improve submitting/tracking job and fix #379

Closed elanv closed 3 years ago

elanv commented 3 years ago

Changes

Resolves #294

elanv commented 3 years ago

I will add and fix some tests soon.

functicons commented 3 years ago

/gcbrun

elanv commented 3 years ago

Restored a CRD that was accidentally changed. Removed the start delay for the JobManager check from the submit script. The time difference between submitter and JobManager initialization is small, and the submitter itself can sometimes take longer, for example when downloading a large file.

functicons commented 3 years ago

/gcbrun

functicons commented 3 years ago

I just ran a test. The sample job finished successfully, but the job status in the CR was not quite right; it was still Pending:

Status:
  Components:
    Job:
      Id:     af4f6808f9c78597623a626e300c73f5
      Name:   flinkjobcluster-sample-job
      State:  Pending
  ...
  Current Revision:  flinkjobcluster-sample-5d96cb58dd-1
  Last Update Time:  2020-12-06T04:55:33Z
  Next Revision:     flinkjobcluster-sample-5d96cb58dd-1
  State:             Stopped

elanv commented 3 years ago

Thanks for catching what I missed. I fixed the issues related to finished jobs.

I will proceed with the remaining work on renaming and docs.

functicons commented 3 years ago

A unit test is failing. Could you fix it?

E1207 05:00:42.876848    3516 factory.go:35] Failed initializing volcano batch scheduler: invalid configuration: no configuration has been provided, try setting KUBERNETES_MASTER environment variable
--- FAIL: TestGetDesiredClusterState (0.01s)
    flinkcluster_converter_test.go:889: assertion failed: 
        --- desiredState.Job
        +++ expectedDesiredJob
          v1.Job{
                TypeMeta: v1.TypeMeta{},
                ObjectMeta: v1.ObjectMeta{
        -               Name:         "flinkjobcluster-sample-job-submitter",
        +               Name:         "flinkjobcluster-sample-job",
                        GenerateName: "",
                        Namespace:    "default",
                        ... // 13 identical fields
                },
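
The diff above shows the converter producing a Job named with a "-submitter" suffix while the test still expects the old name. A minimal sketch of how the two names relate, assuming hypothetical helpers (`jobName` and `submitterJobName` are illustrative names, not taken from the operator's code):

```go
package main

import "fmt"

// jobName is a hypothetical helper returning the name the test still expects.
func jobName(clusterName string) string {
	return clusterName + "-job"
}

// submitterJobName is a hypothetical helper returning the name
// the converter produced in the failing run.
func submitterJobName(clusterName string) string {
	return jobName(clusterName) + "-submitter"
}

func main() {
	cluster := "flinkjobcluster-sample"
	fmt.Println(jobName(cluster))          // flinkjobcluster-sample-job
	fmt.Println(submitterJobName(cluster)) // flinkjobcluster-sample-job-submitter
}
```

If the rename is intentional, the expected value in the test would need to use the same helper as the converter so the two cannot drift apart.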

functicons commented 3 years ago

I noticed another problem: the current job status shows x-job-submitter in Succeeded status. It is unclear whether the status is for the job submitter or the job. Maybe remove the Name field to make it less confusing?

Status:
    Components:
        Job:
            Id:     b2a57bdfdc5127ddbab05da9ec438168
            Name:   flinkjobcluster-sample-job-submitter
            State:  Succeeded

elanv commented 3 years ago

> I noticed another problem: the current job status shows x-job-submitter in Succeeded status. It is unclear whether the status is for the job submitter or the job. Maybe remove the Name field to make it less confusing?

That's right. I also think it would be better to remove it. Alternatively, we could keep the field, make it optional for later use, and leave it unset for now.
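
The "keep it but make it optional" variant could look roughly like the sketch below. `JobStatus` here is an illustrative stand-in, not the operator's actual type, and `render` is a hypothetical helper; the only real mechanism shown is the `omitempty` option of Go's `encoding/json` struct tags, which drops an empty field from the serialized output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// JobStatus is an illustrative stand-in for the job status type,
// not the operator's actual definition.
type JobStatus struct {
	ID    string `json:"id"`
	Name  string `json:"name,omitempty"` // optional: omitted when left unset
	State string `json:"state"`
}

// render is a hypothetical helper that serializes a status to JSON.
func render(s JobStatus) string {
	b, err := json.Marshal(s)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	// Name is deliberately not set, so it does not appear in the output.
	fmt.Println(render(JobStatus{ID: "b2a57bdfdc5127ddbab05da9ec438168", State: "Succeeded"}))
	// prints {"id":"b2a57bdfdc5127ddbab05da9ec438168","state":"Succeeded"}
}
```

With this shape the field stays in the API for later use, but a status written without a name no longer shows the submitter's name next to the job state.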

functicons commented 3 years ago

/gcbrun