pingcap / tiflow

This repo maintains DM (a data migration platform) and TiCDC (change data capture for TiDB)
Apache License 2.0
417 stars 272 forks source link

inconsistent job status when query from framework and business #7439

Open lance6716 opened 1 year ago

lance6716 commented 1 year ago

What did you do?

$ date && curl http://127.0.0.1:10240/api/v1/jobs/64d3ab26-94bd-4136-a38a-da05e86fed46
Mon Oct 24 03:46:47 PM CST 2022
{"id":"64d3ab26-94bd-4136-a38a-da05e86fed46", "type":"DM", "state":"Running", "config":"dGFzay1tb2RlOiBhbGwKY29sbGF0aW9uX2NvbXBhdGlibGU6ICJzdHJpY3QiCnRhcmdldC1kYXRhYmFzZToKICBob3N0OiBob3N0LmRvY2tlci5pbnRlcm5hbAogIHBvcnQ6IDQwMDAKICB1c2VyOiByb290CiAgcGFzc3dvcmQ6ICcnCnVwc3RyZWFtczoKICAtIGRiLWNvbmZpZzoKICAgICAgaG9zdDogaG9zdC5kb2NrZXIuaW50ZXJuYWwKICAgICAgcG9ydDogMzMwNgogICAgICB1c2VyOiByb290CiAgICAgIHBhc3N3b3JkOiAnJwogICAgc291cmNlLWlkOiBteXNxbC0wMQogICAgYmxvY2stYWxsb3ctbGlzdDogZ2xvYmFsCiAgLSBkYi1jb25maWc6CiAgICAgIGhvc3Q6IGhvc3QuZG9ja2VyLmludGVybmFsCiAgICAgIHBvcnQ6IDMzMDcKICAgICAgdXNlcjogcm9vdAogICAgICBwYXNzd29yZDogJycKICAgIHNvdXJjZS1pZDogbXlzcWwtMDIKICAgIGJsb2NrLWFsbG93LWxpc3Q6IGdsb2JhbApibG9jay1hbGxvdy1saXN0OgogIGdsb2JhbDoKICAgIGRvLWRiczoKICAgIC0gInN5bmNfY29sbGF0aW9uKiIKICAgIC0gInRlc3RfcGFuaWMiCg==", "detail":"", "error":{"code":"None", "message":"", "not_leader":null}, "selectors":[]}%      

$ date && curl http://127.0.0.1:10240/api/v1/jobs/64d3ab26-94bd-4136-a38a-da05e86fed46/status
Mon Oct 24 03:46:51 PM CST 2022
{"code":9, "details":[{"@type":"type.googleapis.com/enginepb.ErrorV2", "name":"JobNotRunningError", "details":"eyJKb2JJRCI6IjY0ZDNhYjI2LTk0YmQtNDEzNi1hMzhhLWRhMDVlODZmZWQ0NiJ9", "stack_trace":"\ngithub.com/pingcap/tiflow/engine/pkg/rpcerror.(*Prototype[...]).GenWithStack\n\tgithub.com/pingcap/tiflow/engine/pkg/rpcerror/normalize.go:101\ngithub.com/pingcap/tiflow/engine/servermaster.(*JobManagerImpl).GetJobMasterForwardAddress\n\tgithub.com/pingcap/tiflow/engine/servermaster/jobmanager.go:486\ngithub.com/pingcap/tiflow/engine/servermaster.(*Server).handleForwardJobAPI\n\tgithub.com/pingcap/tiflow/engine/servermaster/server.go:728\ngithub.com/pingcap/tiflow/engine/servermaster.(*Server).forwardJobAPI\n\tgithub.com/pingcap/tiflow/engine/servermaster/server.go:699\ngithub.com/pingcap/tiflow/engine/servermaster.registerRoutes.func1\n\tgithub.com/pingcap/tiflow/engine/servermaster/http.go:45\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2109\nnet/http.(*ServeMux).ServeHTTP\n\tnet/http/server.go:2487\nnet/http.serverHandler.ServeHTTP\n\tnet/http/server.go:2947\nnet/http.(*conn).serve\n\tnet/http/server.go:1991\nruntime.goexit\n\truntime/asm_amd64.s:1594"}]}%     

What did you expect to see?

No response

What did you see instead?

executor log

[2022/10/24 07:46:45.634 +00:00] [INFO] [version.go:47] ["Welcome to TiFlow Executor"] [release-version=v6.3.0-master-dirty] [git-hash=48ae7f944cd6fe9d18569c236c3c54a28212f86b] [git-branch=gc-tuner] [utc-build-time="2022-10-24 07:41:03"] [go-version="go version go1.19.2 linux/amd64"] [failpoint-build=false]
[2022/10/24 07:46:45.634 +00:00] [INFO] [server.go:100] ["creating executor"] [config="{\"name\":\"server-executor\",\"log\":{\"level\":\"debug\",\"file\":\"/log/server-executor-0.log\",\"max-size\":0,\"max-days\":0,\"max-backups\":0,\"error-output\":\"\"},\"join\":\"server-master:10240\",\"addr\":\"0.0.0.0:10241\",\"advertise-addr\":\"server-executor:10241\",\"labels\":null,\"exclusively-deployed\":true,\"keepalive-ttl\":\"20s\",\"keepalive-interval\":\"500ms\",\"rpc-timeout\":\"3s\",\"security\":null}"]
[2022/10/24 07:46:45.634 +00:00] [INFO] [registry.go:68] ["register worker"] [worker-type=CVSTask]
[2022/10/24 07:46:45.634 +00:00] [INFO] [registry.go:68] ["register worker"] [worker-type=CVSJobMaster]
[2022/10/24 07:46:45.634 +00:00] [INFO] [registry.go:68] ["register worker"] [worker-type=DMJobMaster]
[2022/10/24 07:46:45.634 +00:00] [INFO] [registry.go:68] ["register worker"] [worker-type=DMDumpTask]
[2022/10/24 07:46:45.634 +00:00] [INFO] [registry.go:68] ["register worker"] [worker-type=DMLoadTask]
[2022/10/24 07:46:45.634 +00:00] [INFO] [registry.go:68] ["register worker"] [worker-type=DMSyncTask]
[2022/10/24 07:46:45.634 +00:00] [INFO] [registry.go:68] ["register worker"] [worker-type=FakeJobMaster]
[2022/10/24 07:46:45.634 +00:00] [INFO] [registry.go:68] ["register worker"] [worker-type=FakeTask]
[2022/10/24 07:46:45.635 +00:00] [INFO] [server.go:607] ["master client init successful"] [server-addrs=server-master:10240]
[2022/10/24 07:46:45.637 +00:00] [INFO] [server.go:629] ["register successful"] [executor-id=server-executor-ea266d22]
[2022/10/24 07:46:45.637 +00:00] [INFO] [file_manager.go:285] ["Configured local file directory does not existing, try to create one"] [dir=/tmp/dfe-storage/server-executor-ea266d22]
[2022/10/24 07:46:45.637 +00:00] [INFO] [file_manager.go:300] ["Local file directory disk info"] [disk-info="{All: 468GB; Used: 338GB; Free: 130GB; Available: 106GB; Available Percentage: 22.649572%}"]
[2022/10/24 07:46:45.637 +00:00] [INFO] [broker.go:102] ["Create new resource broker"] [executor-id=server-executor-ea266d22] [config="{\"local\":{\"base-dir\":\"/tmp/dfe-storage/server-executor-ea266d22\"},\"s3\":{\"endpoint\":\"\",\"region\":\"\",\"storage-class\":\"\",\"sse\":\"\",\"sse-kms-key-id\":\"\",\"acl\":\"\",\"access-key\":\"\",\"secret-access-key\":\"\",\"provider\":\"\",\"force-path-style\":true,\"use-accelerate-endpoint\":false,\"role-arn\":\"\",\"external-id\":\"\",\"object-lock-enabled\":false,\"bucket\":\"\",\"prefix\":\"\"}}"]
[2022/10/24 07:46:45.637 +00:00] [INFO] [broker.go:125] ["broker will not use s3 as external storage since s3 is not configured"]
[2022/10/24 07:46:45.638 +00:00] [INFO] [server.go:562] ["listen address"] [addr=0.0.0.0:10241]
[2022/10/24 07:46:45.638 +00:00] [INFO] [metastore.go:141] ["Obtained framework metastore endpoint"] [addr="{\"store-id\":\"_root\",\"store-type\":\"mysql\",\"endpoints\":[\"mysql-standalone:3306\"],\"user\":\"root\",\"password\":\"\",\"schema\":\"test_framework\",\"read-timeout\":\"3s\",\"write-timeout\":\"3s\",\"dial-timeout\":\"3s\",\"dbconfs\":{\"conn-max-idle-time\":60000000000,\"conn-max-life-time\":43200000000000,\"max-idle-conns\":3,\"max-open-conns\":7},\"security\":null}"]
[2022/10/24 07:46:45.638 +00:00] [INFO] [metastore.go:161] ["Obtained business metastore endpoint"] [addr="{\"store-id\":\"_default\",\"store-type\":\"mysql\",\"endpoints\":[\"mysql-standalone:3306\"],\"user\":\"root\",\"password\":\"\",\"schema\":\"test_business\",\"read-timeout\":\"3s\",\"write-timeout\":\"3s\",\"dial-timeout\":\"3s\",\"dbconfs\":{\"conn-max-idle-time\":60000000000,\"conn-max-life-time\":43200000000000,\"max-idle-conns\":3,\"max-open-conns\":7},\"security\":null}"]
[2022/10/24 07:46:45.638 +00:00] [DEBUG] [server.go:479] ["update p2p msg router"] [event="{\"Tp\":\"add\",\"Node\":{\"Tp\":\"executor\",\"ID\":\"server-executor-4a4f9690\",\"Addr\":\"server-executor:10241\"}}"]
[2022/10/24 07:46:45.639 +00:00] [DEBUG] [server.go:479] ["update p2p msg router"] [event="{\"Tp\":\"add\",\"Node\":{\"Tp\":\"executor\",\"ID\":\"server-executor-ea266d22\",\"Addr\":\"server-executor:10241\"}}"]
[2022/10/24 07:46:45.639 +00:00] [DEBUG] [server.go:519] ["update executor client group"] [event="{\"Tp\":\"add\",\"Node\":{\"Tp\":\"executor\",\"ID\":\"server-executor-4a4f9690\",\"Addr\":\"server-executor:10241\"}}"]
[2022/10/24 07:46:45.639 +00:00] [DEBUG] [server.go:479] ["update p2p msg router"] [event="{\"Tp\":\"add\",\"Node\":{\"Tp\":\"master\",\"ID\":\"server-master-fc1e457a\",\"Addr\":\"server-master:10240\"}}"]
[2022/10/24 07:46:45.639 +00:00] [INFO] [executor_group.go:241] ["executor client added"] [executor-id=server-executor-4a4f9690] [address=server-executor:10241]
[2022/10/24 07:46:45.639 +00:00] [DEBUG] [server.go:519] ["update executor client group"] [event="{\"Tp\":\"add\",\"Node\":{\"Tp\":\"executor\",\"ID\":\"server-executor-ea266d22\",\"Addr\":\"server-executor:10241\"}}"]
[2022/10/24 07:46:45.639 +00:00] [INFO] [executor_group.go:241] ["executor client added"] [executor-id=server-executor-ea266d22] [address=server-executor:10241]
[2022/10/24 07:46:46.139 +00:00] [INFO] [server.go:688] ["heartbeat success"] [leader=server-master:10240] [members="[server-master:10240]"]
[2022/10/24 07:46:50.640 +00:00] [INFO] [leader_resolver.go:190] ["leader resolver state updated"] [server-list="{\"server-master:10240\":true}"]
[2022/10/24 07:46:51.139 +00:00] [INFO] [server.go:688] ["heartbeat success"] [leader=server-master:10240] [members="[server-master:10240]"]
[2022/10/24 07:46:56.140 +00:00] [INFO] [server.go:688] ["heartbeat success"] [leader=server-master:10240] [members="[server-master:10240]"]
[2022/10/24 07:47:01.640 +00:00]

master log

[2022/10/24 07:46:45.635 +00:00] [INFO] [executor_manager.go:184] ["allocate new executor"] [executor="name:\"server-executor\"  address:\"server-executor:10241\"  capability:100"]
[2022/10/24 07:46:45.635 +00:00] [INFO] [executor_manager.go:163] ["register executor"] [executor="{\"seq-id\":0,\"created-at\":\"0001-01-01T00:00:00Z\",\"updated-at\":\"0001-01-01T00:00:00Z\",\"id\":\"server-executor-ea266d22\",\"name\":\"server-executor\",\"address\":\"server-executor:10241\",\"capability\":100,\"labels\":{}}"]
[2022/10/24 07:46:45.635 +00:00] [INFO] [capacity_impl.go:54] ["executor resource is registered"] [executor-id=server-executor-ea266d22] [capacity=100]
[2022/10/24 07:46:45.636 +00:00] [INFO] [executor_group.go:241] ["executor client added"] [executor-id=server-executor-ea266d22] [address=server-executor:10241]
[2022/10/24 07:46:45.636 +00:00] [INFO] [watch_executors.go:97] ["add executor"] [id=server-executor-ea266d22] [addr=server-executor:10241]
[2022/10/24 07:46:45.637 +00:00] [DEBUG] [callbacks.go:134] ["trace log"] [component=gorm] [elapsed=1.087395ms] [sql="INSERT INTO `executors` (`created_at`,`updated_at`,`id`,`name`,`address`,`capability`,`labels`) VALUES ('2022-10-24 07:46:45.636','2022-10-24 07:46:45.636','server-executor-ea266d22','server-executor','server-executor:10241',100,'{}')"] [affected-rows=1]
[2022/10/24 07:46:45.637 +00:00] [DEBUG] [server.go:1129] ["request handled successfully"] [method=/enginepb.Discovery/RegisterExecutor] [request="executor:{name:\"server-executor\"  address:\"server-executor:10241\"  capability:100}"] [response="id:\"server-executor-ea266d22\"  name:\"server-executor\"  address:\"server-executor:10241\"  capability:100"]
[2022/10/24 07:46:45.637 +00:00] [DEBUG] [server.go:1129] ["request handled successfully"] [method=/enginepb.Discovery/QueryStorageConfig] [request=] [response="config:\"{\\\"local\\\":{\\\"base-dir\\\":\\\"\\\"},\\\"s3\\\":{\\\"endpoint\\\":\\\"\\\",\\\"region\\\":\\\"\\\",\\\"storage-class\\\":\\\"\\\",\\\"sse\\\":\\\"\\\",\\\"sse-kms-key-id\\\":\\\"\\\",\\\"acl\\\":\\\"\\\",\\\"access-key\\\":\\\"\\\",\\\"secret-access-key\\\":\\\"\\\",\\\"provider\\\":\\\"\\\",\\\"force-path-style\\\":true,\\\"use-accelerate-endpoint\\\":false,\\\"role-arn\\\":\\\"\\\",\\\"external-id\\\":\\\"\\\",\\\"object-lock-enabled\\\":false,\\\"bucket\\\":\\\"\\\",\\\"prefix\\\":\\\"\\\"}}\""]
[2022/10/24 07:46:45.638 +00:00] [DEBUG] [server.go:1129] ["request handled successfully"] [method=/enginepb.Discovery/QueryMetaStore] [request=tp:SystemMetaStore] [response="address:\"{\\\"store-id\\\":\\\"_root\\\",\\\"store-type\\\":\\\"mysql\\\",\\\"endpoints\\\":[\\\"mysql-standalone:3306\\\"],\\\"user\\\":\\\"root\\\",\\\"password\\\":\\\"\\\",\\\"schema\\\":\\\"test_framework\\\",\\\"read-timeout\\\":\\\"3s\\\",\\\"write-timeout\\\":\\\"3s\\\",\\\"dial-timeout\\\":\\\"3s\\\",\\\"dbconfs\\\":{\\\"conn-max-idle-time\\\":60000000000,\\\"conn-max-life-time\\\":43200000000000,\\\"max-idle-conns\\\":3,\\\"max-open-conns\\\":7},\\\"security\\\":null}\""]
[2022/10/24 07:46:45.638 +00:00] [DEBUG] [server.go:1129] ["request handled successfully"] [method=/enginepb.Discovery/QueryMetaStore] [request=tp:AppMetaStore] [response="address:\"{\\\"store-id\\\":\\\"_default\\\",\\\"store-type\\\":\\\"mysql\\\",\\\"endpoints\\\":[\\\"mysql-standalone:3306\\\"],\\\"user\\\":\\\"root\\\",\\\"password\\\":\\\"\\\",\\\"schema\\\":\\\"test_business\\\",\\\"read-timeout\\\":\\\"3s\\\",\\\"write-timeout\\\":\\\"3s\\\",\\\"dial-timeout\\\":\\\"3s\\\",\\\"dbconfs\\\":{\\\"conn-max-idle-time\\\":60000000000,\\\"conn-max-life-time\\\":43200000000000,\\\"max-idle-conns\\\":3,\\\"max-open-conns\\\":7},\\\"security\\\":null}\""]
[2022/10/24 07:46:45.638 +00:00] [DEBUG] [server.go:1129] ["request handled successfully"] [method=/enginepb.Discovery/ListMasters] [request=] [response="masters:{id:\"server-master-fc1e457a\"  name:\"server-master\"  address:\"server-master:10240\"  is_leader:true}"]
[2022/10/24 07:46:45.638 +00:00] [DEBUG] [server.go:1129] ["request handled successfully"] [method=/enginepb.Discovery/ListExecutors] [request=] [response="executors:{id:\"server-executor-4a4f9690\"  name:\"server-executor\"  address:\"server-executor:10241\"  capability:100}  executors:{id:\"server-executor-ea266d22\"  name:\"server-executor\"  address:\"server-executor:10241\"  capability:100}"]
[2022/10/24 07:46:45.667 +00:00] [INFO] [executor_manager.go:253] ["check alive"] [exec=server-executor-ea266d22]
[2022/10/24 07:46:46.139 +00:00] [DEBUG] [server.go:1129] ["request handled successfully"] [method=/enginepb.Discovery/Heartbeat] [request="executor_id:\"server-executor-ea266d22\"  status:1  timestamp:1666597606  ttl:23000"] [response="leader:\"server-master:10240\"  addrs:\"server-master:10240\""]
[2022/10/24 07:46:46.168 +00:00] [INFO] [executor_manager.go:253] ["check alive"] [exec=server-executor-4a4f9690]
[2022/10/24 07:46:46.387 +00:00] [DEBUG] [elector.go:117] [renew] [cost=1.770265ms] []
[2022/10/24 07:46:46.640 +00:00] [DEBUG] [server.go:1129] ["request handled successfully"] [method=/enginepb.Discovery/Heartbeat] [request="executor_id:\"server-executor-ea266d22\"  status:1  timestamp:1666597606  ttl:23000"] [response="leader:\"server-master:10240\"  addrs:\"server-master:10240\""]
[2022/10/24 07:46:47.139 +00:00] [INFO] [executor_manager.go:134] ["handle heart beat"] [req="executor_id:\"server-executor-ea266d22\"  status:1  timestamp:1666597607  ttl:23000"]
[2022/10/24 07:46:47.139 +00:00] [DEBUG] [server.go:1129] ["request handled successfully"] [method=/enginepb.Discovery/Heartbeat] [request="executor_id:\"server-executor-ea266d22\"  status:1  timestamp:1666597607  ttl:23000"] [response="leader:\"server-master:10240\"  addrs:\"server-master:10240\""]
[2022/10/24 07:46:47.640 +00:00] [DEBUG] [server.go:1129] ["request handled successfully"] [method=/enginepb.Discovery/Heartbeat] [request="executor_id:\"server-executor-ea266d22\"  status:1  timestamp:1666597607  ttl:23000"] [response="leader:\"server-master:10240\"  addrs:\"server-master:10240\""]
[2022/10/24 07:46:47.683 +00:00] [DEBUG] [callbacks.go:134] ["trace log"] [component=gorm] [elapsed=730.474µs] [sql="SELECT * FROM `master_meta` WHERE id = '64d3ab26-94bd-4136-a38a-da05e86fed46' AND `master_meta`.`deleted` IS NULL ORDER BY `master_meta`.`seq_id` LIMIT 1"] [affected-rows=1]
[2022/10/24 07:46:47.683 +00:00] [INFO] [job_client.go:57] ["get job detail from job master"] [url=http://server-executor:10241/api/v1/jobs/64d3ab26-94bd-4136-a38a-da05e86fed46/status]
[2022/10/24 07:46:47.685 +00:00] [DEBUG] [job_client.go:62] ["job master response"] ["response status"="404 Not Found"] [url=http://server-executor:10241/api/v1/jobs/64d3ab26-94bd-4136-a38a-da05e86fed46/status]
[2022/10/24 07:46:47.685 +00:00] [WARN] [jobmanager.go:410] ["get job detail from jobmaster fail"] [error="[DFLOW:ErrJobManagerRespStatusCode404]get job detail response status code is 404"] [errorVerbose="[DFLOW:ErrJobManagerRespStatusCode404]get job detail response status code is 404\ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/errors.go:174\ngithub.com/pingcap/errors.(*Error).GenWithStackByArgs\n\tgithub.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/normalize.go:164\ngithub.com/pingcap/tiflow/engine/pkg/httputil.(*jobHTTPClientImpl).GetJobDetail\n\tgithub.com/pingcap/tiflow/engine/pkg/httputil/job_client.go:69\ngithub.com/pingcap/tiflow/engine/servermaster.(*JobManagerImpl).GetJob\n\tgithub.com/pingcap/tiflow/engine/servermaster/jobmanager.go:229\ngithub.com/pingcap/tiflow/engine/servermaster.(*Server).GetJob\n\tgithub.com/pingcap/tiflow/engine/servermaster/server.go:225\ngithub.com/pingcap/tiflow/engine/enginepb.local_request_JobManager_GetJob_0\n\tgithub.com/pingcap/tiflow/engine/enginepb/master.pb.gw.go:257\ngithub.com/pingcap/tiflow/engine/enginepb.RegisterJobManagerHandlerServer.func2\n\tgithub.com/pingcap/tiflow/engine/enginepb/master.pb.gw.go:609\ngithub.com/grpc-ecosystem/grpc-gateway/v2/runtime.(*ServeMux).ServeHTTP\n\tgithub.com/grpc-ecosystem/grpc-gateway/v2@v2.11.0/runtime/mux.go:386\ngithub.com/pingcap/tiflow/engine/servermaster.registerRoutes.func1\n\tgithub.com/pingcap/tiflow/engine/servermaster/http.go:47\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2109\nnet/http.(*ServeMux).ServeHTTP\n\tnet/http/server.go:2487\nnet/http.serverHandler.ServeHTTP\n\tnet/http/server.go:2947\nnet/http.(*conn).serve\n\tnet/http/server.go:1991\nruntime.goexit\n\truntime/asm_amd64.s:1594"]
[2022/10/24 07:46:48.140 +00:00] [DEBUG] [server.go:1129] ["request handled successfully"] [method=/enginepb.Discovery/Heartbeat] [request="executor_id:\"server-executor-ea266d22\"  status:1  timestamp:1666597608  ttl:23000"] [response="leader:\"server-master:10240\"  addrs:\"server-master:10240\""]
[2022/10/24 07:46:48.429 +00:00] [DEBUG] [elector.go:117] [renew] [cost=41.762281ms] []
[2022/10/24 07:46:48.639 +00:00] [DEBUG] [server.go:1129] ["request handled successfully"] [method=/enginepb.Discovery/Heartbeat] [request="executor_id:\"server-executor-ea266d22\"  status:1  timestamp:1666597608  ttl:23000"] [response="leader:\"server-master:10240\"  addrs:\"server-master:10240\""]
[2022/10/24 07:46:49.145 +00:00] [DEBUG] [server.go:1129] ["request handled successfully"] [method=/enginepb.Discovery/Heartbeat] [request="executor_id:\"server-executor-ea266d22\"  status:1  timestamp:1666597609  ttl:23000"] [response="leader:\"server-master:10240\"  addrs:\"server-master:10240\""]
[2022/10/24 07:46:49.167 +00:00] [DEBUG] [callbacks.go:134] ["trace log"] [component=gorm] [elapsed=1.062365ms] [sql="SELECT * FROM `job_ops` WHERE op = '2'"] [affected-rows=0]
[2022/10/24 07:46:49.639 +00:00] [INFO] [server.go:199] [payload="executor_id:\"server-executor-ea266d22\"  status:1  timestamp:1666597609  ttl:23000"] [request=Heartbeat]
[2022/10/24 07:46:49.640 +00:00] [DEBUG] [server.go:1129] ["request handled successfully"] [method=/enginepb.Discovery/Heartbeat] [request="executor_id:\"server-executor-ea266d22\"  status:1  timestamp:1666597609  ttl:23000"] [response="leader:\"server-master:10240\"  addrs:\"server-master:10240\""]
[2022/10/24 07:46:50.139 +00:00] [DEBUG] [server.go:1129] ["request handled successfully"] [method=/enginepb.Discovery/Heartbeat] [request="executor_id:\"server-executor-ea266d22\"  status:1  timestamp:1666597610  ttl:23000"] [response="leader:\"server-master:10240\"  addrs:\"server-master:10240\""]
[2022/10/24 07:46:50.432 +00:00] [DEBUG] [elector.go:117] [renew] [cost=1.606549ms] []
[2022/10/24 07:46:50.639 +00:00] [DEBUG] [server.go:1129] ["request handled successfully"] [method=/enginepb.Discovery/ListMasters] [request=] [response="masters:{id:\"server-master-fc1e457a\"  name:\"server-master\"  address:\"server-master:10240\"  is_leader:true}"]
[2022/10/24 07:46:50.639 +00:00] [DEBUG] [server.go:1129] ["request handled successfully"] [method=/enginepb.Discovery/Heartbeat] [request="executor_id:\"server-executor-ea266d22\"  status:1  timestamp:1666597610  ttl:23000"] [response="leader:\"server-master:10240\"  addrs:\"server-master:10240\""]
[2022/10/24 07:46:50.639 +00:00] [DEBUG] [server.go:1129] ["request handled successfully"] [method=/enginepb.Discovery/ListMasters] [request=] [response="masters:{id:\"server-master-fc1e457a\"  name:\"server-master\"  address:\"server-master:10240\"  is_leader:true}"]
[2022/10/24 07:46:50.641 +00:00] [DEBUG] [server.go:1129] ["request handled successfully"] [method=/enginepb.Discovery/ListExecutors] [request=] [response="executors:{id:\"server-executor-4a4f9690\"  name:\"server-executor\"  address:\"server-executor:10241\"  capability:100}  executors:{id:\"server-executor-ea266d22\"  name:\"server-executor\"  address:\"server-executor:10241\"  capability:100}"]
[2022/10/24 07:46:50.667 +00:00] [INFO] [executor_manager.go:253] ["check alive"] [exec=server-executor-ea266d22]
[2022/10/24 07:46:51.135 +00:00] [DEBUG] [callbacks.go:134] ["trace log"] [component=gorm] [elapsed=1.396389ms] [sql="SELECT * FROM `master_meta` WHERE id = '64d3ab26-94bd-4136-a38a-da05e86fed46' AND `master_meta`.`deleted` IS NULL ORDER BY `master_meta`.`seq_id` LIMIT 1"] [affected-rows=1]
[2022/10/24 07:46:51.139 +00:00] [DEBUG] [server.go:1129] ["request handled successfully"] [method=/enginepb.Discovery/Heartbeat] [request="executor_id:\"server-executor-ea266d22\"  status:1  timestamp:1666597611  ttl:23000"] [response="leader:\"server-master:10240\"  addrs:\"server-master:10240\""]
[2022/10/24 07:46:51.168 +00:00] [INFO] [executor_manager.go:253] ["check alive"] [exec=server-executor-4a4f9690]
[2022/10/24 07:46:51.640 +00:00] [DEBUG] [server.go:1129] ["request handled successfully"] [method=/enginepb.Discovery/Heartbeat] [request="executor_id:\"server-executor-ea266d22\"  status:1  timestamp:1666597611  ttl:23000"] [response="leader:\"server-master:10240\"  addrs:\"server-master:10240\""]
[2022/10/24 07:46:52.139 +00:00] [INFO] [executor_manager.go:134] ["handle heart beat"] [req="executor_id:\"server-executor-ea266d22\"  status:1  timestamp:1666597612  ttl:23000"]
[2022/10/24 07:46:52.139 +00:00] [DEBUG] [server.go:1129] ["request handled successfully"] [method=/enginepb.Discovery/Heartbeat] [request="executor_id:\"server-executor-ea266d22\"  status:1  timestamp:1666597612  ttl:23000"] [response="leader:\"server-master:10240\"  addrs:\"server-master:10240\""]
[2022/10/24 07:46:52.465 +00:00] [DEBUG] [elector.go:117] [renew] [cost=32.585529ms] []

Versions of the cluster

Dataflow Engine version (run tiflow version):

master
lance6716 commented 1 year ago

/cc @amyangfei @gozssky

amyangfei commented 1 year ago

framework job status means the job status from upper caller, Running only means it should be running (not canceling, canceled, failed or finished), the business job master can be either running or being failover. business job status can only return when business job master is really running.

It is better to align or make a regulation for job status in the full workflow, from frontend, service to engine framework and business logic.