Open seanfulton opened 3 years ago
We are running a three-node dkron cluster on 3.1.8. Periodically we find that jobs stop running. I went through the logs back to the time of the last run and found this error, which started then and has been repeating continuously for the last four days:
Aug 20 13:55:10 nj-dcos02-cl01 dkron: time="2021-08-20T13:55:10-04:00" level=error msg="job: Error quering for running executions" error="context deadline exceeded" node=nj-dcos02-cl01
Aug 20 13:55:10 nj-dcos02-cl01 dkron: time="2021-08-20T13:55:10-04:00" level=error msg="grpc: error dialing." error="context deadline exceeded" method=GetActiveExecutions node=nj-dcos02-cl01 server_addr
What does this mean?