parnurzeal / gorequest

GoRequest -- Simplified HTTP client ( inspired by nodejs SuperAgent )
http://parnurzeal.github.io/gorequest/
MIT License
3.44k stars 414 forks

dkron stops running job for no apparent reason #263

Open seanfulton opened 3 years ago

seanfulton commented 3 years ago

We are running a three-node dkron cluster on version 3.1.8. Periodically, we find that jobs stop running. I went through the logs back to the time of the last run and found this error, which started then and has been repeating continuously for the last four days:

Aug 20 13:55:10 nj-dcos02-cl01 dkron: time="2021-08-20T13:55:10-04:00" level=error msg="job: Error quering for running executions" error="context deadline exceeded" node=nj-dcos02-cl01
Aug 20 13:55:10 nj-dcos02-cl01 dkron: time="2021-08-20T13:55:10-04:00" level=error msg="grpc: error dialing." error="context deadline exceeded" method=GetActiveExecutions node=nj-dcos02-cl01 server_addr

What does this mean?