mds1 closed this 6 months ago
So far, I've not been able to reproduce this locally. From what I can see in the logs, if it's hanging, it seems to be hanging at an execute request, and I've never seen the backend hang on this type of request before. To move this forward, besides removing the big test being executed and implementing better logging, we'll shortly be opening a PR to offload the compute. Hopefully, all of this sheds more light on what might be happening here. I'll keep posting updates here as I find more!
We have not seen this since I opened the issue, so it's possible it was a flake in the CircleCI infra. I will optimistically close this for now and reopen if it surfaces again.
In https://github.com/ethereum-optimism/optimism/pull/10159 we updated from kontrol version 0.1.196 to 0.1.247. We now occasionally get
Too long with no output (exceeded 10m0s): context deadline exceeded
job failures in CircleCI, such as this one. This is a CircleCI feature that causes a job to time out if it has produced no output for over 10 minutes. You can read more here: https://support.circleci.com/hc/en-us/articles/360045268074-Build-Fails-with-Too-long-with-no-output-exceeded-10m0s-context-deadline-exceeded.
I'm unsure whether the job is still running as expected but just not producing output (in which case the solution might be to add more logs), or whether kontrol is actually hanging.
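One way to tell the two cases apart might be to wrap the long-running command in a small script that prints a heartbeat line while the command runs. The sketch below is only an illustration: `run_with_heartbeat`, the 60-second interval, and the `[heartbeat]` message format are hypothetical names and values, not part of kontrol or CircleCI.

```python
import subprocess
import sys
import threading
import time


def run_with_heartbeat(cmd, interval=60.0, out=sys.stdout):
    """Run `cmd`, printing a heartbeat line every `interval` seconds until it exits.

    Hypothetical helper for illustration; returns the command's exit code.
    """
    start = time.monotonic()
    done = threading.Event()

    def beat():
        # Event.wait returns False on timeout and True once the event is set,
        # so this loop prints once per interval and stops when the command ends.
        while not done.wait(interval):
            elapsed = time.monotonic() - start
            print(f"[heartbeat] still running after {elapsed:.0f}s", file=out, flush=True)

    thread = threading.Thread(target=beat, daemon=True)
    thread.start()
    try:
        # The child inherits stdout/stderr, so its own logs still appear as usual.
        return subprocess.run(cmd).returncode
    finally:
        done.set()
        thread.join()


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumed usage: python heartbeat.py <command> [args...]
    sys.exit(run_with_heartbeat(sys.argv[1:]))
```

With a wrapper like this, a job that is slow but healthy would eventually finish, while a real hang would show heartbeats continuing indefinitely with no other output. The downside is that the heartbeat also defeats the no-output timeout, so a hung job would keep running until it hits some other limit.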