project-everest / everest-ci

CI scripts for project everest

Linux agent lost communication #89

Closed irinasp closed 7 years ago

irinasp commented 7 years ago

This issue does not reproduce consistently. Repro:

  1. Start a run using a build definition from VSTS, for example FStar-BinaryBuild-Linux-Build 6986.

Result: The run fails after 45 minutes with the following error: The agent Everest-CI-3 lost communication with the server. Verify the machine is running and has a healthy network connection. For more information, see: https://go.microsoft.com/fwlink/?linkid=846610

s-zanella commented 7 years ago

Saw this a couple of times. Seems to happen randomly.

irinasp commented 7 years ago

The log file is available from the Docker right-click menu: Settings\Diagnose & Feedback\log file. I could see this error in the log file:

[03:25:16.985][VpnKit ][Error ] com.docker.slirp.exe: PPP.listen callback caught Ipv4.Make(Ethif)(Arpv4).Routing.No_route_to_destinationaddress()

Searching the Docker forum, I found that other people are complaining about this problem and that the workaround is to reset Docker to factory defaults: https://forums.docker.com/t/com-docker-slirp-is-using-my-containers-ports/16663/2

I will reset Docker this afternoon when no runs are queued or in progress and see if it helps with this issue.
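
For anyone who wants to check their own Docker log for the same kind of entry, here is a minimal sketch. It assumes every log line follows the bracketed `[time][component][level] message` layout of the VpnKit line quoted in this comment. The names `parse_line` and `find_errors` are hypothetical helpers written for this illustration, not part of any Docker tooling.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Optional

# Assumed layout, taken from the single log line quoted in this thread:
#   [03:25:16.985][VpnKit ][Error ] message text
LOG_LINE = re.compile(
    r"^\[(?P<time>[^\]]+)\]"
    r"\[(?P<component>[^\]]+)\]"
    r"\[(?P<level>[^\]]+)\]\s*"
    r"(?P<message>.*)$"
)


class LogEntry(NamedTuple):
    time: str
    component: str
    level: str
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Parse one log line; return None if it does not match the assumed layout."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    # Component and level fields are padded with spaces in the log, so trim them.
    return LogEntry(
        time=match.group("time").strip(),
        component=match.group("component").strip(),
        level=match.group("level").strip(),
        message=match.group("message").strip(),
    )


def find_errors(lines: Iterable[str], component: str = "VpnKit") -> Iterator[LogEntry]:
    """Yield entries logged at level Error by the given component."""
    for line in lines:
        entry = parse_line(line)
        if entry and entry.component == component and entry.level == "Error":
            yield entry
```

To use it, open the log file saved from the Diagnose & Feedback dialog and pass the file object to `find_errors`, then print each returned entry's `time` and `message`. If the real log uses a different layout on some lines, those lines are simply skipped.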

irinasp commented 7 years ago

Resetting Docker to factory defaults didn't help. I noticed that there is a newer version of the Ubuntu 16.04-x64 agent for VSTS. I am updating the agent version in the Linux container on the build machine.

irinasp commented 7 years ago

After the Linux agent version update, the problem no longer reproduces. I am going to watch the runs for one more day and close this issue if they keep finishing successfully.

darrenge commented 7 years ago

I haven't seen an issue with this at all for the past week or so ... I think we can close this out.

irinasp commented 7 years ago

It seems that the Linux agent version update fixed the issue. There are no more connection issues.