hackoregon / civic-devops

Master collection point for issues, procedures, and code to manage the HackOregon Civic platform
MIT License

Speed up Travis builds by eliminating the "sudo: required" directive in .travis.yml #167

Closed by MikeTheCanuck 6 years ago

MikeTheCanuck commented 6 years ago

Jaron is reporting network timeouts for civic builds in Travis:

https://travis-ci.org/hackoregon/civic/jobs/389552399
https://travis-ci.org/hackoregon/civic/jobs/389589641

e.g.

590.04s$ npm i -g cross-env
npm ERR! code ETIMEDOUT
npm ERR! errno ETIMEDOUT
npm ERR! network request to https://registry.npmjs.org/lru-cache/-/lru-cache-4.1.3.tgz failed, reason: connect ETIMEDOUT 104.18.98.96:443
npm ERR! network This is a problem related to network connectivity.
npm ERR! network In most cases you are behind a proxy or have bad network settings.
npm ERR! network 
npm ERR! network If you are behind a proxy, please make sure that the
npm ERR! network 'proxy' config is set properly.  See: 'npm help config'
npm ERR! A complete log of this run can be found in:
npm ERR!     /home/travis/.npm/_logs/2018-06-08T07_36_22_782Z-debug.log
The command "npm i -g cross-env" failed and exited with 1 during .

And Travis acknowledges network issues in their sudo-enabled GCE infrastructure:

Network Instability for sudo-enabled builds on travis-ci.org
Investigating - We are currently investigating reports of builds on our sudo-enabled GCE infrastructure experiencing network instability and failures on travis-ci.org.
Jun 8, 15:51 UTC

I've wondered since last year whether we still need sudo: required in our Travis builds (e.g. https://github.com/hackoregon/transportation-systems-backend-2018/blob/staging/.travis.yml#L1). This is the setting that makes our Travis builds run in the sudo-enabled environment rather than the containerized one (https://docs.travis-ci.com/user/reference/overview/).

I've never had the time or need to chase this down, but maybe now's the time for us to experiment with a couple of repos and see if we can avoid it. I just searched transportation-systems-backend-2018, and it appears the only place we use sudo there is the start.sh script, which isn't called in the Travis environment. So I'd like to start there.

Unfortunately, this sudo: required directive doesn't appear in the civic Travis config (https://github.com/hackoregon/civic/blob/master/.travis.yml), so I'm going to guess that the Travis network issues extend beyond just the "sudo-enabled GCE infrastructure".
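If the timeouts turn out to be transient, one stopgap is to wrap the flaky install in travis_retry, a shell helper Travis provides in its build environment that re-runs a failing command a few times. A minimal sketch, assuming the cross-env install lives in an install step (the build phase isn't visible in the log above, so that placement is a guess):

```yaml
# Hypothetical .travis.yml fragment; the phase name is an assumption,
# only the npm command itself comes from the failing log.
install:
  # travis_retry re-runs the command if it exits non-zero
  - travis_retry npm i -g cross-env
```

This only papers over short outages; it won't help if the registry stays unreachable for the whole build.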

MikeTheCanuck commented 6 years ago

I just ran a few experiments on the transportation-systems-backend-2018 repo to compare the build speed for a job with the sudo: required directive vs a job without that directive.

It appears that, at best, the build jobs take about the same time - 2:07 vs 2:10 was the closest I could get them.
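For reference, the comparison comes down to a single key. A rough sketch, assuming a hypothetical .travis.yml rather than the repo's actual file; the language value is a placeholder, and the script commands are borrowed from the build log further down this thread:

```yaml
# Hypothetical config for illustration only; not the repo's real file.

# Variant A: explicitly ask for the sudo-enabled environment.
sudo: required
# Variant B: delete the line above (or set "sudo: false") and push again.

language: generic   # placeholder; not taken from the real repo

script:
  - bin/build.sh -p
  - bin/test.sh -p
```

Running each variant a few times and comparing the job durations Travis reports is enough to repeat the experiment.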

Diving deeper into Travis' blog and some articles around the interwebs, it's unclear whether the sudo: required directive does anything anymore.

Confirm that Docker requires non-containerized Travis build environment

For funsies I eliminated the services: docker directive from one of our repos' .travis.yml to see what happens - this is the tail end of the Travis build log from that commit:

$ bin/build.sh -p
Building [secure]-service
Couldn't connect to Docker daemon at http+docker://localunixsocket - is it running?
If it's at a non-standard location, specify the URL with the DOCKER_HOST environment variable.
The command "bin/build.sh -p" exited with 1.
0.85s$ bin/test.sh -p
Couldn't connect to Docker daemon at http+docker://localunixsocket - is it running?
If it's at a non-standard location, specify the URL with the DOCKER_HOST environment variable.
The command "bin/test.sh -p" exited with 1.
Done. Your build exited with 1.
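The directive removed in that experiment typically looks like the fragment below. This is a sketch under assumptions, not the repo's actual file; only the two script commands come from the log above.

```yaml
# Hypothetical .travis.yml fragment.
sudo: required

services:
  - docker   # without this, no Docker daemon is started for the build

script:
  - bin/build.sh -p
  - bin/test.sh -p
```

With the services entry gone, both scripts fail at their first Docker call, which matches the "Couldn't connect to Docker daemon" errors in the log.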

Conclusion

Looks like we're stuck in the slower, non-containerized GCE build environment - at least for any repo whose build generates or uses Docker images.