mozilla / spidernode

Node.js on top of SpiderMonkey
https://ehsanakhgari.org/blog/2016-04-20/project-spidernode

make -j2 on all Mac builds (fixes #210) #354

Closed mykmelez closed 7 years ago

mykmelez commented 7 years ago

@tbsaunde All four of the intermittent timeouts I've seen on Mac builds on Travis today have been on builds without MAKE_FLAGS=-j2, so I think we should add that flag to all the Mac builds to speed them up. This branch does so.

mykmelez commented 7 years ago

Note: For consistency, this also moves MAKE_FLAGS=-j2 to the ends of the env lines for the builds that already have that environment variable.

tbsaunde commented 7 years ago

On Mon, Jan 23, 2017 at 04:33:14PM -0800, Myk Melez wrote:

> @tbsaunde All four of the intermittent timeouts I've seen on Mac builds on Travis today have been on builds without MAKE_FLAGS=-j2, so I think we should add that flag to all the Mac builds to speed them up. This branch does so.

Do you know how many cores those machines have? Maybe we should use more than 2?

Anyway, in principle I'm fine with this; I'll trust you on the details.

mykmelez commented 7 years ago

> Do you know how many cores those machines have? Maybe we should use more than 2?

I haven't been able to find any info on this, but it's worth a try. I've doubled the number of parallel make jobs to four in 56ae651.

Note that the build job that failed on Travis is probably unrelated to these changes. It was a single Mac build, and it offered no evidence that our build script ever ran, just some configuration info and then the message, "No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself."

According to https://www.traviscistatus.com/ this morning, "As part of scheduled improvements to our OS X infrastructure, we updated the worker version to 2.6.1. We’re currently rolling back to a previous worker version, due to several reports of instability."

And the job that failed reports that it was using version 2.6.1 of the worker. So I suspect that it failed because of the "instability" of the worker, not because of this change. And that'll resolve itself once Travis finishes rolling back to the older version of the worker.
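
The reasoning above can be sketched as a small log check: a job looks like an infrastructure stall if its log contains the Travis timeout message but shows no sign that the build script ever started. This is only an illustrative sketch; `build_marker` is a hypothetical string that the build script is assumed to print, and nothing here is part of the spidernode repository.

```python
# Prefix of the timeout message quoted above from the failed Travis job.
STALL_MESSAGE = "No output has been received in the last"


def looks_like_stall_before_build(log_text: str, build_marker: str) -> bool:
    """Return True if the log has the stall message but never shows build_marker.

    build_marker is a hypothetical line that the build script is assumed to
    print when it starts; pick whatever your own script actually emits.
    """
    stalled = STALL_MESSAGE in log_text
    build_started = build_marker in log_text
    return stalled and not build_started
```

A log that contains both the stall message and the marker would instead point at a slow or hung build step, which is a different failure from the one described here.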

brendandahl commented 7 years ago

> Do you know how many cores those machines have? Maybe we should use more than 2?

They have two. https://docs.travis-ci.com/user/ci-environment/

mykmelez commented 7 years ago

> They have two. https://docs.travis-ci.com/user/ci-environment/

Ah, thanks! It isn't clear if hyperthreading is available. If so, then -j4 makes sense. Otherwise, however, -j2 is probably optimal. I suspect the latter, so I've reverted the bump from 2 to 4. I'll keep an eye on both Travis builds, however, to see if there's any significant difference between them.
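
As a rough sketch of the sizing logic above (one make job per hardware thread), the following Python helper derives a `-j` value from a core count. The function names are hypothetical, the two-core figure comes from the Travis documentation linked above, and whether hyperthreading is available remains an open assumption.

```python
import os


def make_flags(physical_cores: int, threads_per_core: int = 1) -> str:
    """Return a make parallelism flag such as "-j2".

    threads_per_core=2 models hyperthreading; 1 models plain cores.
    """
    if physical_cores < 1 or threads_per_core < 1:
        raise ValueError("core and thread counts must be at least 1")
    return f"-j{physical_cores * threads_per_core}"


def make_flags_for_this_machine() -> str:
    """Return a flag based on what the current machine reports.

    os.cpu_count() returns the number of logical CPUs, so any hardware
    threads are already included; it can return None when the count is
    unknown, in which case this falls back to a single job.
    """
    return f"-j{os.cpu_count() or 1}"
```

With two cores and no hyperthreading, `make_flags(2)` gives `-j2`; with hyperthreading, `make_flags(2, 2)` gives `-j4`, which matches the two values tried on this branch.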

tbsaunde commented 7 years ago

On Tue, Jan 24, 2017 at 09:17:55AM -0800, Myk Melez wrote:

> > Do you know how many cores those machines have? Maybe we should use more than 2?

> I haven't been able to find any info on this, but it's worth a try. I've doubled the number of parallel make jobs to four in 56ae651.

> Note that the build job that failed on Travis is probably unrelated to these changes. It was a single Mac build, and it offered no evidence that our build script ever ran, just some configuration info and then the message, "No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself."

> According to https://www.traviscistatus.com/ this morning, "As part of scheduled improvements to our OS X infrastructure, we updated the worker version to 2.6.1. We’re currently rolling back to a previous worker version, due to several reports of instability."

> And the job that failed reports that it was using version 2.6.1 of the worker. So I suspect that it failed because of the "instability" of the worker, not because of this change. And that'll resolve itself once Travis finishes rolling back to the older version of the worker.

Fair enough. I was only asking because I was curious whether there was some reason for the choice.

mykmelez commented 7 years ago

A comparison between https://travis-ci.org/mozilla/spidernode/builds/194908331, which sets MAKE_FLAGS to -j2, and https://travis-ci.org/mozilla/spidernode/builds/194891510, which sets it to -j4, doesn't suggest an advantage for the latter (although the numbers are noisy).

So the current state of this branch, which sets all Mac builds to -j2, seems like the optimal one, and it can be merged (and perhaps squashed in the process, to remove the extraneous commits that landed and then reverted -j4).

However, Mac builds still sometimes time out, and it looks like this is unrelated to variance in build speed. I'm looking at the log for such a build now and will submit another pull request (or file an issue) once I know more.
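
Because the timings are noisy, one crude way to make a comparison like the one above less subjective is to require that the difference in mean build time exceed the run-to-run spread. The sketch below assumes build durations in seconds collected by hand from the Travis build pages; the function names and the threshold rule are illustrative assumptions, not a proper significance test.

```python
from statistics import mean, stdev


def summarize(durations_s):
    """Return (mean, sample standard deviation) of build durations in seconds."""
    if len(durations_s) < 2:
        raise ValueError("need at least two durations to estimate spread")
    return mean(durations_s), stdev(durations_s)


def faster_beyond_noise(baseline_s, candidate_s):
    """Return True only if the candidate is faster by more than the noise.

    "Noise" here is the larger of the two sample standard deviations,
    which is a deliberately conservative rule of thumb.
    """
    base_mean, base_sd = summarize(baseline_s)
    cand_mean, cand_sd = summarize(candidate_s)
    return (base_mean - cand_mean) > max(base_sd, cand_sd)
```

Passing the -j2 durations as `baseline_s` and the -j4 durations as `candidate_s` returns `False` whenever the two sets overlap as heavily as described here, which is the "no clear advantage" outcome.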

tbsaunde commented 7 years ago

On Tue, Jan 24, 2017 at 02:22:40PM -0800, Myk Melez wrote:

> A comparison between https://travis-ci.org/mozilla/spidernode/builds/194908331, which sets MAKE_FLAGS to -j2, and https://travis-ci.org/mozilla/spidernode/builds/194891510, which sets it to -j4, doesn't suggest an advantage for the latter (although the numbers are noisy).

> So the current state of this branch, which sets all Mac builds to -j2, seems like the optimal one, and it can be merged (and perhaps squashed in the process, to remove the extraneous commits that landed and then reverted -j4).

Yeah, please squash and then feel free to merge.

> However, Mac builds still sometimes time out, and it looks like this is unrelated to variance in build speed. I'm looking at the log for such a build now and will submit another pull request (or file an issue) once I know more.

Thanks

mykmelez commented 7 years ago

> However, Mac builds still sometimes time out, and it looks like this is unrelated to variance in build speed. I'm looking at the log for such a build now and will submit another pull request (or file an issue) once I know more.

False alarm: I couldn't find anything particularly damning after examining the logs more closely (and comparing one that timed out to one that didn't). So now I think this is indeed just variance in build speed, even though the variance is really large.