AlexsLemonade / OpenPBTA-analysis

The analysis repository for the Open Pediatric Brain Tumor Atlas Project

CI can fail due to docker time outs #1225

Open sjspielman opened 2 years ago

sjspielman commented 2 years ago

Recent builds are failing on Circle CI because of timeouts:

Try out a larger resource class or running tests in parallel to speed up job execution. Upgrade your pricing plan to take advantage of longer max job runtimes.

context deadline exceeded

It looks like we are exceeding the three-hour limit, and Circle CI is not reporting any performance degradation right now that would explain the longer run time. As more analyses are updated and publication-ready figures are created, my guess is that we will keep hitting this time limit.

From Circle CI docs:

Note: Jobs have a maximum runtime of 1 (Free), 3 (Performance), or 5 (Scale) hours depending on pricing plan. If your jobs are timing out, consider a larger resource class and/or parallelism. Additionally, you can upgrade your pricing plan or run some of your jobs concurrently using workflows.

sjspielman commented 2 years ago

Tagging @jaclyn-taroni @jashapiro

jaclyn-taroni commented 2 years ago

We're not upgrading to Scale, so that option's off the table.

In my opinion, the problem is that we're not seeing a benefit from Docker layer caching right now, so 2 of the 3 hours are eaten up by building the image. That might be because CI runs less frequently now than it did earlier in the project, when PRs were more frequent. It might also be because the correct way to configure layer caching has changed over the life of the project.
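
To make the time budget in this comment concrete, here is a rough sketch of the arithmetic in Python. The plan limits are the ones quoted from the Circle CI docs earlier in the thread, and the 2-hour figure is the build-time estimate above. The function name and the 0.25-hour cached build time are illustrative assumptions, not measurements from this project.

```python
# Maximum job runtime in hours per plan, as quoted from the Circle CI docs
# earlier in this thread.
MAX_RUNTIME_HOURS = {"free": 1, "performance": 3, "scale": 5}


def remaining_hours(plan: str, image_build_hours: float) -> float:
    """Return the hours left for the analysis steps after the image build.

    `remaining_hours` is a hypothetical helper for illustration only.
    """
    return MAX_RUNTIME_HOURS[plan] - image_build_hours


# Uncached build: roughly 2 hours (the estimate given in this comment).
print(remaining_hours("performance", 2.0))   # 1.0

# Cached build: 0.25 hours is an assumed value, not a measured one.
print(remaining_hours("performance", 0.25))  # 2.75
```

Under these assumptions, an uncached build leaves only about an hour of the Performance limit for the analyses themselves, which would explain why the jobs time out when the layer cache is not hit.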

So some options (and these are not mutually exclusive):

jaclyn-taroni commented 2 years ago

I'm going to try altering #1224 because why not

jaclyn-taroni commented 2 years ago

Noting here that for the jobs that are rerunning after timing out, Docker layer caching appears to be working as expected.