LDMX-Software / docker

Docker build context for developing and running ldmx-sw.
https://ldmx-software.github.io/docker/
GNU General Public License v3.0

"context canceled" errors when loading from cache #69

Closed tomeichlersmith closed 1 year ago

tomeichlersmith commented 1 year ago

Describe the bug: The CI fails while loading from the cache if the other job finishes before the cache load is complete. It reports a cryptic "context canceled" error. After some googling, I suspect that because the two runners share the same filesystem, one of them "interferes" with the other's cache when it completes. I had hoped that giving each runner its own cache directory would work in all situations, but it fails in the specific case where the build that is loading from cache has not completed before the other build finishes.
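For readers unfamiliar with the setup, the "different directories" arrangement can be sketched roughly as follows. This is only an illustration, not the repository's actual workflow: the runner names, the `/tmp/buildx-cache` root, and the helper `local_cache_args` are all hypothetical, and it assumes the builds use the local cache backend of `docker buildx build`. As described above, this kind of split alone did not prevent the error on the shared machine.

```python
from pathlib import PurePosixPath


def local_cache_args(runner_name: str, cache_root: str = "/tmp/buildx-cache") -> list[str]:
    """Return buildx cache flags that point one runner at its own directory.

    Both the directory layout and the runner names are assumptions made
    for illustration; they do not come from the issue.
    """
    cache_dir = PurePosixPath(cache_root) / runner_name
    return [
        "--cache-from", f"type=local,src={cache_dir}",
        "--cache-to", f"type=local,dest={cache_dir},mode=max",
    ]


if __name__ == "__main__":
    # Each runner gets a distinct cache path under the same root.
    for runner in ("runner-a", "runner-b"):
        print(runner, " ".join(local_cache_args(runner)))
```

The returned list would be appended to a `docker buildx build` invocation, so the two runners never read or write the same cache path even though they still share one host.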


tomeichlersmith commented 1 year ago

Resubmitting the job works, since nothing is actually wrong with the Dockerfile or the cache, but having to do that is annoying and I'd like to avoid it if possible.
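One way to avoid resubmitting by hand would be to retry the build automatically when it fails with this specific message. The sketch below is a stopgap under stated assumptions, not the fix adopted in this issue: `build_with_retry` and `run_once` are hypothetical helpers, and matching on the text "context canceled" assumes the error appears verbatim in the build output. It does nothing about the underlying interference between runners.

```python
import subprocess
from typing import Callable, Sequence, Tuple

TRANSIENT_MARKER = "context canceled"  # error text quoted in this issue


def run_once(cmd: Sequence[str]) -> Tuple[int, str]:
    """Run a command and return (exit code, combined stdout and stderr)."""
    proc = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    return proc.returncode, proc.stdout


def build_with_retry(
    cmd: Sequence[str],
    max_attempts: int = 3,
    run: Callable[[Sequence[str]], Tuple[int, str]] = run_once,
) -> int:
    """Re-run cmd only while it fails with the transient marker.

    Returns the exit code of the last attempt. Any other failure is
    returned immediately so that real build errors are not masked.
    """
    code = 1
    for attempt in range(1, max_attempts + 1):
        code, output = run(cmd)
        if code == 0 or TRANSIENT_MARKER not in output:
            return code
        print(f"attempt {attempt}/{max_attempts} hit '{TRANSIENT_MARKER}'")
    return code
```

A CI step could call, for example, `build_with_retry(["docker", "buildx", "build", "."])` and exit with the returned code. The `run` parameter exists so the retry logic can be exercised without Docker installed.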

tomeichlersmith commented 1 year ago

I reconfigured the machine to run two separate VMs, one for each runner. This means we will have two copies of the layer cache that advance independently, but the runners will no longer interfere with each other during the build process.

tomeichlersmith commented 1 year ago

We've done a few parallel re-builds-from-cache with this separate VM configuration and they have completed without error. For this reason, I'm going to close this and hope that it does not rear its ugly head again.