TraceMachina / nativelink

Bazel RBE with CAS server implementation in Rust. The free and open source cache and remote execution service, prioritizing stability and speed for the people that need it.
https://docs.nativelink.com
Apache License 2.0
236 stars 46 forks

LRE flaky in CI #986

Closed by allada 1 week ago

allada commented 3 weeks ago

A lot of flakes are happening because of LRE. It appears that the way it caches or pulls images is causing issues.

example: https://github.com/TraceMachina/nativelink/actions/runs/9470928434/job/26092947739?pr=924

SchahinRohani commented 3 weeks ago

I have noticed this as well.

Right now the local Nix store is mounted into the local dev cluster, which works only on Linux. Ideally the Nix store would use a standalone cache for development. That would improve not only the DX but also the duration of GitHub Actions runs, and it would work independently of the OS.

Still, it is strange that it does not work on the Linux LRE. I will look into this as well.