llvm / torch-mlir

The Torch-MLIR project aims to provide first-class support from the PyTorch ecosystem to the MLIR ecosystem.

[brainstorm] What could cause ONE e2e test to fail with packages but not locally #1458

Open silvasean opened 2 years ago

silvasean commented 2 years ago

In IREE-Torch, one test recently started segfaulting. The segfault is on a memory access at address 0, right before the code falls off into ud2 instructions in printType. Any ideas about build config, etc. that could explain the failure?

https://github.com/iree-org/iree-torch/pull/51

cc @ashay @powderluv

ashay commented 2 years ago

This is just a shot in the dark.

Last night, I learnt the hard way that the VMs used by CI may have files from prior CI runs (or so it seems). Could this be a case of a non-hermetic build system like CMake being affected by prior builds?

silvasean commented 2 years ago

Thanks, Ashay. In this case, I am able to reproduce locally with the installed packages, so files from prior runs are unlikely to be an issue.