Closed asaadaldien closed 2 years ago
This is a buildkite issue, (comment from @GMNGeoffrey this should create a failed exist status), see logs: https://buildkite.com/iree/iree-android-benchmark/builds/2460#4286ae2e-7b78-46ba-9df7-48089b6c0767
So I'm very confused by these logs. The error shows up in the step where it's "preparing benchmark files" https://buildkite.com/iree/iree-android-benchmark/builds/2460#4286ae2e-7b78-46ba-9df7-48089b6c0767/45-4718, which is just copying some files around: https://github.com/google/iree/blob/51f8385c68b7e61f06c4a59971b48e72dc097427/build_tools/mako/prepare_benchmark_files.py#L30-L42
It shouldn't even be running iree-translate there. It's not called till later (not even the same script): https://github.com/google/iree/blob/51f8385c68b7e61f06c4a59971b48e72dc097427/build_tools/mako/compile_android_modules.py#L37-L43. This makes me suspicious that the logs are getting mangled somehow, but I'm not sure how. Everything looks like synchronous loops to me
I'm reproing locally.
$ docker run --user=$(id -u):$(id -g) --volume=$PWD:$IREE_DOCKER_WORKDIR --workdir=$IREE_DOCKER_WORKDIR --rm gcr.io/iree-oss/cmake-android@sha256:15d3266ae4865f7642a4ef4d76e5181f0dc3482a7cfba9021b6b55be524208ec python3 build_tools/mako/compile_android_modules.py
spews out the errors but has exit status 0 :confused:
And running that python script within the docker container interactively also has exit code 0. And so does running the iree-translate command it's failing on
build-host/iree/tools/iree-translate mobilenet-v2/iree_input.mlir --iree-mlir-to-vm-bytecode-module --iree-hal-target-backends=dylib-llvm-aot -o mobilenet-v2_Pixel4_cpu.vmfb --iree-llvm-target-triple=aarch64-none-linux-android29 --iree-flow-inline-constants-max-byte-length=2048 --iree-flow-dispatch-formation-enable-operand-fusion -iree-llvm-loop-unrolling=true
I'm guessing something is doing an emitOpError without returning it :-/
When initCpuLaunchConfig fails and returns None, we just proceed. Something needs to propagate the failure
Yup:
When initCpuLaunchConfig fails and returns None, we just proceed. Something needs to propagate the failure
Good find, Thanks Geffory! I will file a separate issue to propagate that error and fix it. However this is a separate issue a bug like that can show up in the future and we shouldn’t end up with incorrect numbers reported in mako. Lets please fix that one (e.g pipeline checks vmfb module file is actually generated)
But it was created! It's 14 MB in fact. I'm not sure there's much the pipeline can do here
Yup:
When initCpuLaunchConfig fails and returns None, we just proceed. Something needs to propagate the failure
Good find, Thanks Geffory! I will file a separate issue to propagate that error and fix it. However this is a separate issue a bug like that can show up in the future and we shouldn’t end up with incorrect numbers reported in mako. Lets please keep that one
But it was created! It's 14 MB in fact. I'm not sure there's much the pipeline can do here
You are absolutely right! This is generated (incorrect module) and we should fail and mako actually was getting the right perf number for that module.
This is stale now. Closing
To repro, you can undo https://github.com/google/iree/pull/5755
iree-translate
will fail which should reported as an (empty) slot in the perf graph.