Currently, if a single benchmark fails, none of the results are uploaded for that entire `compile` command run. With this change, we always upload the results when only individual benchmarks fail. If the compilation fails or "global" venv building fails, we of course still don't upload.
This changes the reporting for `compile_all`: since uploading now almost always happens, there's no reason to report it as a special case. Instead, we note when a run fails because of a failing benchmark and report that some benchmarks didn't work for that run.
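To make the intended behavior concrete, here is a minimal, self-contained sketch of the upload decision; the names (`Report`, `decide_upload`, `compile_ok`, `venv_ok`, `benchmark_results`) are hypothetical and only illustrate the rule described above, not the actual implementation:

```python
from dataclasses import dataclass, field

@dataclass
class Report:
    uploaded: bool
    reason: str = ""
    failed_benchmarks: list = field(default_factory=list)

def decide_upload(compile_ok, venv_ok, benchmark_results):
    """benchmark_results maps benchmark name -> result, or None if it failed."""
    # Compilation or "global" venv failures still abort the upload.
    if not compile_ok:
        return Report(uploaded=False, reason="compilation failed")
    if not venv_ok:
        return Report(uploaded=False, reason="venv build failed")
    # Individual benchmark failures no longer block the upload; the
    # successful results go up and the failures are reported alongside them.
    failed = [name for name, result in benchmark_results.items() if result is None]
    return Report(uploaded=True, failed_benchmarks=failed)
```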
Testing this is tricky, given that we currently don't have any tests that capture and check the output of `compile` or `compile_all`. I'm not sure it's worth the effort to clean that up as part of this specific PR.
I did manually check the following cases, though (see the sketch after the list):
- The benchmark throws an exception
- The benchmark dies (with `os._exit(1)`)
- The dependencies for a benchmark can't be installed
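For reference, a rough sketch of how the first two failure modes were simulated during manual testing; the benchmark bodies here are illustrative stand-ins, not part of the real benchmark suite:

```python
import os

def benchmark_raises():
    # Case 1: the benchmark throws an exception.
    raise RuntimeError("simulated benchmark failure")

def benchmark_dies():
    # Case 2: the benchmark process dies without unwinding the stack.
    os._exit(1)

# Case 3 (uninstallable dependencies) can be simulated by giving a benchmark
# a requirements pin for a version that doesn't exist, e.g. "requests==0.0.0".
```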
Any other important use cases to test?