Currently, if a single benchmark fails, none of the results are uploaded for that entire `compile` command run. With this change, we always upload the results when only individual benchmarks fail. If the compilation fails or "global" venv building fails, we of course still don't upload.
This changes the reporting for `compile_all`: since uploading now almost always happens, there's no reason to report it as a special case. Instead, we note when a run fails because of a failing benchmark and report that some benchmarks didn't work for that run.
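To make the intended behavior concrete, here is a minimal, self-contained sketch of the upload decision; the names (`Report`, `decide_upload`, `compile_ok`, `venv_ok`, `benchmark_results`) are hypothetical and only illustrate the rule described above, not the actual implementation:

```python
from dataclasses import dataclass, field

@dataclass
class Report:
    uploaded: bool
    reason: str = ""
    failed_benchmarks: list = field(default_factory=list)

def decide_upload(compile_ok, venv_ok, benchmark_results):
    """benchmark_results maps benchmark name -> result, or None if it failed."""
    # Compilation or "global" venv failures still abort the upload.
    if not compile_ok:
        return Report(uploaded=False, reason="compilation failed")
    if not venv_ok:
        return Report(uploaded=False, reason="venv build failed")
    # Individual benchmark failures no longer block the upload; the
    # successful results go up and the failures are reported alongside them.
    failed = [name for name, result in benchmark_results.items() if result is None]
    return Report(uploaded=True, failed_benchmarks=failed)
```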
Testing this is tricky, given that we currently don't have any tests that capture and check the output of `compile` or `compile_all`. I'm not sure it's worth the effort to clean that up as part of this specific PR.
I did manually check the following cases, though (see the sketch after the list):
- The benchmark throws an exception
- The benchmark dies (with `os._exit(1)`)
- The dependencies for a benchmark can't be installed
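For reference, a rough sketch of how the first two failure modes were simulated during manual testing; the benchmark bodies here are illustrative stand-ins, not part of the real benchmark suite:

```python
import os

def benchmark_raises():
    # Case 1: the benchmark throws an exception.
    raise RuntimeError("simulated benchmark failure")

def benchmark_dies():
    # Case 2: the benchmark process dies without unwinding the stack.
    os._exit(1)

# Case 3 (uninstallable dependencies) can be simulated by giving a benchmark
# a requirements pin for a version that doesn't exist, e.g. "requests==0.0.0".
```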
Any other important use cases to test?