Description

This PR refactors the pipeline for the chatbot example. In short, it makes it easier to (1) run zeno-build in parallel on multiple machines and (2) generate reports from previously finished runs.

It also includes a number of smaller changes, such as a locking mechanism that prevents the same experiment from being run twice in parallel, and automatic loading of the prediction files at the end of the training run.
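
For reviewers unfamiliar with the idea, here is a minimal sketch of how a lock of this kind can work. It is an illustration only, not the code in this PR: the name `experiment_lock` and the lock-file layout are hypothetical, and it assumes all workers share a filesystem where exclusive file creation is atomic.

```python
import os
from contextlib import contextmanager


@contextmanager
def experiment_lock(lock_path):
    """Try to claim an experiment by creating a lock file exclusively.

    Yields True if this process created the lock file (and so owns the
    experiment), or False if the file already existed. The lock file is
    removed on exit only by the process that created it.
    Hypothetical helper for illustration, not part of zeno-build.
    """
    try:
        # O_CREAT | O_EXCL fails with FileExistsError if the file exists,
        # so at most one process can succeed for a given path.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        yield False
        return
    try:
        # Record the owner's PID to help with debugging stale locks.
        with os.fdopen(fd, "w") as f:
            f.write(str(os.getpid()))
        yield True
    finally:
        os.remove(lock_path)
```

A worker would wrap each experiment in `with experiment_lock(path) as acquired:` and skip the experiment when `acquired` is false. One limitation of this sketch is that a worker that crashes hard can leave a stale lock file behind, which would then need to be cleaned up manually.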