[ ] create a single validation dataset to use across all experiments
[ ] output validation metrics to a CSV file to make it easier to draw new graphs later
[ ] write a script to run multiple experiments automatically and compare their metrics (e.g. one plot showing all the validation errors, one plot showing all the NILM metrics over time).
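
A minimal sketch of how the CSV logging and cross-experiment comparison items above might fit together, using only the Python standard library. The column names (`experiment`, `iteration`, `validation_error`, `f1_score`) and the function names `append_metrics` and `load_metric` are hypothetical, chosen for illustration; the actual metrics and file layout are not decided in this list.

```python
import csv
import os
import tempfile

# Hypothetical column names: the real set of validation / NILM metrics
# is an assumption and should be replaced with whatever is actually logged.
FIELDNAMES = ["experiment", "iteration", "validation_error", "f1_score"]


def append_metrics(csv_path, row):
    """Append one row of metrics, writing the header first if the file is new or empty."""
    is_new = not os.path.exists(csv_path) or os.path.getsize(csv_path) == 0
    with open(csv_path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
        if is_new:
            writer.writeheader()
        writer.writerow(row)


def load_metric(csv_paths, metric):
    """Collect one metric from several CSV files.

    Returns {experiment_name: [(iteration, value), ...]} with each list
    sorted by iteration, ready to be drawn as one line per experiment.
    """
    curves = {}
    for path in csv_paths:
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                point = (int(row["iteration"]), float(row[metric]))
                curves.setdefault(row["experiment"], []).append(point)
    for points in curves.values():
        points.sort()
    return curves


if __name__ == "__main__":
    # Tiny demonstration with made-up numbers (not real results).
    tmp_dir = tempfile.mkdtemp()
    paths = []
    for name, errors in [("exp_a", [0.9, 0.5]), ("exp_b", [0.8, 0.6])]:
        path = os.path.join(tmp_dir, name + ".csv")
        paths.append(path)
        for i, err in enumerate(errors):
            append_metrics(path, {"experiment": name, "iteration": i,
                                  "validation_error": err, "f1_score": 0.0})
    print(load_metric(paths, "validation_error"))
```

Each entry in the returned dictionary corresponds to one line on a comparison plot, so the same loader could feed both the validation-error plot and the NILM-metrics plot by changing the `metric` argument.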