rbroc / echo

A Scalable and Explainable Approach to Discriminating Between Human and Artificially Generated Text
https://cc.au.dk/en/clai/current-projects/a-scalable-and-explainable-approach-to-discriminating-between-human-and-artificially-generated-text

METRICS EXTRACTION + PERFORMANCE: Created one pipeline for standardising the extraction of both AI and human metrics + performance update. #57

Closed MinaAlmasi closed 5 months ago

MinaAlmasi commented 5 months ago

Extracting Metrics

This PR develops the extraction of metrics. See below for an overview of the changes.

Creating a shared pipeline with standard formatting

A standard metrics pipeline for all data!

  1. Metrics pipeline has been updated to extract metrics for both human and AI datasets
  2. Rather than alphabetising all columns, the first columns in all datasets are now `id` and `model` for a better overview. The rest of the columns (text metrics) are alphabetised (see the sketch after this list).
  3. As a result of points 1 and 2, previously created human metrics files have been updated so that files across human and AI data share the same format, the only addition being the `model` column.
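For illustration, a minimal sketch of the column ordering described in point 2 (the helper name `order_columns` is hypothetical, not the PR's actual function):

```python
import pandas as pd

def order_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Place id and model first; alphabetise the remaining (metric) columns."""
    first = [col for col in ["id", "model"] if col in df.columns]
    metrics = sorted(col for col in df.columns if col not in first)
    return df[first + metrics]
```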

Performance updates

Metrics extraction for all data takes around 30 minutes on a 64-core machine using all cores but one!

  1. Rather than relying on the wrapper function `td.extract_metrics` from TextDescriptives, a custom pipeline using `nlp.pipe()` (as also presented in the TextDescriptives Quickstart) was created in utils/get_metrics.py to enable multiprocessing (see the sketch after this list).
  2. The functions in utils/get_metrics.py that relied on `td.extract_metrics` have not yet been phased out, as they are still used elsewhere in the codebase (for prompt_select and analysis), but they may be removed in the future.
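A minimal sketch of this approach, following the pattern from the TextDescriptives Quickstart (the spaCy model name and batch size here are assumptions; the actual implementation lives in utils/get_metrics.py):

```python
import os

import spacy
import textdescriptives as td

# Load a spaCy model and add all TextDescriptives components
# (en_core_web_md is assumed here; the PR's model choice may differ).
nlp = spacy.load("en_core_web_md")
nlp.add_pipe("textdescriptives/all")

texts = ["first document ...", "second document ..."]

# nlp.pipe() streams texts through the pipeline; n_process spawns one
# worker per core, leaving one core free as described above.
docs = nlp.pipe(texts, n_process=os.cpu_count() - 1, batch_size=50)

# Collect the extracted metrics for all docs into a single dataframe.
metrics_df = td.extract_df(docs)
```

Compared with `td.extract_metrics`, which processes texts in a single process, passing `n_process` to `nlp.pipe()` parallelises the work across cores, which is what makes the ~30-minute full extraction feasible.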

Datasets with metrics extracted (NB. DailyDialog)

  1. Metrics have been extracted for all datasets (AI and human), but DailyDialog needs another pass; see #56.

Docs

  1. The README in src/metrics/README.md has been updated with instructions on how to run the code (both via a bash script and manually with custom arguments).