Breaking down Helpfulness

Authors

This repository contains the data and code used in our work on Breaking down Helpfulness for LLMs:

data contains the splits of the databricks-dolly-15k dataset used in the study, together with our own IDs*.
annotation contains the annotated data (CSV) used for the study, mapped through our IDs, and the annotation guidelines.
code contains the R functions used for the statistical analysis and plots, the code for LLM inference, the code to calculate inter-annotator agreement, as well as some data wrangling scripts.
figures contains the R-generated PDFs of the plots, plus PNG versions of the figures that were further edited.

*The IDs mentioned in the paper are not the IDs used in this repository; they refer to the entry number in the original dataset.