This repository contains the data and code used in our work on Breaking down Helpfulness for LLMs:
data
contains different splits of the dolly-databricks-15k dataset used in the study, together with our own IDs*.

annotation
contains the annotated data (CSV) used for the study, mapped through our IDs, and the annotation guidelines.

code
contains the R functions used for the statistical analysis and plots, the code for LLM inference, the code to calculate inter-annotator agreement, as well as some data wrangling scripts.

figures
contains the R-generated PDFs with the plots, and PNG files where those figures were further edited.

*IDs mentioned in the paper do not correspond to these but to the entry number in the original dataset.