JBGruber / opinion-wg2


Annotate full papers with generative large language models #20

Open JBGruber opened 1 month ago

JBGruber commented 1 month ago

This approach yielded excellent results during our coding in Limassol, where we did it with ChatGPT. Essentially, we ask a model to answer the same questions we ask human coders (outlined here). ChatGPT can handle PDFs in the web interface (maybe also via the API). Ideally, we would try this with an open source model like llama3 (open-webui also supports PDF uploads, though I'm not sure how it handles them). But given the length of the papers and the fact that we might not be too concerned with reproducibility in this case, GPT-4o might make the most sense.

JBGruber commented 2 weeks ago

Salamanca task description

I think we essentially just need a RAG system. I've started a Quarto notebook here: https://github.com/JBGruber/opinion-wg2/blob/gllm-annotation/paper-annotation-gllm/llm-annotation.qmd
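To make the RAG idea concrete, here is a minimal sketch of the retrieval and prompt-building steps only. It is not the code from the notebook: the function names are made up for illustration, and plain word overlap stands in for the embedding similarity a real setup would use. The resulting prompt would then be sent to whichever model we pick.

```python
import re
from collections import Counter


def chunk_text(text, size=50, overlap=10):
    """Split a paper's text into overlapping chunks of `size` words."""
    words = text.split()
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def tokenize(text):
    """Lowercase the text and keep only alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(question, chunk):
    """Count shared word occurrences (assumption: a crude stand-in for embeddings)."""
    q, c = Counter(tokenize(question)), Counter(tokenize(chunk))
    return sum((q & c).values())


def retrieve(question, chunks, k=2):
    """Return the k chunks that share the most words with the question."""
    return sorted(chunks, key=lambda ch: score(question, ch), reverse=True)[:k]


def build_prompt(question, chunks, k=2):
    """Assemble a prompt from the retrieved excerpts and one codebook question."""
    context = "\n---\n".join(retrieve(question, chunks, k))
    return (
        "Answer using only the excerpts below.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )


# Made-up example chunks, not taken from the validation data
example_chunks = [
    "The study uses survey data from Germany.",
    "We thank our reviewers.",
    "Opinion was measured with a five point survey scale.",
]
example_question = "Which survey data does the study use?"
example_prompt = build_prompt(example_question, example_chunks, k=1)
```

Swapping `score` for a cosine similarity over real embeddings would leave the rest of the sketch unchanged.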

The notebook already contains code that downloads the validation data. I think it's unlikely we can validate this automatically. It probably makes more sense to check manually whether each answer is correct and note it down. (Obviously, we need to set a seed so the answers can be reproduced.)
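For the manual check, one option is a simple log with one row per paper and variable, which can then be summarised per variable. The sketch below assumes a hypothetical CSV layout (`paper_id`, `variable`, `correct`); the rows are made up and the variable names are placeholders, not the ones from the codebook.

```python
import csv
import io
from collections import defaultdict


def accuracy_by_variable(rows):
    """Share of answers marked correct, per variable.

    rows: iterable of dicts with keys 'variable' and 'correct' ('1' or '0').
    """
    totals = defaultdict(lambda: [0, 0])  # variable -> [n_correct, n_checked]
    for row in rows:
        totals[row["variable"]][0] += int(row["correct"])
        totals[row["variable"]][1] += 1
    return {var: n_correct / n for var, (n_correct, n) in totals.items()}


# Made-up example log; in practice this would be a file filled in by hand
log = io.StringIO(
    "paper_id,variable,correct\n"
    "p1,Q1,1\n"
    "p2,Q1,0\n"
    "p1,Q2,1\n"
)
result = accuracy_by_variable(csv.DictReader(log))
```

Keeping the log in long format like this makes it easy to add a column later, for example the model or seed that produced each answer.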

You can find the Codebook here: https://docs.google.com/document/d/185Q1IuJ0ebIFEb1BepMxkbzt23XrYX_QOJCUGqrKdbw/edit#heading=h.hkgzwqhr5jie

Or in the notebook that set up the task (it also contains the variable names used in the annotated data): https://github.com/JBGruber/opinion-wg2/blob/gllm-annotation/paper-annotation/3._wg2_full_paper_annotation.qmd

You should work in the "gllm-annotation" GitHub branch for this. You don't have to use my approach or the notebook, though (although I much prefer Quarto over Jupyter, and you can use it with Python). And if you have a better idea, let me know.

I would prefer that this is done via an open model, like llama3. But if this does not work well enough, I think we should consider OpenAI. Here are some thoughts:

atomashevic commented 2 weeks ago
brunoyun commented 2 weeks ago

The question text for Q31_mult is the same as for Q32. I believe this is a typo.

In https://github.com/JBGruber/opinion-wg2/blob/gllm-annotation/paper-annotation/3._wg2_full_paper_annotation.qmd
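A quick way to catch this kind of duplication could look like the sketch below. It assumes the codebook has been loaded into a dict mapping variable names to question text; the entries shown are placeholders, not the actual codebook wording.

```python
from collections import defaultdict


def find_duplicate_questions(codebook):
    """Group variable names whose question text is identical after normalisation."""
    groups = defaultdict(list)
    for variable, question in codebook.items():
        # Ignore differences in case and whitespace
        key = " ".join(question.lower().split())
        groups[key].append(variable)
    return [names for names in groups.values() if len(names) > 1]


# Placeholder codebook for illustration only
codebook = {
    "A1": "What method is used?",
    "A2": "what method  is used?",
    "A3": "What data is used?",
}
duplicates = find_duplicate_questions(codebook)
```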