This repository contains code for the paper
Active Example Selection for In-Context Learning.
Yiming Zhang, Shi Feng and Chenhao Tan
Empirical Methods in Natural Language Processing (EMNLP), 2022.
conda create -n active-example-selection python=3.9
.pip install .
(add the -e
flag if you plan to make changes).We include code for the following 5 methods for example selection that we compare with in the paper:
This command runs the 4-shot random baseline on GPT-2 medium with calibration:
python src/prompting/main.py prompting_configs/baseline-gpt2.yaml
The following command runs the 4-shot random baseline on GPT-3 ada, and make
sure the field api_key_file
points to a file with your
OpenAI API key:
python src/prompting/main.py prompting_configs/baseline-gpt3.yaml
Example selection policies are trained on random trajectories sampled offline. To generate these random trajectories (e.g., on AGNews), run
python src/rl/main.py rl_configs/random-agent.yaml
Then, to train an AGNews example selection policy, run
python src/rl/main.py rl_configs/agnews-same-task.yaml
After generating random trajectories similarly for Amazon, SST-2 and TREC, run the following to train an example selection policy on these three datasets and transferred to AGNews:
python src/rl/main.py rl_configs/agnews-transfer.yaml
@inproceedings{zhang-etal-2022-active,
title = "Active Example Selection for In-Context Learning",
author = "Zhang, Yiming and Feng, Shi and Tan, Chenhao",
booktitle = "Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing",
month = dec,
year = "2022",
address = "Abu Dhabi, United Arab Emirates",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2022.emnlp-main.622",
pages = "9134--9148",
abstract = "With a handful of demonstration examples, large-scale language models demonstrate strong capability to perform various tasks by in-context learning from these examples, without any fine-tuning. We demonstrate that in-context learning performance can be highly unstable across samples of examples, indicating the idiosyncrasies of how language models acquire information. We formulate example selection for in-context learning as a sequential decision problem, and propose a reinforcement learning algorithm for identifying generalizable policies to select demonstration examples. For GPT-2, our learned policies demonstrate strong abilities of generalizing to unseen tasks in training, with a 5.8{\%} improvement on average. Examples selected from our learned policies can even achieve a small improvement on GPT-3 Ada. However, the improvement diminishes on larger GPT-3 models, suggesting emerging capabilities of large language models.",
}