p-lambda / incontext-learning

Experiments and code to generate the GINC small-scale in-context learning dataset from "An Explanation for In-context Learning as Implicit Bayesian Inference"
https://arxiv.org/abs/2111.02080

Output randomization? #1

Open hzyjerry opened 1 year ago

hzyjerry commented 1 year ago

Hi, thanks for this wonderful effort.

I'm wondering whether there are any results reproducing the output randomization findings (Min et al.) using GINC. Have you tried it, and if so, could you provide the code or instructions for doing so?

Thank you.

sangmichaelxie commented 10 months ago

Sorry for the late response! We haven't run that before, but it would be interesting. To do it, you just have to modify the list here: https://github.com/p-lambda/incontext-learning/blob/fdf346bc233fd399f2a97fdf9cc44eccc08c508a/generate_data.py#L286

The prompt list is a concatenation of n_example_per_prompt + 1 examples, each made up of prompt_length tokens followed by a delimiter. You just have to replace the element at index (example_index + 1) * (prompt_length + 1) - 2 for every example (with example_index starting at 0, this is the token right before each delimiter), e.g. permute those elements and add them back in. Don't replace the last example's label though, since that's the test example.
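
As a rough sketch of the permutation described above (not code from the repository): it assumes the prompt is a flat Python list laid out exactly as described, with example_index starting at 0. The names `randomize_labels`, `n_examples`, and `seed` are hypothetical; `n_examples` stands in for n_example_per_prompt.

```python
import random


def randomize_labels(prompt, prompt_length, n_examples, seed=0):
    """Return a copy of `prompt` with the in-context labels permuted.

    Assumed layout: (n_examples + 1) blocks, each made of `prompt_length`
    tokens followed by one delimiter. The final block is the test example
    and is left untouched.
    """
    block = prompt_length + 1  # tokens per example plus its delimiter
    # Position of the label (token right before the delimiter) in each
    # in-context example; the test example (index n_examples) is skipped.
    label_idx = [(i + 1) * block - 2 for i in range(n_examples)]

    labels = [prompt[j] for j in label_idx]
    random.Random(seed).shuffle(labels)  # permute labels across examples

    out = list(prompt)
    for j, label in zip(label_idx, labels):
        out[j] = label
    return out


# Toy prompt: 3 in-context examples plus 1 test example, prompt_length = 3.
toy_prompt = []
for i in range(4):
    toy_prompt += [f"x{i}", f"m{i}", f"y{i}", "/"]
print(randomize_labels(toy_prompt, prompt_length=3, n_examples=3))
```

A plain shuffle can leave some labels in their original position. If every label must change, a derangement or sampling replacement labels from the vocabulary would be needed instead.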