microsoft / promptbase

All things prompt engineering
MIT License
5.24k stars, 293 forks

Many-shot ICL #53

Open agarwl opened 1 month ago

agarwl commented 1 month ago

It seems that many-shot prompting helps on several of the existing tasks here (BIG-Bench Hard, MATH, GSM8K, GPQA).

Not sure what the process is, but it seems worth mentioning or including here.

https://arxiv.org/abs/2404.11018

It also works for Claude-3 (many-shot jailbreaking paper) and GPT-4o on multimodal tasks (many-shot ICL in multimodal tasks).

Harsha-Nori commented 1 month ago

Hey Rishabh!! Been a long time since we chatted about NAMs/GAMs :)

Wow, this is a really cool paper -- thanks for sharing! As corroborating evidence, in Medprompt we ran ablations up to k=20 few-shot examples and found continued performance improvements (e.g. 90.2 -> 90.6 on MedQA when going from 5 shots to 20 shots), but we wanted to keep the inference budget reasonable for the "standard" algorithm configuration. We didn't ablate beyond that, so it's really cool to see it studied so rigorously.
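For anyone following along, a rough illustration of what varying the shot count k means when assembling a prompt. This is only a minimal sketch, not the Medprompt implementation: the `build_k_shot_prompt` helper, the `Q:`/`A:` template, and the toy arithmetic pool are all hypothetical, and Medprompt itself selects exemplars dynamically rather than taking the first k.

```python
from typing import List, Tuple

# Hypothetical exemplar type: (question, answer) pairs.
Example = Tuple[str, str]


def build_k_shot_prompt(examples: List[Example], query: str, k: int) -> str:
    """Assemble a prompt from the first k exemplars followed by the query.

    The "Q:/A:" template is an assumption for illustration only.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    # Take at most k exemplars from the pool (fewer if the pool is smaller).
    blocks = [f"Q: {q}\nA: {a}" for q, a in examples[:k]]
    # The query goes last, with the answer left open for the model.
    blocks.append(f"Q: {query}\nA:")
    return "\n\n".join(blocks)


# Toy exemplar pool (hypothetical data, not from any benchmark).
pool = [(f"What is {i} + {i}?", str(2 * i)) for i in range(1, 21)]

# Sweep the shot count the way a k-shot ablation would.
for k in (5, 20):
    prompt = build_k_shot_prompt(pool, "What is 21 + 21?", k)
    print(f"k={k}: {len(prompt)} characters")
```

Prompt length grows roughly linearly with k, which is the inference-budget trade-off mentioned above.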


https://arxiv.org/pdf/2311.16452

Happy to add a link to your paper in the readme when I'm back at my desk, and excited to read it more thoroughly too. Would be fun to catch up sometime!