Open agarwl opened 1 month ago
Hey Rishabh!! Been a long time since we chatted about NAMs/GAMs :)
Wow, this is a really cool paper -- thanks for sharing! As corroborating evidence, in medprompt we ran ablations up to k=20 few-shot examples and found continued performance improvements (e.g. 90.2 -> 90.6 on MedQA when going from 5 shots to 20 shots), but we wanted to keep the inference budget reasonable for the "standard" algorithm configuration. We didn't ablate beyond that, so it's really cool to see it studied so rigorously.
https://arxiv.org/pdf/2311.16452
Happy to add a link to your paper in the readme when I'm back at my desk, and excited to read it more thoroughly too. Would be fun to catch up sometime!
Seems like many-shot prompting helps on several of the existing tasks here (Big-Bench Hard, MATH, GSM8K, GPQA).
Not sure what the process is, but it seems worth mentioning or including here.
https://arxiv.org/abs/2404.11018
Also, it works for Claude-3 (see the many-shot jailbreaking paper) and for gpt-4o on multimodal tasks (see the paper on many-shot ICL in multimodal tasks).