reasoning-machines / pal

PaL: Program-Aided Language Models (ICML 2023)
https://reasonwithpal.com
Apache License 2.0

majority@k flags to gsm_eval #5

Closed: urialon closed this 1 year ago

urialon commented 1 year ago

This PR adds self-consistency (majority@k) flags to gsm_eval.

On GSM, majority@40 improves PAL from 71.4 to 80.4. This puts PAL 2 points ahead of Minerva 540B when both models use majority@40.
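
For anyone unfamiliar with majority@k, here is a minimal sketch of the voting step. It is an illustration only, not the code in this PR: `majority_vote`, `majority_at_k`, and the `sample_answer` callable are hypothetical names, and it assumes each sampled program has already been executed to a final answer (or `None` if execution failed).

```python
from collections import Counter
from typing import Callable, Hashable, Optional, Sequence


def majority_vote(answers: Sequence[Optional[Hashable]]) -> Optional[Hashable]:
    """Return the most frequent non-None answer, or None if every sample failed.

    Ties go to the answer that appeared first among the samples.
    """
    valid = [a for a in answers if a is not None]
    if not valid:
        return None
    return Counter(valid).most_common(1)[0][0]


def majority_at_k(sample_answer: Callable[[], Optional[Hashable]], k: int) -> Optional[Hashable]:
    """Draw k answers from a (hypothetical) sampler and take the majority vote.

    `sample_answer` stands in for "sample one program from the model, run it,
    and return its result"; it is an assumption, not part of this repository.
    """
    return majority_vote([sample_answer() for _ in range(k)])


if __name__ == "__main__":
    # Five pretend samples for one problem: three agree, one differs, one failed.
    print(majority_vote([8.0, 7.0, 8.0, None, 8.0]))
```

With k=1 this reduces to taking a single sample; larger k trades extra generations for robustness to individual wrong programs.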

madaan commented 1 year ago

LGTM!