A forecasting benchmark for LLMs. Leaderboards and datasets available at https://www.forecastbench.org.
git clone --recurse-submodules <repo-url>.git
cd llm-benchmark
cp variables.example.mk variables.mk
and set the values accordingly
make setup-python-env
source .venv/bin/activate
To run a cloud function locally:
cd directory/containing/cloud/function
eval $(cat path/to/variables.mk | xargs) python main.py
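The eval line above works because xargs joins the lines of variables.mk into a single space-separated string, which the shell then treats as per-command environment assignments placed in front of python main.py. The following is a minimal sketch of the same idea, assuming variables.mk contains only plain KEY=value lines (no spaces around the equals sign, no quotes, no export keywords). The name run_with_vars is a hypothetical helper for illustration and is not part of the repository.

```shell
# Hypothetical helper: run a command with the KEY=value pairs from a
# variables file placed in that command's environment only.
run_with_vars() {
    vars_file="$1"
    shift
    # xargs (with no command) echoes the file's lines joined by spaces,
    # e.g. "FOO=bar BAZ=qux". eval re-parses that string followed by the
    # literal text "$@", so the pairs become assignments prefixed to the
    # command and do not persist in the calling shell.
    eval "$(cat "$vars_file" | xargs)" '"$@"'
}

# Usage, mirroring the command above:
#   run_with_vars path/to/variables.mk python main.py
```

Since the variables are scoped to the single command, they do not leak into the rest of the shell session. If the file ever contains values with spaces or quotes, this approach would likely need a more careful parser.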
Before creating a pull request:
make lint
and fix any errors and warnings