The subsection "running the benchmark" should be more precise.
At the moment, it says "run the cli script" and then gives `python3 scripts/cli.py --help`. But that only prints the help text.
It should say something like: "Make sure that you do not get any error messages. Now check that you can run a single game. For example, try `python scripts/cli.py -m gpt-3.5-turbo run taboo`. This verifies that your OpenAI key is working. You should see something like ... and find a new directory .... in `games/taboo`..."
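
If it helps, the two checks could also be scripted. The following is only a rough sketch, not something that exists in the repository: the helper names `run_ok` and `first_failure` are made up for illustration, the commands are the two quoted above, and it assumes that the CLI exits with a non-zero status when something goes wrong (for example, a missing or invalid OpenAI key), which I have not verified.

```python
import subprocess
import sys

# The two commands quoted in this issue, run with the current interpreter.
# Assumption: they are started from the repository root.
STEPS = [
    [sys.executable, "scripts/cli.py", "--help"],
    [sys.executable, "scripts/cli.py", "-m", "gpt-3.5-turbo", "run", "taboo"],
]


def run_ok(cmd):
    """Run cmd (a list of arguments); return (succeeded, combined output)."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    # Assumption: a zero exit status means the command ran without errors.
    return proc.returncode == 0, proc.stdout + proc.stderr


def first_failure(steps):
    """Run each command in order; return (cmd, output) for the first one
    that fails, or None if every command succeeds."""
    for cmd in steps:
        ok, output = run_ok(cmd)
        if not ok:
            return cmd, output
    return None
```

Calling `first_failure(STEPS)` from the repository root would then return `None` if both steps succeed, or the failing command together with its output. It does not check for the new directory under `games/taboo`, since the exact name is not given here.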