Web interface for information-asymmetric debates.
This code was used to run the experiments in the following paper:
Debate Helps Supervise Unreliable Experts. Julian Michael, Salsabila Mahdi, David Rein,* Jackson Petty, Julien Dirani, Vishakh Padmakumar, and Samuel R. Bowman.
See the `2023-nyu-experiments` branch for full details and data.
For the analytics server, you'll also need Python 3. I recommend using a virtual environment. To get set up, run:

```bash
python -m venv env
source env/bin/activate
pip install -r requirements.txt
```
For development, run

```bash
mill -j 0 debate.dev.serve
```

in the base directory of this repository. (`-j 0` will parallelize and speed up compilation.)
You can also pass in flags at runtime:

- `--port`: the port to host the server at (default: 8080).
- `--analytics-port`: the port the analytics server will be hosted at (default: 8081).
- `--save`: the directory to save the server state at (default: `save`).
- `--help`: print command info instead of running the server.

To run HTTPS, there is also an `--ssl` flag which has the server look for a `keystore.jks` and `password` under `debate/resources`, but I normally run behind a proxy which takes care of this.
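For example, here is a hypothetical invocation with non-default ports and a custom save directory (this assumes trailing arguments are forwarded to the server, as with standard Mill run commands; the values are illustrative):

```bash
# Hypothetical values: serve on 8000, analytics on 8001, state under scratch/save-server
mill -j 0 debate.dev.serve --port 8000 --analytics-port 8001 --save scratch/save-server
```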
The difference between development and production is that production mode uses fully-optimized JS
compilation, which takes longer but produces a much smaller and faster-running JS file.
To run unit tests, use `mill debate.jvm.test`. (JS tests aren't working at the moment; see #76.)
To run debates with GPT-4 as a participant, you will need to have the packages in `model-debate/requirements.txt` installed. You will also need a `SECRETS` file in the `model-debate` directory with the following format (without braces):

```
NYU_ORG={our NYU OpenAI API organization ID}
API_KEY={your OpenAI API key}
```
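As a minimal setup sketch (placeholder values only; substitute your own organization ID and API key):

```bash
# Install the model-debate dependencies (ideally in the same virtual environment as above)
pip install -r model-debate/requirements.txt

# Create the SECRETS file with placeholder credentials
cat > model-debate/SECRETS <<EOF
NYU_ORG=org-xxxxxxxxxxxxxxxx
API_KEY=sk-xxxxxxxxxxxxxxxx
EOF
```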
To start the server (which formats and processes POST requests sent from the main Scala webapp before sending the debate transcript to GPT-4), navigate to the `model-debate` subdirectory and run:

```bash
uvicorn app:app
```
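For development you might instead run it on an explicit port with auto-reload, e.g. (the port value here is illustrative):

```bash
uvicorn app:app --port 8082 --reload
```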
You can pass in the following flags:

- `--port`: the port to host the server at.
- `--reload`: automatically reload the server when the code changes; useful if you're doing development on the server.

There's a Python server that produces the visualizations in the Analytics tab of the interface. After activating your virtual environment and installing dependencies as described above, start it with:
```bash
FLASK_APP=vis/server.py python -m flask run --port 8081
```
and the analytics pane should work in the debate webapp (assuming you also ran the webapp without changing any command-line arguments). The `--port` argument (default: 5000) should match the `--analytics-port` argument of the main webapp. If you are using a save directory other than `save`, you can pass it in using the `DATA_DIR` environment variable, e.g.:
```bash
FLASK_APP=vis/server.py DATA_DIR=scratch/save-server python -m flask run --port 8081
```
The value you pass in for `DATA_DIR` should match that of `--save` for the main webapp.
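Putting it together, a hypothetical matched configuration (again assuming trailing arguments are forwarded to the main server; paths and ports are illustrative):

```bash
# Main webapp: keep state under scratch/save-server, expect analytics on port 8081
mill -j 0 debate.dev.serve --save scratch/save-server --analytics-port 8081

# Analytics server: read the same directory and listen on the matching port
FLASK_APP=vis/server.py DATA_DIR=scratch/save-server python -m flask run --port 8081
```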
Repository contents:

- `build.sc`: Mill build file.
- `debate/src{,-jvm,-js}`: Scala source for all platforms.
- `debate/test/src`: Tests.
- `vis/`: Python code for the analytics visualization server.
- `scripts/`: Some Python scripts for working with QuALITY stories.
- JVM entry point (debate webapp server):
- JS entry point (debate webapp client):
- Python entry point (analytics visualization server):
After starting up the server, go to the page in your browser (by default at `localhost:8080`) and open the Admin tab. There you can add/remove debater profiles, create debates, etc.
If you change the JS source only, then you can run `mill debate.js.fastestOpt` and hard-refresh the page when it's done to load the changes. If you change the JVM or shared source as well, then you'll need to restart the server (i.e., interrupt and re-run `mill debate.dev.serve`).
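As a concrete sketch of that front-end-only loop:

```bash
# Rebuild only the JS client after editing front-end code
mill debate.js.fastestOpt
# ...then hard-refresh the page in your browser to load the new JS
```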
Development notes:

- We use `bloop` as the build server.
- We run `scalafix` somewhat often; check out e.g. `mill debate.jvm.fix`.
- Jess Smith did the following to speed up compilation and linking. This was, in his view, surprisingly cheap for the productivity gains. (Julian has not had problems with Metals in VSCode on his M2 Air, but `fastestOpt` sometimes takes ~10s, and YMMV.)
- (e.g., using `localhost:8080` to connect to `server:8080`).

The code in this repository is written in Scala in functional style.
On Scala and FP:
Relevant libraries to reference: