openai / transformer-debugger

MIT License

Update explainer/scoring model to gpt-4-turbo #23

Closed. machina-source closed this 4 months ago

machina-source commented 4 months ago

This PR updates the model names in neuron_explainer to use the latest GPT-4 release, gpt-4-turbo (which currently points to gpt-4-turbo-2024-04-09).

The reasons for this change are:

  1. Better explanations/scoring: I noticed slightly improved MLP explanation scoring when using the new gpt-4-turbo-2024-04-09 model vs. the currently used gpt-4 (currently points to gpt-4-0613) and gpt-4-0125-preview models. Specifically, I observed higher confidence scores for accurate explanations and fewer instances of high scores for poor explanations.

Here is a quick model comparison test with gpt-2-small using the mlp_neuron dataset at 1:1

[Image: TDB-model-score-test-1]

  2. More cost-effective: The gpt-4-turbo-2024-04-09 model is cheaper than the currently used gpt-4-0613 model (1/3 the input cost and 1/2 the output cost according to https://openai.com/pricing).
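To make the cost claim concrete, here is a rough sketch of how the overall saving depends on the mix of input and output tokens. It uses only the two ratios quoted above (1/3 input, 1/2 output). The function name `relative_cost` and the default assumption that gpt-4-0613 output tokens cost twice as much as its input tokens are mine, not part of neuron_explainer, so check the pricing page before relying on the numbers.

```python
# Price ratios of gpt-4-turbo-2024-04-09 relative to gpt-4-0613, as quoted in this PR.
INPUT_PRICE_RATIO = 1 / 3
OUTPUT_PRICE_RATIO = 1 / 2


def relative_cost(
    input_tokens: int,
    output_tokens: int,
    baseline_output_to_input_price: float = 2.0,  # ASSUMPTION: gpt-4-0613 output costs 2x input
) -> float:
    """Return new_cost / old_cost for a workload with the given token counts.

    Hypothetical helper for illustration only. Prices are expressed in units
    of the baseline per-token input price, so no absolute prices are needed.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if input_tokens == 0 and output_tokens == 0:
        raise ValueError("at least one token count must be positive")

    r = baseline_output_to_input_price
    old_cost = input_tokens + output_tokens * r
    new_cost = input_tokens * INPUT_PRICE_RATIO + output_tokens * r * OUTPUT_PRICE_RATIO
    return new_cost / old_cost


if __name__ == "__main__":
    # Example workload: prompts much longer than completions.
    ratio = relative_cost(input_tokens=9_000, output_tokens=1_000)
    print(f"new model costs {ratio:.0%} of the old model for this workload")
```

The ratio always falls between 1/3 (input-only workloads) and 1/2 (output-only workloads), so prompt-heavy explainer and scorer calls would sit closer to the 1/3 end.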

If this proposed model change would disrupt any TDB use cases, the following alternatives may be worth considering instead:

Open to all feedback and suggestions, thank you.


Model info: https://platform.openai.com/docs/models/continuous-model-upgrades