codelion / optillm

Optimizing inference proxy for LLMs
Apache License 2.0

Setting the default approach doesn't work #69

Closed ErykCh closed 2 weeks ago

ErykCh commented 1 month ago

Hi,

my configuration for mcts is as follows:

in docker-compose.yml

services:
  optillm:
    container_name: optillm-proxy
    image: optillm:1.0
    ports:
      - 81:8000
    restart: unless-stopped
    environment:
      - OPENAI_API_KEY=no_key
    command: --log debug --approach mcts --n 5 --simulations 3 --depth 3 --return-full-response true --base-url http://vllm:8000/v1

I would expect that if I do not specify a slug in the model name or in the prompt, mcts will be selected, because I have --approach mcts.

Starting logs:

2024-10-21 10:09:24,593 - INFO - Starting server with approach: mcts
2024-10-21 10:09:24,594 - INFO - Server configuration: {'approach': 'mcts', 'mcts_simulations': 2, 'mcts_exploration': 0.2, 'mcts_depth': 1, 'best_of_n': 3, 'rstar_max_depth': 3, 'rstar_num_rollouts': 5, 'rstar_c': 1.4, 'n': 5, 'base_url': 'http://vllm:8000/v1', 'optillm_api_key': '', 'return_full_response': True, 'port': 8000, 'log': 'debug', 'simulations': 3, 'exploration': 0.2, 'depth': 3}

Prompt:

2024-10-21 10:09:44,412 - INFO - Received request to /v1/chat/completions
2024-10-21 10:09:44,412 - DEBUG - Intercepted Bearer Token: my_key
2024-10-21 10:09:44,412 - DEBUG - Request data: {'model': '', 'user': '66f282757cce4320b5e6bfa1', 'stream': True, 'messages': [{'role': 'user', 'content': ''}]}
2024-10-21 10:09:44,472 - INFO - Using approach(es) ['bon'], operation SINGLE, with model

but it should be using mcts.
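
For reference, this is roughly the client call I am making. There is no approach slug in the model name and none in the prompt, so I expect the server default (--approach mcts) to apply. The endpoint and model name below are placeholders for my setup:

from openai import OpenAI

# Client pointed at the optillm proxy (host port 81 maps to the container's 8000).
client = OpenAI(api_key="my_key", base_url="http://localhost:81/v1")

# No approach slug anywhere, so the --approach mcts default should be used.
response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder model served by vLLM
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)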

codelion commented 1 month ago

Thanks for checking this out, it is fixed in https://github.com/codelion/optillm/commit/83b53419dcdd38c87240d2fbcc399f1bcc500f09

ErykCh commented 1 month ago

But now the --approach flag takes precedence permanently.

Setting --approach mcts and adding in the prompt:

moa

causes mcts to be called
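
For completeness, this is roughly what the request looks like. I am assuming the <optillm_approach> tag from the README is the way to set the approach inside the prompt; the model name is a placeholder:

from openai import OpenAI

client = OpenAI(api_key="my_key", base_url="http://localhost:81/v1")

# Approach requested per-prompt via the <optillm_approach> tag.
response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder model
    messages=[{
        "role": "user",
        "content": "<optillm_approach>moa</optillm_approach> What is the capital of France?",
    }],
)
# With --approach mcts still set on the server, mcts gets used instead of moa.
print(response.choices[0].message.content)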

ErykCh commented 1 month ago

Even worse:

Removing --approach mcts and adding in the prompt:

moa

results in an error:

ERROR - Error processing request: Error code: 404 - {'object': 'error', 'message': 'The model auto-XYZ does not exist.', 'type': 'NotFoundError', 'param': None, 'code': 404}

codelion commented 1 month ago

Thanks for checking again. Fixed it in https://github.com/codelion/optillm/commit/94fad7846e82cd24f4603a4da7019ba242f40be3

The order of preference for the approach is as follows:

ErykCh commented 1 month ago

OK, now I understand the doc:

Please note that the convention described above works only when the optillm server has been started with inference approach set to auto. Otherwise, the model attribute in the client request must be set with the model name only.

But now I don't understand how auto works.

From my quick tests, it always triggers bon. How is the logic that chooses the best method for a given question triggered? Shouldn't there first be a query to the LLM to determine which method to use for that question?

codelion commented 1 month ago

auto just means that the approach has to be set by the user, either in the model name, in the request's extra body, or in the messages. If it is still not set, it defaults to bon.
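
A minimal sketch of those client-side options (the model names are placeholders; the extra body key and prompt tag follow the optillm_approach convention from the README):

from openai import OpenAI

client = OpenAI(api_key="my_key", base_url="http://localhost:81/v1")

# 1. Approach as a prefix of the model name, e.g. moa-<model>.
client.chat.completions.create(
    model="moa-meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "What is 2 + 2?"}],
)

# 2. Approach in the request's extra body via the optillm_approach key.
client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "What is 2 + 2?"}],
    extra_body={"optillm_approach": "moa"},
)

# 3. Approach embedded in the message content with the <optillm_approach> tag.
client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "<optillm_approach>moa</optillm_approach> What is 2 + 2?"}],
)

# If none of these is present, the proxy falls back to bon.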

How is the logic triggered to choose the best method to choose for a given question?

This is implemented in the router plugin - https://github.com/codelion/optillm/blob/main/optillm/plugins/router_plugin.py. Use router-<model-name> as the model name to use it. It uses a BERT-style classifier model to route to the appropriate approach. This is the model that is used - https://huggingface.co/codelion/optillm-bert-uncased
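
A minimal usage sketch, assuming the router slug is prefixed to the model name just like the approach slugs (the base model name is a placeholder):

from openai import OpenAI

client = OpenAI(api_key="my_key", base_url="http://localhost:81/v1")

# The router plugin classifies the prompt with codelion/optillm-bert-uncased
# and dispatches the request to whichever approach the classifier predicts.
response = client.chat.completions.create(
    model="router-meta-llama/Llama-3.1-8B-Instruct",  # placeholder base model
    messages=[{"role": "user", "content": "Prove that the square root of 2 is irrational."}],
)
print(response.choices[0].message.content)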

It was trained on data generated from Arena-Hard-Auto using this script - https://github.com/codelion/optillm/blob/main/scripts/gen_optillm_dataset.py. The code for training is here: https://github.com/codelion/optillm/blob/main/scripts/train_optillm_classifier.py