codelion / optillm

Optimizing inference proxy for LLMs
Apache License 2.0

Ambiguous configuration for mcts #68

Open ErykCh opened 2 hours ago

ErykCh commented 2 hours ago

Hi,

My configuration for mcts is as follows.

In docker-compose.yml:

services:
  optillm:
    container_name: optillm-proxy
    image: optillm:1.0
    ports:

The variable names come from the front page on GitHub and also from optillm.py.

Logs from optillm startup:

Server configuration: {'approach': 'mcts', 'mcts_simulations': 2, 'mcts_exploration': 0.2, 'mcts_depth': 1, 'best_of_n': 3, 'rstar_max_depth': 3, 'rstar_num_rollouts': 5, 'rstar_c': 1.4, 'n': 5, 'base_url': 'http://vllm:8000/v1', 'optillm_api_key': '', 'return_full_response': True, 'port': 8000, 'log': 'debug', 'simulations': 3, 'exploration': 0.2, 'depth': 3}

So mcts_simulations is 2 instead of 3, and mcts_depth is 1 instead of 3.

And the logs when I send a prompt:

2024-10-21 09:48:45,396 - INFO - Starting chat with MCTS
2024-10-21 09:48:45,396 - INFO - Parameters: num_simulations=2, exploration_weight=0.2, simulation_depth=1
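
If I read the dump right, my values end up under the plain keys ('simulations': 3, 'depth': 3), while MCTS reads the 'mcts_'-prefixed keys, which keep their defaults. A toy sketch of that kind of mismatch (hypothetical code, not optillm's actual source):

# Toy illustration of the suspected key mismatch, not optillm's actual source.
config = {
    'mcts_simulations': 2, 'mcts_depth': 1,  # defaults, apparently never overridden
    'simulations': 3, 'depth': 3,            # where my settings actually land
}

# If the MCTS entry point reads the prefixed keys, the defaults win:
num_simulations = config['mcts_simulations']   # 2 instead of 3
simulation_depth = config['mcts_depth']        # 1 instead of 3
print(num_simulations, simulation_depth)       # -> 2 1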

codelion commented 2 hours ago

Does it work when you use the environment variables for the Docker container? See https://github.com/codelion/optillm/blob/95cc14d1505c76fe124f56b233a864a971760893/docker-compose.yaml#L17

ErykCh commented 2 hours ago

Configuration:

services:

  optillm:
    container_name: optillm-proxy
    image: optillm:1.0
    ports:
      - 81:8000
    restart: unless-stopped
    environment:
      - OPENAI_API_KEY=no_key
      - OPTILLM_SIMULATIONS=4
      - OPTILLM_DEPTH=4
    command: --log debug --approach mcts --n 1 --return-full-response true --base-url http://vllm:8000/v1

logs:

2024-10-21 10:22:58,235 - INFO - Starting server with approach: mcts
2024-10-21 10:22:58,235 - INFO - Server configuration: {'approach': 'mcts', 'mcts_simulations': 2, 'mcts_exploration': 0.2, 'mcts_depth': 1, 'best_of_n': 3, 'model': 'gpt-4o-mini', 'rstar_max_depth': 3, 'rstar_num_rollouts': 5, 'rstar_c': 1.4, 'n': 1, 'base_url': 'http://vllm:8000/v1', 'optillm_api_key': '', 'return_full_response': True, 'port': 8000, 'log': 'debug', 'simulations': 4, 'exploration': 0.2, 'depth': 4}

2024-10-21 10:24:45,740 - DEBUG - Request data: {'model': '', 'user': '66f282757cce4320b5e6bfa1', 'stream': True, 'messages': [{'role': 'user', 'content': 'mcts\n\n'}]}
2024-10-21 10:24:45,793 - INFO - Using approach(es) ['mcts'], operation SINGLE, with model
2024-10-21 10:24:45,793 - INFO - Starting chat with MCTS
2024-10-21 10:24:45,793 - INFO - Parameters: num_simulations=2, exploration_weight=0.2, simulation_depth=1
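
Same pattern as before: the env vars are picked up, but they land on the unprefixed keys ('simulations': 4, 'depth': 4), while MCTS still reads 'mcts_simulations' and 'mcts_depth'. If the parsing simply maps OPTILLM_<KEY> onto config[<key>], then names like OPTILLM_MCTS_SIMULATIONS might reach the right keys, but that is only a guess. A sketch of that assumed mapping (not the actual optillm code):

import os

def apply_env_overrides(config):
    # Assumed mapping, not verified against optillm.py: OPTILLM_<KEY> -> config[<key>].
    # Under this assumption OPTILLM_SIMULATIONS fills 'simulations' (which MCTS ignores),
    # while OPTILLM_MCTS_SIMULATIONS would fill 'mcts_simulations' (which MCTS reads).
    for key, default in config.items():
        raw = os.environ.get(f'OPTILLM_{key.upper()}')
        if raw is not None:
            config[key] = type(default)(raw)  # coerce to the default's type
    return config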

ErykCh commented 1 hour ago

What exactly does "Simulation depth for MCTS" mean?

Does depth 1 mean this?

       Parent
      /      \
   Leaf      Leaf

And does depth 2 mean this?

                     Parent
          /        /        \        \
       Node      Node      Node      Node
       /  \      /  \      /  \      /  \
      L    L    L    L    L    L    L    L

codelion commented 1 hour ago

Yes, that is the depth. It is explained a bit here: https://www.patched.codes/blog/patched-moa-optimizing-inference-for-diverse-software-development-tasks and you can see a full run of an example here: https://github.com/codelion/optillm/wiki/MCTS
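
For intuition, a depth-limited rollout looks roughly like this in Python; the class and method names here are made up for illustration, a minimal sketch rather than the actual optillm implementation:

import random

# Minimal sketch of a depth-limited MCTS rollout; names are illustrative only.
class Node:
    def __init__(self, state, depth=0):
        self.state, self.depth = state, depth

    def expand(self):
        # In optillm each child would be a candidate LLM completion; stubs here.
        return [Node(f'{self.state}->child{i}', self.depth + 1) for i in range(2)]

    def evaluate(self):
        return random.random()  # stand-in for a real quality score

def rollout(root, simulation_depth):
    # simulation_depth bounds how many levels below the root get expanded:
    # depth=1 stops at the root's children, depth=2 goes one level further.
    current = root
    for _ in range(simulation_depth):
        children = current.expand()
        if not children:
            break
        current = random.choice(children)
    return current.evaluate()

print(rollout(Node('prompt'), simulation_depth=1))  # scores a direct child
print(rollout(Node('prompt'), simulation_depth=2))  # scores a grandchild

So with simulation_depth=1 a rollout only ever scores a direct child (your first tree), and with 2 it can go one level deeper (your second tree).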

codelion commented 1 hour ago

Fixed the args in https://github.com/codelion/optillm/commit/5d541ee0c95da9406011bd005621d8375dd28d5c. You can try again.