OpenGenerativeAI / llm-colosseum

Benchmark LLMs by fighting in Street Fighter 3! The new way to evaluate the quality of an LLM
https://huggingface.co/spaces/junior-labs/llm-colosseum
MIT License
1.31k stars 156 forks source link

can I run locally? #28

Closed taozhiyuai closed 5 months ago

taozhiyuai commented 5 months ago

I plan to "ollama serve" two different models locally, how should I set .env? and base URL?

`MISTRAL_API_KEY=""

OPENAI_API_KEY="ollama"

GROK_API_KEY=""

MODEL_PROVIDER="openai" # Options: ["openai", "mistral"]

DISABLE_LLM="False"`

taozhiyuai commented 5 months ago

if I want to use models like QWEN, mistral, which is not in the env at the moment, how to set them in env file?

taozhiyuai commented 5 months ago

can I set like this? but how to set two local ollama models?

MISTRAL_API_KEY="" OPENAI_API_KEY=“ollama” OPEN_BASE_URL="http://127.0.0.1:11434/v1" GROK_API_KEY="" MODEL_PROVIDER="openai" # Options: ["openai", "mistral"] DISABLE_LLM="False"

taozhiyuai commented 5 months ago

it seems it does not support local LLMs.

oulianov commented 5 months ago

Hello!

You can use local models, however just setting environment variables will not work. Here is how to do it:

  1. Make sure you have ollama installed, running, and with a model downloaded (run ollama serve mistral in the terminal for example)

  2. Make sure you pulled the latest version from main :

    git checkout main 
    git pull
  3. In script.py, replace the main function with the following one.

from eval.game import Game, Player1, Player2

def main():
    # Environment Settings
    game = Game(
        render=True,
        player_1=Player1(
            nickname="Daddy",
            model="ollama:mistral",
        ),
        player_2=Player2(
            nickname="Baby",
            model="ollama:mistral",
        ),
    )
    return game.run()

The convention we use is model_provider:model_name. If you want to use another local model than Mistral, you can do ollama:some_other_model

  1. Run the simulation: make

That's it ! Worked on my machine.

Did this solution work for you @taozhiyuai ?

taozhiyuai commented 5 months ago

Hello!

You can use local models, however just setting environment variables will not work. Here is how to do it:

  1. Make sure you have ollama installed, running, and with a model downloaded (run ollama serve mistral in the terminal for example)
  2. Make sure you pulled the latest version from main :
git checkout main 
git pull
  1. In script.py, replace the main function with the following one.
def main():
    # Environment Settings
    game = Game(
        render=True,
        player_1=Player1(
            nickname="Daddy",
            model="ollama:mistral",
        ),
        player_2=Player2(
            nickname="Baby",
            model="ollama:mistral",
        ),
    )
    return game.run()

The convention we use is model_provider:model_name. If you want to use another local model than Mistral, you can do ollama:some_other_model

  1. Run the simulation: make

That's it ! Worked on my machine.

Did this solution work for you @taozhiyuai ?

thanks for reply. err is below

截屏2024-03-27 09 05 33
oulianov commented 5 months ago

Good start! At the top of script.py, replace the second line by the following one:

from eval.game import Game, Player1, Player2

This should do the trick.

taozhiyuai commented 5 months ago

Good start! At the top of script.py, replace the second line by the following one:

from eval.game import Game, Player1, Player2

This should do the trick.

work now. but two players do not work until time out. app exit.

截屏2024-03-27 16 09 26 截屏2024-03-27 16 09 13 截屏2024-03-27 16 09 18 截屏2024-03-27 16 09 03
oulianov commented 5 months ago

Okay! This issue is related to the fact that you have curly double quotations ” instead of straight ones " in your .env file Solution : Delete all content of the .env file (you don't need it anyways for local models) Tell me if it worked!

taozhiyuai commented 5 months ago
截屏2024-03-28 10 40 08 截屏2024-03-28 10 40 53

I have deleted .env. but ERR

taozhiyuai commented 5 months ago

model in ollama serve is shown below:

`(streetfighter) taozhiyu@TAOZHIYUs-MBP llm-colosseum % ollama list

NAME ID SIZE MODIFIED

qwen:14b-chat-v1.5-fp16 cb20f077361d 28 GB 40 hours ago `

oulianov commented 5 months ago

Progress!

Did you run git pull in the terminal to get the latest version from main ?

If not, please do so.

Alternative solution : delete the line where the error happens

taozhiyuai commented 5 months ago

Progress!

Did you run git pull in the terminal to get the latest version from main ?

If not, please do so.

Alternative solution : delete the line where the error happens

okay, forget the last clone.

  1. I start a new git clone https://github.com/OpenGenerativeAI/
  2. activate the old condo env
  3. pip install -r requirements.txt
  4. make run

ERR is below

'(streetfighter) taozhiyu@TAOZHIYUs-MBP llm-colosseum-main % make run diambra -r ~/.diambra/roms run -l python3 script.py 🖥️ Starting DIAMBRA environment: 🖥️ logged in v2.2: Pulling from diambra/engine Digest: sha256:6b5df5c9522553a4505bf0b6b0f837dd2f945de1b0b6390d83fdf24e317de643 Status: Image is up to date for diambra/engine:v2.2 Stored credentials found. Authorization granted. Server listening on 0.0.0.0:50051 🖥️ DIAMBRA environment started Player 1 using: mistral:mistral-medium-latest Traceback (most recent call last): File "/Users/taozhiyu/Downloads/llm-colosseum-main/script.py", line 30, in main() File "/Users/taozhiyu/Downloads/llm-colosseum-main/script.py", line 17, in main player_1=Player1( File "/Users/taozhiyu/Downloads/llm-colosseum-main/eval/game.py", line 71, in init self.verify_provider_name() File "/Users/taozhiyu/Downloads/llm-colosseum-main/eval/game.py", line 47, in verify_provider_name assert ( AssertionError: Mistral API key not set 🖥️ Couldn't run: exit status 1 make: *** [run] Error 1 (streetfighter) taozhiyu@TAOZHIYUs-MBP llm-colosseum-main % '

taozhiyuai commented 5 months ago

WechatIMG2

taozhiyuai commented 5 months ago

the model I use is "qwen:14b-chat-v1.5-fp16"

not mistral

taozhiyuai commented 5 months ago

I modify script.py as shown in the pictures. the game is shown. but players do not fight, until time out. I paste the whole logs below. it seems some error.

截屏2024-03-29 13 01 15

`(streetfighter) taozhiyu@TAOZHIYUs-MBP llm-colosseum-main % make run diambra -r ~/.diambra/roms run -l python3 script.py 🖥️ Starting DIAMBRA environment: 🖥️ logged in v2.2: Pulling from diambra/engine Digest: sha256:6b5df5c9522553a4505bf0b6b0f837dd2f945de1b0b6390d83fdf24e317de643 Status: Image is up to date for diambra/engine:v2.2 Stored credentials found. Authorization granted. Server listening on 0.0.0.0:50051 🖥️ DIAMBRA environment started Player 1 using: ollama:qwen:14b-chat-v1.5-fp16 Player 2 using: ollama:qwen:14b-chat-v1.5-fp16 INFO:diambra.arena.engine.interface:Trying to connect to DIAMBRA Engine server (timeout=600s)... INFO:diambra.arena.engine.interface:... done. 🏟️ (46ce) (0)Overwriting sys_settings provided by the user with command line / CLI ones (0)Provided by the user: 🏟️ (46ce) (0)Set via command line / CLI: emu_pipes_path: "/tmp/DIAMBRA/" roms_path: "/opt/diambraArena/roms/" binary_path: "/opt/diambraArena/mame/" lock_fps: true username: "10545863"


  .:-:-**#*#####+***+=-:. 

..-++####+#################=+=:. :+############################-. .-+#############################=. .-###########################+. .=######++======++##########=. ........ ... ....... ........... ........ ........ ............ ............ ........... -=------:---------=########- .------..-----:. .------. .-----------. --------. :-------- --------------:. --------------:. :----------: .:--------:---------:=####### .------..-------..------. :-----:-----:. ---------. .--------- ------:..:-----: ------:..------: .------:-----. .:------::---::--------:+#######..------. .------..------. .------.:-----. ----------.---------- ------:..------. ------: :-----. .:-----:.------. :-----::.:---:-----::---:###### .------. .------..------. .:-----: .------. ------:-------------- ------::-----:.. ------:.-----:. .------..------. .--:.. . :::.:.---.-:---.#####- .------. .------..------. .------.::------: ------:.-----.:------ ------: :-----: ------:.:------: .------:.:-------. ..:..:::.:::.::....:.::--:####. .------::------:..------..------:.:::------. ------: .---. :-----: -------::------: ------: ------: :------..::------: .-::::::---...:--. .-.:-::=###+. .::::::::::::.. .::::::..::::::. .::::::. ::::::. .:. ::::::: ::::::::::::::. ::::::: ::::::: ::::::. .:::::: .--:. .:--...::. ..::+####: :--:.. .---:...:. .:++:. .:---:::----.. .... .:. .:------:..:... :. ....... ..:. ..

                                                               DIAMBRA™ | Dueling AI Arena
                                                          https://diambra.ai - info@diambra.ai

                               Usage of this software is subject to our Terms of Use described at https://diambra.ai/terms

                                                           DIAMBRA, Inc. © Copyright 2018-2024

(0)Environment initialization ... 🏟️ (46ce) SHA256 check ok. Correct rom file found. 🏟️ (46ce) Completed console init 🏟️ (46ce) Fontconfig error: Cannot load default config file 🏟️ (46ce) Warning: -video none doesn't make much sense without -seconds_to_run 🏟️ (46ce) ALSA lib conf.c:4553:(snd_config_update_r) Cannot access file /usr/share/alsa/alsa.conf ALSA lib seq.c:935:(snd_seq_open_noupdate) Unknown SEQ default 🏟️ (46ce) Unable to create history.db 🏟️ (46ce) Unable to create history.db 🏟️ (46ce) Unable to create history.db 🏟️ (46ce) Registering screen ... done. 🏟️ (46ce) Registering audio ... done. 🏟️ (46ce) Registering program ... done. 🏟️ (46ce) Num. of Channels = 4 Screen Dim (W x H) = 384 224 🏟️ (46ce) (Recorder) Frame encoding enabled. (Recorder) Compression quality: 95 (0)Buttons configuration: (0) LP = But4 (0) LK = But3 (0) HK = But6 (0) MP = But1 (0) HP = But5 (0) MK = But2 (0)Native frame shape = [224 X 384 X 4] (0)User defined frame_shape = [0 X 0 X 0] Resize flag = 0 Grayscale flag = 0 🏟️ (46ce) (0)Move to start screen 🏟️ (46ce) (0)Adjust game settings 🏟️ (46ce) (0)done. INFO:diambra.arena.arena_gym:EnvironmentSettingsMultiAgent(game_id='sfiii3n', frame_shape=[0, 0, 0], step_ratio=6, disable_keyboard=True, disable_joystick=True, render_mode='human', splash_screen=False, rank=0, env_address='127.0.0.1:55006', grpc_timeout=600, seed=1711688374, difficulty=None, continue_game=0.0, show_final=False, tower=3, _last_seed=1711688374, pb_model=game_id: "sfiii3n" frame_shape { } step_ratio: 6 n_players: 2 disable_keyboard: true disable_joystick: true action_spaces: DISCRETE action_spaces: DISCRETE episode_settings { } , n_players=2, action_space=(1, 1), role=(None, None), characters=['Ken', 'Ken'], outfits=[1, 3], super_art=[3, 3], fighting_style=(None, None), ultimate_style=(None, None)) 🏟️ (46ce) (0)WARNING: only one outfit selected by agent_0 while using 2P mode. Note that if a character faces himself, it will use the next available outfit. (0)2P Environment (0)Generic episode settings --- (0)Random seed: 42 (0)agent_0 Role: P1 (0)agent_0 Character(s): [Ken] (0)agent_0 Number of outfits: 1 (0)agent_1 Role: P2 (0)agent_1 Character(s): [Ken] (0)agent_1 Number of outfits: 1 (0)--- (0)Game-specific episode settings --- (0)agent_0 Super art: 3 (0)agent_1 Super art: 3 (0)--- (0)Restarting system and (optionally) setting difficulty 🏟️ (46ce) (0)Starting game 🏟️ (46ce) (0)Waiting for fight to start Exception in thread Thread-5: Traceback (most recent call last): File "/Users/taozhiyu/miniconda3/envs/streetfighter/lib/python3.9/threading.py", line 980, in _bootstrap_inner Exception in thread Thread-6: Traceback (most recent call last): File "/Users/taozhiyu/miniconda3/envs/streetfighter/lib/python3.9/threading.py", line 980, in _bootstrap_inner self.run() File "/Users/taozhiyu/Downloads/llm-colosseum-main/eval/game.py", line 383, in run self.run() File "/Users/taozhiyu/Downloads/llm-colosseum-main/eval/game.py", line 369, in run self.game.player_2.robot.plan() File "/Users/taozhiyu/Downloads/llm-colosseum-main/agent/robot.py", line 133, in plan self.game.player_1.robot.plan() File "/Users/taozhiyu/Downloads/llm-colosseum-main/agent/robot.py", line 133, in plan next_steps_from_llm = self.get_moves_from_llm() File "/Users/taozhiyu/Downloads/llm-colosseum-main/agent/robot.py", line 292, in get_moves_from_llm next_steps_from_llm = self.get_moves_from_llm() File "/Users/taozhiyu/Downloads/llm-colosseum-main/agent/robot.py", line 292, in get_moves_from_llm llm_response = self.call_llm() File "/Users/taozhiyu/Downloads/llm-colosseum-main/agent/robot.py", line 334, in call_llm llm_response = self.call_llm() File "/Users/taozhiyu/Downloads/llm-colosseum-main/agent/robot.py", line 334, in call_llm client = get_sync_client(provider_name) File "/Users/taozhiyu/miniconda3/envs/streetfighter/lib/python3.9/site-packages/phospho/lab/language_models.py", line 60, in get_sync_client client = get_sync_client(provider_name) File "/Users/taozhiyu/miniconda3/envs/streetfighter/lib/python3.9/site-packages/phospho/lab/language_models.py", line 60, in get_sync_client return OpenAI(base_url="http://localhost:11434/v1/") File "/Users/taozhiyu/miniconda3/envs/streetfighter/lib/python3.9/site-packages/openai/_client.py", line 98, in init return OpenAI(base_url="http://localhost:11434/v1/") File "/Users/taozhiyu/miniconda3/envs/streetfighter/lib/python3.9/site-packages/openai/_client.py", line 98, in init raise OpenAIError( openai.OpenAIError: The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable raise OpenAIError( openai.OpenAIError: The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable 🏟️ (46ce) (0)Round won by P1 (0)Moving to next round Player1 ollama:qwen:14b-chat-v1.5-fp16 'Daddy' won! 🏟️ (46ce) Closing console 🏟️ (46ce) (0)---------------------------------------- (0) Thank you for using DIAMBRA Arena. (0) RL is the main road torwards AGI, (0) better to be prepared... Keep rocking! (0) -
(0) https://diambra.ai (0)---------------------------------------- (streetfighter) taozhiyu@TAOZHIYUs-MBP llm-colosseum-main % `

oulianov commented 5 months ago

Thank you for the details ! You did everything right, there was a problem in my code. I fixed it.

Here is what you should do to get the new fixed code:

  1. Do git pull to fetch the new changes in the repo. If there is an issue with that, delete the repo llm-colosseum and download it again from github.
  2. Run make install to install updated dependencies
  3. Run make local to make local models fight with the model ollama:qwen:14b-chat-v1.5-fp16 (make sure ollama is running)

To make them fight with another model, please now edit the new file ollama.py. I created it so it's easier for us to work on the same stuff.

Thank you again Tao for carrying through this! Please keep me posted.

taozhiyuai commented 5 months ago

it just suddenly exit the program. game is not over yet. I try to record the screen at the same time, I am not sure if it matters. because some time it is ok to fight and record screen.

INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" Player 1 move: super attack 4 Player 1 move: low kick INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" Player 2 move: megafireball Player 2 move: move closer INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" Player 1 move: move closer 🏟️ (1036) (0)Round won by P2 (0)Moving to next round INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" Player 2 move: megafireball Player 2 move: move closer INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" Player 1 move: jump closer Player2 ollama:qwen:14b-chat-v1.5-fp16 Daddy won! 🏟️ (1036) Closing console 🏟️ (1036) (0)---------------------------------------- (0) Thank you for using DIAMBRA Arena. (0) RL is the main road torwards AGI, (0) better to be prepared... Keep rocking! (0) -
(0) https://diambra.ai (0)---------------------------------------- (streetfighter) taozhiyu@192 llm-colosseum %

taozhiyuai commented 5 months ago

I notice the following in the logs. does it matter? (0)Environment initialization ... 🏟️ (dab4) SHA256 check ok. Correct rom file found. 🏟️ (dab4) Completed console init 🏟️ (dab4) Fontconfig error: Cannot load default config file 🏟️ (dab4) Warning: -video none doesn't make much sense without -seconds_to_run 🏟️ (dab4) ALSA lib conf.c:4553:(snd_config_update_r) Cannot access file /usr/share/alsa/alsa.conf ALSA lib seq.c:935:(snd_seq_open_noupdate) Unknown SEQ default 🏟️ (dab4) Unable to create history.db 🏟️ (dab4) Unable to create history.db 🏟️ (dab4) Unable to create history.db

@oulianov

taozhiyuai commented 5 months ago

I hope I can start 10 rounds for example with one command, and when all rounds finish, give a summary.

it is time consuming to start one round one by one.

taozhiyuai commented 5 months ago

without history and time consuming game , it is difficult.

bye the way, I notice the following info: NotImplementedError: Provider mixtral is not supported.

oulianov commented 5 months ago

Cool! We got this running.

The game did finish! If you look at the logs, at the end it's written : Player2 ollama:qwen:14b-chat-v1.5-fp16 Daddy won!

However, indeed, the game finishes just before showing the kill screen. When the player deal the last hit of damage, the emulator shuts down the game. I tried to change this, but couldn't figure out how. The (eventual) solution may only work on Linux.

All of your questions are related to Diambra. Diambra is the framework used to make AI fight in Street Fighter. I'm not a big Diambra expert sadly. I suggest you join the Diambra discord server and ask there how to display the kill screen at the end of a round! Do so here: https://discord.gg/qyfFgpwAaY

They can also help you with running multiple rounds.

taozhiyuai commented 5 months ago

without history and time consuming game , it is difficult.

oulianov commented 5 months ago

Do you think you could improve it? Contributions are welcomed!

taozhiyuai commented 5 months ago

it seems Qwen 0.5B is better model to run. by the way, how to change characters?

Alex, Twelve, Hugo, Sean, Makoto, Elena, Ibuki, Chun-Li, Dudley, Necro, Q, Oro, Urien, Remy, Ryu, Gouki, Yun, Yang, Ken

taozhiyuai commented 5 months ago

Do you think you could improve it? Contributions are welcomed!

very nice. small model with high token/second is the best experience. better gpu could load two big models

taozhiyuai commented 5 months ago

it seems the screen can not show the final beat. just freeze and exit. report the result with text in logs.

taozhiyuai commented 5 months ago

my laptop is M3 Max 128G, I think token/second is more important than the size of model parameters.

big model waiting for tokens is killed by small model who keeps fighting.

taozhiyuai commented 5 months ago
截屏2024-03-30 14 33 05

why it keep outputting this msg? by two qwen:4b-chat-v1.5-fp16 models.

INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"

taozhiyuai commented 5 months ago

I have tried qwen serial models, range from 0.5,1.8,4,7,14B only 0.5B, 1.8B, experience good. fight smoothly because of high token/second I think.

other big parameters model will suffer situation above.