OpenGenerativeAI / llm-colosseum

Benchmark LLMs by fighting in Street Fighter 3! The new way to evaluate the quality of an LLM
https://huggingface.co/spaces/junior-labs/llm-colosseum
MIT License
1.34k stars 160 forks source link

QWEN 1.8B or less is OK; 4B or more not working #33

Open taozhiyuai opened 8 months ago

taozhiyuai commented 8 months ago

no action generated, I think maybe token/per second is too low to run the game.

🏟️ (6e3b) (0)---------------------------------------- (0) Thank you for using DIAMBRA Arena. (0) RL is the main road torwards AGI, (0) better to be prepared... Keep rocking! (0) -
(0) https://diambra.ai (0)---------------------------------------- Player 1 using: ollama:qwen:1.8b-chat-v1.5-fp16 Player 2 using: ollama:qwen:4b-chat-v1.5-fp16 INFO:diambra.arena.engine.interface:Trying to connect to DIAMBRA Engine server (timeout=600s)... INFO:diambra.arena.engine.interface:... done. 🏟️ (6e3b) (0)Overwriting sys_settings provided by the user with command line / CLI ones (0)Provided by the user:

(0)Set via command line / CLI: emu_pipes_path: "/tmp/DIAMBRA/" roms_path: "/opt/diambraArena/roms/" binary_path: "/opt/diambraArena/mame/" lock_fps: true username: "10545863"


  .:-:-**#*#####+***+=-:. 

..-++####+#################=+=:. :+############################-. .-+#############################=. .-###########################+. .=######++======++##########=. ........ ... ....... ........... ........ ........ ............ ............ ........... -=------:---------=########- .------..-----:. .------. .-----------. --------. :-------- --------------:. --------------:. :----------: .:--------:---------:=####### .------..-------..------. :-----:-----:. ---------. .--------- ------:..:-----: ------:..------: .------:-----. .:------::---::--------:+#######..------. .------..------. .------.:-----. ----------.---------- ------:..------. ------: :-----. .:-----:.------. :-----::.:---:-----::---:###### .------. .------..------. .:-----: .------. ------:-------------- ------::-----:.. ------:.-----:. .------..------. .--:.. . :::.:.---.-:---.#####- .------. .------..------. .------.::------: ------:.-----.:------ ------: :-----: ------:.:------: .------:.:-------. ..:..:::.:::.::....:.::--:####. .------::------:..------..------:.:::------. ------: .---. :-----: -------::------: ------: ------: :------..::------: .-::::::---...:--. .-.:-::=###+. .::::::::::::.. .::::::..::::::. .::::::. ::::::. .:. ::::::: ::::::::::::::. ::::::: ::::::: ::::::. .:::::: .--:. .:--...::. ..::+####: :--:.. .---:...:. .:++:. .:---:::----.. .... .:. .:------:..:... :. ....... ..:. ..

                                                               DIAMBRAβ„’ | Dueling AI Arena
                                                          https://diambra.ai - info@diambra.ai

                               Usage of this software is subject to our Terms of Use described at https://diambra.ai/terms

                                                           DIAMBRA, Inc. Β© Copyright 2018-2024

(0)Environment initialization ... 🏟️ (6e3b) SHA256 check ok. Correct rom file found. 🏟️ (6e3b) Completed console init 🏟️ (6e3b) Fontconfig error: Cannot load default config file 🏟️ (6e3b) Warning: -video none doesn't make much sense without -seconds_to_run 🏟️ (6e3b) ALSA lib conf.c:4553:(snd_config_update_r) Cannot access file /usr/share/alsa/alsa.conf ALSA lib seq.c:935:(snd_seq_open_noupdate) Unknown SEQ default 🏟️ (6e3b) Unable to create history.db Unable to create history.db Unable to create history.db INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" 🏟️ (6e3b) Registering screen ... done. 🏟️ (6e3b) Registering audio ... done. INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" Player 1 move: low punch Player 1 move: super attack 2 Player 1 move: super attack 3 Player 1 move: super attack 4 Player 1 move: medium punch Player 1 move: low punch Player 1 move: low punch 2024-03-30 21:01:37.417 | WARNING | agent.robot:get_moves_from_llm:317 - Many invalid moves: ['High Attack', 'Low Attack', 'High Attack'] 🏟️ (6e3b) Registering program ... done. 🏟️ (6e3b) Num. of Channels = 4 Screen Dim (W x H) = 384 224 🏟️ (6e3b) (Recorder) Frame encoding enabled. (Recorder) Compression quality: 95 (0)Buttons configuration: (0) LP = But4 (0) HP = But5 (0) MP = But1 (0) LK = But3 (0) MK = But2 (0) HK = But6 (0)Native frame shape = [224 X 384 X 4] (0)User defined frame_shape = [0 X 0 X 0] Resize flag = 0 Grayscale flag = 0 🏟️ (6e3b) (0)Move to start screen INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" 🏟️ (6e3b) (0)Adjust game settings INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" 🏟️ (6e3b) (0)done. INFO:diambra.arena.arena_gym:EnvironmentSettingsMultiAgent(game_id='sfiii3n', frame_shape=[0, 0, 0], step_ratio=6, disable_keyboard=True, disable_joystick=True, render_mode='human', splash_screen=False, rank=0, env_address='127.0.0.1:55125', grpc_timeout=600, seed=1711803694, difficulty=None, continue_game=0.0, show_final=False, tower=3, _last_seed=1711803694, pb_model=game_id: "sfiii3n" frame_shape { } step_ratio: 6 n_players: 2 disable_keyboard: true disable_joystick: true action_spaces: DISCRETE action_spaces: DISCRETE episode_settings { } , n_players=2, action_space=(1, 1), role=(None, None), characters=['Ken', 'Ken'], outfits=[1, 3], super_art=[3, 3], fighting_style=(None, None), ultimate_style=(None, None)) INFO:diambra.arena.arena_gym:Recording trajectories in "/Users/taozhiyu/Downloads/llm-colosseum/eval/diambra/episode_recording/sfiii3n/-/20240330210134" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" 🏟️ (6e3b) (0)WARNING: only one outfit selected by agent_0 while using 2P mode. Note that if a character faces himself, it will use the next available outfit. (0)2P Environment (0)Generic episode settings --- (0)Random seed: 42 (0)agent_0 Role: P1 (0)agent_0 Character(s): [Ken] (0)agent_0 Number of outfits: 1 (0)agent_1 Role: P2 (0)agent_1 Character(s): [Ken] (0)agent_1 Number of outfits: 1 (0)--- (0)Game-specific episode settings --- (0)agent_0 Super art: 3 (0)agent_1 Super art: 3 (0)--- (0)Restarting system and (optionally) setting difficulty INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" 🏟️ (6e3b) (0)Starting game INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" 🏟️ (6e3b) (0)Waiting for fight to start INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" Player 1 move: low punch Player 1 move: high kick Player 1 move: medium punch Player 1 move: super attack 2 Player 1 move: super attack 3 Player 1 move: super attack 4 Player 1 move: jump closer 2024-03-30 21:02:42.124 | WARNING | agent.robot:get_moves_from_llm:317 - Many invalid moves: ['Super Attack 1', 'Super attack 5', 'Super attack'] INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" Player 2 move: jump closer Player 2 move: low punch Player 2 move: high punch Player 2 move: medium punch INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" Player 1 move: fireball Player 1 move: low kick Player 1 move: high punch INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" Player 1 move: hurricane Player 1 move: megafireball Player 1 move: super attack 2 Player 1 move: super attack 3 Player 1 move: super attack 4 Player 1 move: low punch INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" Player 1 move: low kick Player 1 move: high punch Player 1 move: medium punch Player 1 move: super attack 2 Player 1 move: super attack 3 Player 1 move: super attack 4 Player 1 move: low punch Player 1 move: low punch INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" Player 1 move: super attack 2 Player 1 move: high punch Player 1 move: low punch Player 1 move: medium punch Player 1 move: high punch Player 1 move: jump closer Player 1 move: jump away INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" Player 1 move: hurricane Player 1 move: super attack 2 Player 1 move: jump closer Player 1 move: low punch INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" Player 1 move: low punch Player 1 move: super attack 2 Player 1 move: high kick Player 1 move: megafireball Player 1 move: super attack 3 INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" Player 1 move: super attack 3 Player 1 move: super attack 4 Player 1 move: low punch Player 1 move: hurricane Player 1 move: megafireball Player 1 move: medium punch Player 1 move: jump closer INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" Player 1 move: super attack 4 2024-03-30 21:04:08.569 | WARNING | agent.robot:get_moves_from_llm:317 - Many invalid moves: ['Low Strike 1 ', 'Low Strike 2 '] 🏟️ (6e3b) (0)Round won by P1 (0)Moving to next round INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" Player1 ollama:qwen:1.8b-chat-v1.5-fp16 'Baby' won! 🏟️ (6e3b) Closing console 🏟️ (6e3b) (0)---------------------------------------- (0) Thank you for using DIAMBRA Arena. (0) RL is the main road torwards AGI, (0) better to be prepared... Keep rocking! (0) -
(0) https://diambra.ai (0)---------------------------------------- Player 1 using: ollama:qwen:1.8b-chat-v1.5-fp16 Player 2 using: ollama:qwen:4b-chat-v1.5-fp16 INFO:diambra.arena.engine.interface:Trying to connect to DIAMBRA Engine server (timeout=600s)... INFO:diambra.arena.engine.interface:... done. 🏟️ (6e3b) (0)Overwriting sys_settings provided by the user with command line / CLI ones (0)Provided by the user:

(0)Set via command line / CLI: emu_pipes_path: "/tmp/DIAMBRA/" roms_path: "/opt/diambraArena/roms/" binary_path: "/opt/diambraArena/mame/" lock_fps: true username: "10545863"


  .:-:-**#*#####+***+=-:. 

..-++####+#################=+=:. :+############################-. .-+#############################=. .-###########################+. .=######++======++##########=. ........ ... ....... ........... ........ ........ ............ ............ ........... -=------:---------=########- .------..-----:. .------. .-----------. --------. :-------- --------------:. --------------:. :----------: .:--------:---------:=####### .------..-------..------. :-----:-----:. ---------. .--------- ------:..:-----: ------:..------: .------:-----. .:------::---::--------:+#######..------. .------..------. .------.:-----. ----------.---------- ------:..------. ------: :-----. .:-----:.------. :-----::.:---:-----::---:###### .------. .------..------. .:-----: .------. ------:-------------- ------::-----:.. ------:.-----:. .------..------. .--:.. . :::.:.---.-:---.#####- .------. .------..------. .------.::------: ------:.-----.:------ ------: :-----: ------:.:------: .------:.:-------. ..:..:::.:::.::....:.::--:####. .------::------:..------..------:.:::------. ------: .---. :-----: -------::------: ------: ------: :------..::------: .-::::::---...:--. .-.:-::=###+. .::::::::::::.. .::::::..::::::. .::::::. ::::::. .:. ::::::: ::::::::::::::. ::::::: ::::::: ::::::. .:::::: .--:. .:--...::. ..::+####: :--:.. .---:...:. .:++:. .:---:::----.. .... .:. .:------:..:... :. ....... ..:. ..

                                                               DIAMBRAβ„’ | Dueling AI Arena
                                                          https://diambra.ai - info@diambra.ai

                               Usage of this software is subject to our Terms of Use described at https://diambra.ai/terms

                                                           DIAMBRA, Inc. Β© Copyright 2018-2024

(0)Environment initialization ... 🏟️ (6e3b) SHA256 check ok. Correct rom file found. 🏟️ (6e3b) Completed console init 🏟️ (6e3b) Fontconfig error: Cannot load default config file 🏟️ (6e3b) Warning: -video none doesn't make much sense without -seconds_to_run 🏟️ (6e3b) ALSA lib conf.c:4553:(snd_config_update_r) Cannot access file /usr/share/alsa/alsa.conf ALSA lib seq.c:935:(snd_seq_open_noupdate) Unknown SEQ default 🏟️ (6e3b) Unable to create history.db 🏟️ (6e3b) Unable to create history.db Unable to create history.db INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" Player 1 move: low punch Player 1 move: high punch Player 1 move: super attack 2 🏟️ (6e3b) Registering screen ... done. 🏟️ (6e3b) Registering audio ... done. 🏟️ (6e3b) Registering program ... done. 🏟️ (6e3b) Num. of Channels = 4 Screen Dim (W x H) = 384 224 🏟️ (6e3b) (Recorder) Frame encoding enabled. (Recorder) Compression quality: 95 (0)Buttons configuration: (0) HK = But6 (0) HP = But5 (0) MP = But1 (0) LP = But4 (0) LK = But3 (0) MK = But2 (0)Native frame shape = [224 X 384 X 4] (0)User defined frame_shape = [0 X 0 X 0] Resize flag = 0 Grayscale flag = 0 🏟️ (6e3b) (0)Move to start screen INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" 🏟️ (6e3b) (0)Adjust game settings INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" 🏟️ (6e3b) (0)done. INFO:diambra.arena.arena_gym:EnvironmentSettingsMultiAgent(game_id='sfiii3n', frame_shape=[0, 0, 0], step_ratio=6, disable_keyboard=True, disable_joystick=True, render_mode='human', splash_screen=False, rank=0, env_address='127.0.0.1:55125', grpc_timeout=600, seed=1711803860, difficulty=None, continue_game=0.0, show_final=False, tower=3, _last_seed=1711803860, pb_model=game_id: "sfiii3n" frame_shape { } step_ratio: 6 n_players: 2 disable_keyboard: true disable_joystick: true action_spaces: DISCRETE action_spaces: DISCRETE episode_settings { } , n_players=2, action_space=(1, 1), role=(None, None), characters=['Ken', 'Ken'], outfits=[1, 3], super_art=[3, 3], fighting_style=(None, None), ultimate_style=(None, None)) INFO:diambra.arena.arena_gym:Recording trajectories in "/Users/taozhiyu/Downloads/llm-colosseum/eval/diambra/episode_recording/sfiii3n/-/20240330210420" πŸ–₯️ Received signal, terminating Traceback (most recent call last): File "/Users/taozhiyu/Downloads/llm-colosseum/ollama.py", line 32, in main() File "/Users/taozhiyu/Downloads/llm-colosseum/ollama.py", line 14, in main game = Game( File "/Users/taozhiyu/Downloads/llm-colosseum/eval/game.py", line 175, in init self.observation, self.info = self.env.reset(seed=self.seed) File "/Users/taozhiyu/miniconda3/envs/streetfighter/lib/python3.9/site-packages/diambra/arena/wrappers/episode_recording.py", line 36, in reset obs, info = self.env.reset(**kwargs) File "/Users/taozhiyu/miniconda3/envs/streetfighter/lib/python3.9/site-packages/diambra/arena/arena_gym.py", line 126, in reset response = self.arena_engine.reset(request.episode_settings) File "/Users/taozhiyu/miniconda3/envs/streetfighter/lib/python3.9/site-packages/diambra/arena/engine/interface.py", line 39, in reset return self.client.Reset(episode_settings) File "/Users/taozhiyu/miniconda3/envs/streetfighter/lib/python3.9/site-packages/grpc/_channel.py", line 1173, in call ) = self._blocking( File "/Users/taozhiyu/miniconda3/envs/streetfighter/lib/python3.9/site-packages/grpc/_channel.py", line 1157, in _blocking event = call.next_event() File "src/python/grpcio/grpc/_cython/_cygrpc/channel.pyx.pxi", line 367, in grpc._cython.cygrpc.SegregatedCall.next_event File "src/python/grpcio/grpc/_cython/_cygrpc/channel.pyx.pxi", line 190, in grpc._cython.cygrpc._next_call_event File "/Users/taozhiyu/miniconda3/envs/streetfighter/lib/python3.9/threading.py", line 256, in enter def enter(self): KeyboardInterrupt Traceback (most recent call last): File "/Users/taozhiyu/miniconda3/envs/streetfighter/bin/diambra", line 5, in from diambra import main File "/Users/taozhiyu/miniconda3/envs/streetfighter/lib/python3.9/site-packages/diambra/main.py", line 2, in sys.exit(subprocess.call([ File "/Users/taozhiyu/miniconda3/envs/streetfighter/lib/python3.9/subprocess.py", line 351, in call return p.wait(timeout=timeout) File "/Users/taozhiyu/miniconda3/envs/streetfighter/lib/python3.9/subprocess.py", line 1189, in wait return self._wait(timeout=timeout) File "/Users/taozhiyu/miniconda3/envs/streetfighter/lib/python3.9/subprocess.py", line 1933, in _wait (pid, sts) = self._try_wait(0) File "/Users/taozhiyu/miniconda3/envs/streetfighter/lib/python3.9/subprocess.py", line 1891, in _try_wait (pid, sts) = os.waitpid(self.pid, wait_flags) KeyboardInterrupt make: *** [local] Interrupt: 2

(streetfighter) taozhiyu@192 llm-colosseum % ollama list NAME ID SIZE MODIFIED
qwen:0.5b-chat-v1.5-fp16 967f7a3593ba 1.2 GB 11 hours ago
qwen:1.8b-chat-v1.5-fp16 e3562f7740ef 3.7 GB 7 hours ago
qwen:14b-chat-v1.5-fp16 cb20f077361d 28 GB 4 days ago
qwen:4b-chat-v1.5-fp16 86621ca225c4 7.9 GB 7 hours ago
qwen:7b-chat-v1.5-fp16 39e2b1482d7d 15 GB 7 hours ago
(streetfighter) taozhiyu@192 llm-colosseum % make local diambra -r ~/.diambra/roms run -l python3 ollama.py πŸ–₯️ Starting DIAMBRA environment: πŸ–₯️ logged in v2.2: Pulling from diambra/engine Digest: sha256:6b5df5c9522553a4505bf0b6b0f837dd2f945de1b0b6390d83fdf24e317de643 Status: Image is up to date for diambra/engine:v2.2 Stored credentials found. Authorization granted. Server listening on 0.0.0.0:50051 πŸ–₯️ DIAMBRA environment started Player 1 using: ollama:qwen:4b-chat-v1.5-fp16 Player 2 using: ollama:qwen:4b-chat-v1.5-fp16 INFO:diambra.arena.engine.interface:Trying to connect to DIAMBRA Engine server (timeout=600s)... INFO:diambra.arena.engine.interface:... done. 🏟️ (6c62) (0)Overwriting sys_settings provided by the user with command line / CLI ones (0)Provided by the user: 🏟️ (6c62) (0)Set via command line / CLI: emu_pipes_path: "/tmp/DIAMBRA/" roms_path: "/opt/diambraArena/roms/" binary_path: "/opt/diambraArena/mame/" lock_fps: true username: "10545863"


  .:-:-**#*#####+***+=-:. 

..-++####+#################=+=:. :+############################-. .-+#############################=. .-###########################+. .=######++======++##########=. ........ ... ....... ........... ........ ........ ............ ............ ........... -=------:---------=########- .------..-----:. .------. .-----------. --------. :-------- --------------:. --------------:. :----------: .:--------:---------:=####### .------..-------..------. :-----:-----:. ---------. .--------- ------:..:-----: ------:..------: .------:-----. .:------::---::--------:+#######..------. .------..------. .------.:-----. ----------.---------- ------:..------. ------: :-----. .:-----:.------. :-----::.:---:-----::---:###### .------. .------..------. .:-----: .------. ------:-------------- ------::-----:.. ------:.-----:. .------..------. .--:.. . :::.:.---.-:---.#####- .------. .------..------. .------.::------: ------:.-----.:------ ------: :-----: ------:.:------: .------:.:-------. ..:..:::.:::.::....:.::--:####. .------::------:..------..------:.:::------. ------: .---. :-----: -------::------: ------: ------: :------..::------: .-::::::---...:--. .-.:-::=###+. .::::::::::::.. .::::::..::::::. .::::::. ::::::. .:. ::::::: ::::::::::::::. ::::::: ::::::: ::::::. .:::::: .--:. .:--...::. ..::+####: :--:.. .---:...:. .:++:. .:---:::----.. .... .:. .:------:..:... :. ....... ..:. ..

                                                               DIAMBRAβ„’ | Dueling AI Arena
                                                          https://diambra.ai - info@diambra.ai

                               Usage of this software is subject to our Terms of Use described at https://diambra.ai/terms

                                                           DIAMBRA, Inc. Β© Copyright 2018-2024

(0)Environment initialization ... 🏟️ (6c62) SHA256 check ok. Correct rom file found. 🏟️ (6c62) Completed console init 🏟️ (6c62) Fontconfig error: Cannot load default config file 🏟️ (6c62) Warning: -video none doesn't make much sense without -seconds_to_run 🏟️ (6c62) ALSA lib conf.c:4553:(snd_config_update_r) Cannot access file /usr/share/alsa/alsa.conf ALSA lib seq.c:935:(snd_seq_open_noupdate) Unknown SEQ default 🏟️ (6c62) Unable to create history.db Unable to create history.db 🏟️ (6c62) Unable to create history.db 🏟️ (6c62) Registering screen ... done. 🏟️ (6c62) Registering audio ... done. 🏟️ (6c62) Registering program ... done. 🏟️ (6c62) Num. of Channels = 4 Screen Dim (W x H) = 384 224 🏟️ (6c62) (Recorder) Frame encoding enabled. (Recorder) Compression quality: 95 (0)Buttons configuration: (0) LP = But4 (0) HK = But6 (0) MK = But2 (0) HP = But5 (0) MP = But1 (0) LK = But3 🏟️ (6c62) (0)Native frame shape = [224 X 384 X 4] (0)User defined frame_shape = [0 X 0 X 0] Resize flag = 0 Grayscale flag = 0 🏟️ (6c62) (0)Move to start screen 🏟️ (6c62) (0)Adjust game settings 🏟️ (6c62) (0)done. INFO:diambra.arena.arena_gym:EnvironmentSettingsMultiAgent(game_id='sfiii3n', frame_shape=[0, 0, 0], step_ratio=6, disable_keyboard=True, disable_joystick=True, render_mode='human', splash_screen=False, rank=0, env_address='127.0.0.1:55126', grpc_timeout=600, seed=1711803921, difficulty=None, continue_game=0.0, show_final=False, tower=3, _last_seed=1711803921, pb_model=game_id: "sfiii3n" frame_shape { } step_ratio: 6 n_players: 2 disable_keyboard: true disable_joystick: true action_spaces: DISCRETE action_spaces: DISCRETE episode_settings { } , n_players=2, action_space=(1, 1), role=(None, None), characters=['Ken', 'Ken'], outfits=[1, 3], super_art=[3, 3], fighting_style=(None, None), ultimate_style=(None, None)) INFO:diambra.arena.arena_gym:Recording trajectories in "/Users/taozhiyu/Downloads/llm-colosseum/eval/diambra/episode_recording/sfiii3n/-/20240330210521" 🏟️ (6c62) (0)WARNING: only one outfit selected by agent_0 while using 2P mode. Note that if a character faces himself, it will use the next available outfit. (0)2P Environment (0)Generic episode settings --- (0)Random seed: 42 (0)agent_0 Role: P1 (0)agent_0 Character(s): [Ken] (0)agent_0 Number of outfits: 1 (0)agent_1 Role: P2 (0)agent_1 Character(s): [Ken] (0)agent_1 Number of outfits: 1 (0)--- (0)Game-specific episode settings --- (0)agent_0 Super art: 3 (0)agent_1 Super art: 3 (0)--- (0)Restarting system and (optionally) setting difficulty 🏟️ (6c62) (0)Starting game 🏟️ (6c62) (0)Waiting for fight to start INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" Player 2 move: jump away Player 2 move: high punch INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" Player 1 move: move away Player 1 move: low punch Player 1 move: medium punch Player 1 move: high punch INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK" πŸ–₯️ Received signal, terminating

oulianov commented 8 months ago

Yes I tried as well and I assume it's too slow ? You see in the logs that Player1 and Player2 made some moves, but it took them a long time. This benchmark is highly dependent on how you serve the models. The goal is also to be fast !

taozhiyuai commented 8 months ago

I think It take too long time to generate tokens.

taozhiyuai commented 8 months ago

14B generates only 2 actions. so I think 7B is the best model size on my laptop.

oulianov commented 8 months ago

Yeah, it's better to use dedicated GPUs or an inference platform to make the fight more interesting

grigio commented 1 month ago

+1 I'd like to see a Qwen model compared in the leaderboard