OpenGenerativeAI / llm-colosseum

Benchmark LLMs by fighting in Street Fighter 3! The new way to evaluate the quality of an LLM
https://huggingface.co/spaces/junior-labs/llm-colosseum
MIT License

Fighters not fighting #60

Open nickschuetz opened 1 month ago

nickschuetz commented 1 month ago

Despite following the instructions in the README exactly and trying the make run, make local, and make demo targets, the window pops up, the players just sit there, and the clock ticks down to zero. Then the console declares Player 1 the winner even though no actual fight took place. What am I not doing, or doing wrong?

P.S. Is there a Discord for this project?

nickschuetz commented 1 month ago

Here's the output I'm seeing in the terminal:

🏟️  (b18f) (0)Starting game
🏟️  (b18f) (0)Waiting for fight to start
2024-08-07 09:37:46.525 Python[54119:2568961] WARNING: Secure coding is not enabled for restorable state! Enable secure coding by implementing NSApplicationDelegate.applicationSupportsSecureRestorableState: and returning YES.
Exception in thread Thread-5:
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/threading.py", line 1073, in _bootstrap_inner
    self.run()
  File "/Users/ken/PROJECTS/DIAMBRA/llm-colosseum/eval/game.py", line 369, in run
    self.game.player_1.robot.plan()
  File "/Users/ken/PROJECTS/DIAMBRA/llm-colosseum/agent/robot.py", line 134, in plan
    next_steps_from_llm = self.get_moves_from_llm()
                          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ken/PROJECTS/DIAMBRA/llm-colosseum/agent/robot.py", line 293, in get_moves_from_llm
    llm_stream = self.call_llm()
                 ^^^^^^^^^^^^^^^
  File "/Users/ken/PROJECTS/DIAMBRA/llm-colosseum/agent/robot.py", line 369, in call_llm
    resp = client.stream_chat(messages)
           ^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'stream_chat'
Exception in thread Thread-6:
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/threading.py", line 1073, in _bootstrap_inner
    self.run()
  File "/Users/ken/PROJECTS/DIAMBRA/llm-colosseum/eval/game.py", line 383, in run
    self.game.player_2.robot.plan()
  File "/Users/ken/PROJECTS/DIAMBRA/llm-colosseum/agent/robot.py", line 134, in plan
    next_steps_from_llm = self.get_moves_from_llm()
                          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ken/PROJECTS/DIAMBRA/llm-colosseum/agent/robot.py", line 293, in get_moves_from_llm
    llm_stream = self.call_llm()
                 ^^^^^^^^^^^^^^^
  File "/Users/ken/PROJECTS/DIAMBRA/llm-colosseum/agent/robot.py", line 369, in call_llm
    resp = client.stream_chat(messages)
           ^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'stream_chat'
🏟️  (b18f) (0)Round won by P1
(0)Moving to next round
 Player1 mistral:mistral-small-latest 'Baby' won!
🏟️  (b18f) Closing console
🏟️  (b18f) (0)----------------------------------------
(0)   Thank you for using DIAMBRA Arena.
(0)   RL is the main road torwards AGI,
(0) better to be prepared... Keep rocking!
(0)                   -
🏟️  (b18f) (0)          https://diambra.ai
(0)----------------------------------------
oulianov commented 2 weeks ago

Hello!

Looks like this error was related to #62: https://github.com/OpenGenerativeAI/llm-colosseum/issues/62. I pushed your suggested fix of renaming ollama.py to local.py. Thank you for figuring this out!
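
For anyone curious why the rename matters: a local file named ollama.py shadows the pip-installed ollama package, because Python puts the script's own directory first on sys.path. Here is a minimal way to check which module actually got imported (an illustrative sketch, assuming the ollama package is installed; this is not code from the repo):

```python
import ollama  # with a sibling ollama.py, this resolves to the local file

# If this prints a path inside the project instead of site-packages,
# the local module is shadowing the installed package, and attributes
# from the real package will be missing at runtime.
print(ollama.__file__)
```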

For the Discord, we hang out on the Diambra Discord: https://discord.com/invite/jDcW8z4gft. My user tag is roger_jacques.

nickschuetz commented 2 weeks ago

#62 does fix the issue for make local. However, the issue still remains when using remote models.

oulianov commented 2 weeks ago

What is the error message you see when using remote models? AttributeError: 'NoneType' object has no attribute 'stream_chat'?

nickschuetz commented 2 weeks ago

> What is the error message you see when using remote models? AttributeError: 'NoneType' object has no attribute 'stream_chat'?

Yes, that's the one.

oulianov commented 1 week ago

Okay, I understand now.

This is my fault: I merged a PR 5 months ago that refactored a critical part and inadvertently removed Mistral from the list of supported models.

I changed this on main: https://github.com/OpenGenerativeAI/llm-colosseum/pull/68
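
For context on the traceback above: client was None by the time call_llm ran. A provider-factory pattern like the sketch below reproduces the exact symptom when a model prefix such as mistral: is dropped from the mapping (all names here are hypothetical, not the repo's actual code):

```python
from typing import Optional

class FakeProviderClient:
    """Hypothetical stand-in for a real provider SDK client."""
    def stream_chat(self, messages: list) -> str:
        return "..."

SUPPORTED_PROVIDERS = {
    "openai": FakeProviderClient,
    # "mistral": FakeProviderClient,  # <- the entry lost in the refactor
}

def get_client(model: str) -> Optional[FakeProviderClient]:
    provider = model.split(":")[0]           # "mistral:mistral-small-latest" -> "mistral"
    cls = SUPPORTED_PROVIDERS.get(provider)  # silently yields None for unknown providers
    return cls() if cls else None

client = get_client("mistral:mistral-small-latest")
client.stream_chat([])  # AttributeError: 'NoneType' object has no attribute 'stream_chat'
```

Raising a ValueError for an unknown provider instead of returning None would surface this kind of regression immediately rather than deep inside a worker thread.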

Can you please follow these instructions:

  1. Fetch the latest version: git checkout main, then git pull
  2. Create a new Python virtual environment: python -m venv env
  3. Activate the env: source env/bin/activate (if you're on Windows, the command is different)
  4. Install requirements with make install. If that doesn't work, run pip install -r requirements.txt
  5. Try first to run with OpenAI: make run (you need to create a .env file with an OpenAI API key, and you need Docker installed and running; see the sanity check after this list)
  6. Launch Ollama locally (ollama serve), install the mistral model (ollama run mistral), then run make local to launch the local version
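
For step 5, here is a quick way to confirm the .env file is actually being read (a sketch assuming the project loads it with python-dotenv; the variable names are guesses, so check the README for the exact ones):

```python
# check_env.py - run from the repo root after creating your .env
import os

from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # loads .env from the current working directory

for key in ("OPENAI_API_KEY", "MISTRAL_API_KEY"):  # hypothetical names
    print(f"{key}: {'set' if os.environ.get(key) else 'MISSING'}")
```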

I tried all of this on my machine and it worked fine, so I'm not sure which step actually fails for you. As you can see, there are a lot of setup steps and Python gotchas.

Please tell me if it works now. If it still fails, please tell me at which step.

Feel free to book a call with me here: https://cal.com/nicolas-oulianov. We'll get to the bottom of this!