lm-sys / FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Apache License 2.0

FastChat API completion error #2749

Closed JanMarkD closed 10 months ago

JanMarkD commented 10 months ago

I tried to run the FastChat API on my MacBook Pro M1 with the following steps:

python3 -m fastchat.serve.controller
python3 -m fastchat.serve.model_worker --model-path lmsys/vicuna-7b-v1.5 --device mps
python3 -m fastchat.serve.openai_api_server --host localhost --port 8000

When I try to use the API with this curl request:

curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "vicuna-7b-v1.5",
    "prompt": "Once upon a time",
    "max_tokens": 41,
    "temperature": 0.5
  }'
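For reference, the same request can be made from Python. This is a minimal sketch mirroring the curl command above; it assumes the FastChat OpenAI-compatible server is running on localhost:8000, and the commented-out send step requires the third-party `requests` package:

```python
import json

# Endpoint assumed from the curl example above.
API_URL = "http://localhost:8000/v1/completions"

# Same payload as the curl request.
payload = {
    "model": "vicuna-7b-v1.5",
    "prompt": "Once upon a time",
    "max_tokens": 41,
    "temperature": 0.5,
}
body = json.dumps(payload)

# To actually send it (needs `pip install requests` and a running server):
# import requests
# resp = requests.post(API_URL, headers={"Content-Type": "application/json"}, data=body)
# print(resp.json()["choices"][0]["text"])
```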

It gives me this error:

2023-11-28 22:29:34 | ERROR | stderr |   File "/Users/jmdannenberg/Desktop/TU Delft/Year 6/Semester 1/Code/posts-analysis.nosync/venv/lib/python3.9/site-packages/fastchat/model/monkey_patch_non_inplace.py", line 22, in apply_rotary_pos_emb
2023-11-28 22:29:34 | ERROR | stderr |     gather_indices = gather_indices.repeat(1, cos.shape[1], 1, cos.shape[3])
2023-11-28 22:29:34 | ERROR | stderr | IndexError: tuple index out of range
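For context, the failing line reads `cos.shape[3]`, so the error means the `cos` tensor has fewer than four dimensions at that point. A minimal sketch of the failure mode, using plain tuples to stand in for tensor shapes (the real code indexes a `torch.Size`, which behaves like a tuple; both shapes below are hypothetical):

```python
# Hypothetical 4-D shape the monkey patch expects for `cos`.
cos_shape_old = (1, 32, 16, 128)
# Hypothetical lower-rank shape, as produced by a changed rotary-embedding path.
cos_shape_new = (16, 128)

print(cos_shape_old[3])  # fine: prints 128

try:
    cos_shape_new[3]     # same index on a shorter shape
except IndexError as e:
    print(e)             # tuple index out of range
```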

Does anyone have any clue what I'm doing wrong or what is missing?

infwinston commented 10 months ago

Related issue: https://github.com/lm-sys/FastChat/issues/2694

This issue has been fixed in master. We'll release it to PyPI soon. @merrymercy