lm-sys / FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Apache License 2.0
36.92k stars 4.55k forks

fastchat.serve.model_worker not loading checkpoint shards and outputs stderr messages #490

Open cenguix opened 1 year ago

cenguix commented 1 year ago

Dear FastChat Developers,

I am part of a research group working on integrating Semantic Web-based knowledge graphs with LLMs such as Vicuna. We are working on an open-source research project, with the goal of submitting a paper to a Semantic Web conference.

I followed all your guidelines for setting up the Vicuna weights as delta weights on top of LLaMA.

So far, on a Windows 11 machine with 64 GB RAM and a 4 GB GPU, I have been able to run the "Inference with Command Line Interface" section with: python -m fastchat.serve.cli --model-path ./models/vicuna-13b --device cpu --load-8bit

It runs OK but really slowly. I assume a machine with several GPUs, or a single powerful GPU, can run the Python scripts properly and load the checkpoint shards without any problems.

My interest is in running the RESTful API and client SDK, because I want to submit a large set of prompts and store the answers. When I run the following command:

python -m fastchat.serve.controller

this is the screen that appears:

[screenshot: fastchat1]

And when I run: python -m fastchat.serve.model_worker --model-name 'vicuna-13b' --model-path ./models/vicuna-13b --device cpu --load-8bit

It doesn't load the checkpoint shards and starts outputting stderr errors. In this screen you can see that the previous command loaded the checkpoint shards properly, but here it doesn't; instead, it starts outputting a series of stderr messages.

[screenshot: fastchat2]

There is an error line coming from fastchat/utils, line 86, def write(self, buf). I assume it must be a small error.

Any hints on how to overcome this and submit prompts in batches to the Vicuna model via the Python client SDK/API?
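For reference, once the controller, a model worker, and FastChat's OpenAI-compatible API server (python -m fastchat.serve.openai_api_server) are all running, batch submission can be sketched roughly as below. The endpoint path and port are assumptions based on the API server's documented defaults, and build_request/ask are hypothetical helper names, not FastChat SDK functions:

```python
# Sketch of batch prompting against FastChat's OpenAI-compatible API server.
# The URL and default port are assumptions; adjust to your deployment.
import json
import urllib.request

API_URL = "http://localhost:8000/v1/chat/completions"

def build_request(prompt, model="vicuna-13b"):
    """JSON body for one chat-completion call."""
    return {"model": model,
            "messages": [{"role": "user", "content": prompt}]}

def ask(prompt, model="vicuna-13b"):
    """POST one prompt and return the model's reply text."""
    data = json.dumps(build_request(prompt, model)).encode("utf-8")
    req = urllib.request.Request(
        API_URL, data=data, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# With the server up, a batch run is just a loop:
# answers = {p: ask(p) for p in prompts}
```

Storing the answers dict as JSON afterwards would cover the "submit a large set of prompts and store the answers" use case.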

Any help would be highly appreciated, and we will include a reference to FastChat in our research paper. Thanks.

Regards, Carlos F. Enguix

ch930410 commented 1 year ago

hello, have you resolved the issue? I have the same problem as you. Command: python -m fastchat.serve.model_worker --model-path eachadea/vicuna-13b-1.1 --num-gpus 4


cenguix commented 1 year ago

Hi, I did not solve the problem, but a colleague was able to run the scripts on macOS.

Regards, Carlos


ch930410 commented 1 year ago

I have already solved this problem. It comes from the model file; please update it. The current error means the GPU memory is insufficient to start the model.

cenguix commented 1 year ago

Thanks for letting me know.


viciaky commented 1 year ago

I have already solved this problem. The problem comes from the model file, please update it; the current state is GPU insufficient to start this. @ch930410

How did you solve this problem? What should be changed in the model file?

ch930410 commented 1 year ago

I have already solved this problem. The problem comes from the model file, please update it; the current state is that the GPU is insufficient to start it. @ch930410

How did you solve this problem? What should be changed in the model file?

The model file is the original one.

viciaky commented 1 year ago

It doesn't load the checkpoint shards and starts outputting stderr errors. In this screen you can see that the previous command loaded the checkpoint shards properly, but here it doesn't; instead, it starts outputting a series of stderr messages. @cenguix

have you solved the issue?

cenguix commented 1 year ago

No, a colleague was able to overcome this on a macOS machine.


dng037 commented 1 year ago

This should be a RAM problem. There isn't enough RAM to load the checkpoint shards hence the process just gets killed. I managed to solve the issue by changing to a system with more RAM (128GB), but this is probably overkill. Hope this helps!
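As a rough sanity check on the RAM explanation, the weights alone for a 13B-parameter model need on the order of n_params × bytes_per_param, and loaders often peak higher than that while copying shards into place. The estimate below is a back-of-envelope assumption, not FastChat's actual allocation behavior, but it shows why a 64 GB machine can struggle while 128 GB is comfortable:

```python
# Back-of-envelope RAM estimate for holding model weights in memory.
# Loading can peak well above this while shards are being copied in.
def weight_memory_gb(n_params, bytes_per_param):
    """GiB needed just to hold the weights."""
    return n_params * bytes_per_param / 1024**3

vicuna_13b = 13e9
print(f"fp32: {weight_memory_gb(vicuna_13b, 4):.1f} GiB")  # ~48.4
print(f"fp16: {weight_memory_gb(vicuna_13b, 2):.1f} GiB")  # ~24.2
print(f"int8: {weight_memory_gb(vicuna_13b, 1):.1f} GiB")  # ~12.1
```

So even with --load-8bit, an fp16 intermediate during loading can transiently need twice the final footprint, which is consistent with the process being killed on smaller machines.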

surak commented 1 year ago

@ch930410 you need to download the model again. This is not a problem with FastChat, but with the model. Let's close this one?
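Before re-downloading, it can help to check whether any checkpoint shard is truncated or corrupted. A hypothetical helper (shard_report and the pytorch_model-*.bin glob are assumptions about the standard Hugging Face sharded-checkpoint layout, not part of FastChat):

```python
# Hypothetical helper to spot truncated or corrupted checkpoint shards:
# record each shard's size and hash so a re-download can be compared.
import hashlib
from pathlib import Path

def sha256_of(path, chunk=1 << 20):
    """SHA-256 of a file, read in 1 MiB chunks to keep memory flat."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

def shard_report(model_dir):
    """Map each pytorch_model-*.bin shard to (size_in_bytes, sha256)."""
    return {
        p.name: (p.stat().st_size, sha256_of(p))
        for p in sorted(Path(model_dir).glob("pytorch_model-*.bin"))
    }
```

A shard whose size differs from the one published on the model's hub page is a strong sign the download was interrupted and should be fetched again.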