Alpha-VLLM / LLaMA2-Accessory

An Open-source Toolkit for LLM Development
https://llama2-accessory.readthedocs.io/

Assistant responds with nothing in SPHINX inference.py #104

Closed staymylove closed 9 months ago

staymylove commented 10 months ago

Dear author, when I run `python inference.py` (SPHINX), after the parameters finish loading, the assistant responds with nothing (I waited almost 5 minutes). What could the problem be?

[screenshot attached]

ChrisLiu6 commented 10 months ago

Hi! You are running the script with model parallel size 2, so two processes, rank0 and rank1, are spawned. I guess rank1 failed for some reason, but because of this line of code: https://github.com/Alpha-VLLM/LLaMA2-Accessory/blob/512943221c8872bcc9d11512e5cbcc039ad0b575/SPHINX/inference.py#L62 output from rank1 is suppressed, so you never saw the error message. Since communication between the two ranks is needed during model inference, rank0 then hangs waiting for rank1.
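For context, a common way model-parallel scripts keep the console clean is to redirect stdout/stderr to `/dev/null` on every rank except rank 0. The sketch below illustrates that pattern (the function name is hypothetical; the exact code on the linked line may differ), and also shows why a crash on rank1 becomes invisible:

```python
import os
import sys

def silence_non_master(rank: int) -> None:
    """Redirect stdout/stderr to /dev/null on every rank except rank 0.

    Hypothetical helper illustrating the common pattern; the actual line
    in SPHINX/inference.py may differ. Side effect: any traceback printed
    by a silenced rank is discarded, so a crash on rank1 looks like a hang.
    """
    if rank != 0:
        devnull = open(os.devnull, "w")
        sys.stdout = devnull
        sys.stderr = devnull
```

With this in place, an exception raised on rank1 writes its traceback to the discarded stream, while rank0 blocks on the next collective operation, which matches the silent 5-minute wait described above.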

To figure out what the problem is, you can comment out the aforementioned line of code and check whether rank1 prints any useful message.
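Besides commenting the line out, another option is to redirect each non-zero rank's output to a per-rank log file instead of discarding it, so a crash on rank1 leaves a readable traceback without cluttering the console. A minimal sketch (the helper name and `log_dir` layout are assumptions, not part of the repo):

```python
import os
import sys

def log_non_master(rank: int, log_dir: str = "rank_logs") -> None:
    """Hypothetical alternative to discarding non-rank-0 output:
    write each non-zero rank's stdout/stderr to rank_logs/rank<N>.log,
    so error tracebacks survive for inspection after a hang."""
    if rank != 0:
        os.makedirs(log_dir, exist_ok=True)
        log_file = open(os.path.join(log_dir, f"rank{rank}.log"), "w")
        sys.stdout = log_file
        sys.stderr = log_file
```

If rank1 then dies, `cat rank_logs/rank1.log` should show the actual exception.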