Open cason0126 opened 4 months ago
I have an Ascend 910B. How do I deploy on it? It seems only 1.0 supports MindSpore; can you share how to deploy 1.5 or 2?
It depends on whether you use the CLI or the worker; either way, just specify device = npu.
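As a rough sketch of the advice above: the helper below only assembles the launch command for either entry point. `build_fastchat_command` is a hypothetical helper written for this illustration, and the model path is a placeholder; the module names `fastchat.serve.cli` and `fastchat.serve.model_worker` and the `--model-path` and `--device` flags are the ones FastChat provides.

```python
def build_fastchat_command(mode, model_path, device="npu"):
    """Build the argv list for launching FastChat via the CLI or a model worker.

    Hypothetical helper for illustration; it does not run anything.
    """
    modules = {
        "cli": "fastchat.serve.cli",              # interactive chat in the terminal
        "worker": "fastchat.serve.model_worker",  # worker behind the controller/API server
    }
    if mode not in modules:
        raise ValueError(f"mode must be one of {sorted(modules)}, got {mode!r}")
    return [
        "python3", "-m", modules[mode],
        "--model-path", model_path,
        "--device", device,
    ]


# Placeholder model path; substitute the checkpoint you actually use.
print(" ".join(build_fastchat_command("cli", "Qwen/Qwen1.5-7B-Chat")))
```

The resulting list can be passed to `subprocess.run` or copied into a shell as a single command.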
When I use the Qwen2 series of models for inference on the Ascend 910B, some things do not behave normally.
When I set top_p = 1.0, the output is clearly garbled: ![image](https://github.com/lm-sys/FastChat/assets/35160064/44c64880-3517-452c-a3a8-5c07a1b9338d)
But when I set it to 0.9, it looks normal: ![image](https://github.com/lm-sys/FastChat/assets/35160064/a38c9086-05e1-4849-973b-35c02dcadf44)
At first, I thought it was a problem with the NPU, but when I used the official code, the result was correct with top_p = 1.0: ![image](https://github.com/lm-sys/FastChat/assets/35160064/4d884f27-c20d-4377-82d6-b3b1c66e5216)
Both approaches were run in the same environment: FastChat = 0.2.36, Transformers = 4.37.0.
So I've ruled out the environment as the cause for now.
Why is this happening?
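For context on what the two settings change, here is a minimal standard-library sketch of top-p (nucleus) filtering. It is not FastChat's or Transformers' implementation, and `top_p_filter` and the toy probabilities are made up for illustration. It only shows that top_p = 0.9 drops the low-probability tail while top_p = 1.0 keeps every token as a sampling candidate; it does not by itself explain the garbled output on the NPU.

```python
def top_p_filter(probs, top_p):
    """Return the indices of tokens that remain candidates under top-p filtering.

    Keeps the smallest set of most-probable tokens whose cumulative
    probability reaches top_p. With top_p >= 1.0 nothing is filtered.
    """
    if top_p >= 1.0:
        return list(range(len(probs)))  # no truncation: the full distribution is sampled

    # Visit tokens from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return sorted(kept)


# Toy next-token distribution (made-up numbers).
probs = [0.5, 0.3, 0.15, 0.04, 0.01]
print(top_p_filter(probs, 0.9))  # [0, 1, 2]
print(top_p_filter(probs, 1.0))  # [0, 1, 2, 3, 4]
```

With top_p = 1.0 the unlikely tail tokens stay in play, so any difference in how the two code paths compute or sample from the probabilities would show up more readily at that setting than at 0.9.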