-
### Search before asking
- [X] I had searched in the [issues](https://github.com/tencentmusic/supersonic/issues?q=is%3Aissue) and found no similar issues.
### Description
The SQL generated by the LLM is unstable; the time condition does not satisfy the promp…
-
[07/17/2024-01:56:09] [TRT] [E] Error Code: 4: Internal error: plugin node QWenForCausalLM/transformer/layers/0/attention/wrapper/gpt_attention/PLUGIN_V2_GPTAttention_0
requires 26927499520 bytes of…
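To put the requested allocation in more familiar units, the byte count from the error message converts to roughly 25 GiB of workspace for the GPT attention plugin (a quick arithmetic check, not a statement about what TensorRT-LLM does internally):

```python
# Byte count copied from the TensorRT-LLM error message above.
required_bytes = 26_927_499_520

# Convert to GiB (1 GiB = 2**30 bytes).
required_gib = required_bytes / 2**30
print(f"{required_gib:.1f} GiB")  # → 25.1 GiB
```

That is more memory than many single GPUs have free after weights are loaded, which is consistent with the internal-error failure.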
-
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest
llm = LLM(model="/home/data/llm/LLaMA-Factory-qwen1.5/models/Qwen-1_8B-Chat", enable_lora=True, trust_remote_code=Tru…
-
When I use the Qwen2 series of models for inference on Ascend 910B, some behavior is abnormal.
When I set top_p = 1.0, the output gets garbled, which is obvious.
![image](https://gi…
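For context on what `top_p` controls: nucleus sampling keeps the smallest set of highest-probability tokens whose cumulative probability reaches `top_p`, so `top_p = 1.0` leaves every token a candidate and rare, low-probability tokens can be drawn. A minimal pure-Python sketch of the filtering step (an illustration, not vLLM's actual implementation):

```python
def nucleus_filter(probs, top_p):
    """Return the indices of the smallest prefix of tokens, sorted by
    probability descending, whose cumulative probability reaches top_p."""
    ranked = sorted(enumerate(probs), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for idx, p in ranked:
        kept.append(idx)
        cumulative += p
        if cumulative >= top_p:
            break
    return kept

probs = [0.6, 0.25, 0.1, 0.05]  # toy next-token distribution
print(nucleus_filter(probs, 0.9))  # → [0, 1, 2] (tail token dropped)
print(nucleus_filter(probs, 1.0))  # → [0, 1, 2, 3] (nothing filtered)
```

With nothing filtered at `top_p = 1.0`, any numerical issue in the logits (e.g. an overflow on a specific backend) shows up directly as garbled sampled tokens.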
-
Repository Not Found for url: https://huggingface.co/api/models/models/Qwen1.5-32B-Chat/revision/main.
Please make sure you specified the correct `repo_id` and `repo_type`.
If you are trying to acce…
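The doubled `models/` segment in the failing URL suggests the `repo_id` was passed as a local-style path (e.g. `models/Qwen1.5-32B-Chat`) rather than a Hub id of the form `<org>/<name>`. A hypothetical reproduction of how that URL arises (the `repo_id` value here is an assumption):

```python
# Assumed mis-set value: a local directory path used as a Hub repo_id.
repo_id = "models/Qwen1.5-32B-Chat"

# The Hub API URL for a model repo embeds the repo_id after "api/models/",
# which produces the doubled "models/models/" seen in the error.
api_url = f"https://huggingface.co/api/models/{repo_id}/revision/main"
print(api_url)
# A valid Hub repo_id would be e.g. "Qwen/Qwen1.5-32B-Chat".
```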
-
### Your current environment
pip3 install vllm==0.4.2 nvidia-ammo==0.7.1
Collecting environment information...
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: …
-
I selected the LM Studio provider, but I get the error: Failed to fetch the chat response.
![image](https://github.com/JudiniLabs/code-gpt-docs/assets/1768201/256c456f-5937-4bf7-b1b8-2f726c8485ef)
![image]…
-
![image](https://github.com/user-attachments/assets/311fb12d-8e15-4e9a-8dfe-5311e60451cf)
Is the slowdown in fine-tuning related to these two warnings? The fine-tuning speed is abnormally slow.
-
Hello, when I load the Qwen1.5-72B-Chat model, it reports an error at:
model = get_peft_model(model, lora_config)
File "/data1/anaconda/env/ysj_qwen/lib/python3.10/site-packages/peft-0.10.1.dev0-py3.10.egg/peft/mapping.py", line 149, …
-
Support quantized models.
For example:
https://huggingface.co/THUDM/chatglm2-6b-int4
https://huggingface.co/Qwen/Qwen1.5-72B-Chat-GPTQ-Int4