-
Following suggestions found online, I tried changing the transformers version, and also updated the version recorded in the model's config to match, but that did not solve the problem.
```json
{
  "architectures": [
    "TinyllmForCausalLM"
  ],
  "attention_dropout": 0.0,
  "hidden_act": "silu",
  "hidden_size": 512,
  "initializer…
```
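Since the report revolves around keeping the `transformers_version` field in the model's `config.json` in sync with the installed library, here is a minimal sketch of patching that field with the standard library. The version strings and the throwaway path are assumptions for illustration, not taken from the issue:

```python
import json
import tempfile
from pathlib import Path

# Demo on a throwaway config.json; in practice, point config_path at your
# model directory. Both version strings below are assumed, not from the issue.
tmp_dir = Path(tempfile.mkdtemp())
config_path = tmp_dir / "config.json"
config_path.write_text(json.dumps({
    "architectures": ["TinyllmForCausalLM"],
    "transformers_version": "4.38.0",  # assumed value saved with the checkpoint
}))

config = json.loads(config_path.read_text())
config["transformers_version"] = "4.40.0"  # assumed locally installed version
config_path.write_text(json.dumps(config, indent=2))

print(json.loads(config_path.read_text())["transformers_version"])  # → 4.40.0
```

Editing the field this way only changes metadata; if the architecture class (`TinyllmForCausalLM` here) is not registered in the installed transformers build, the load will still fail regardless of the version string.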
-
### Your current environment
Python==3.10.14
vllm==0.5.0.post1
ray==2.24.0
Node status
---------------------------------------------------------------
Active:
1 node_37c2b26800cc853721ef351c…
-
Hi,
I'm having an issue when trying to convert starcoder2-3b with SmoothQuant to TensorRT-LLM.
I'm running on an A100 40GB.
This is my command:
`python tensorrt_llm/examples/gpt/convert_checkpoint.py --mod…
-
GPU: 2 Arc cards
Running the following example:
[inference-ipex-llm](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/Pipeline-Parallel-Inference)
**for mistral and codell…
-
### Your current environment
Collecting environment information...
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubunt…
-
Trying to do inference on an Arc GPU machine; I have followed this guideline:
https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/Pipeline-Parallel-Inference
and run_mi…
-
```
TypeError: Too few parameters for ; actual 2, expected 3
[2024-04-12 07:26:48,924] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 1263) of binary…
```
-
### System Info
- GPU name: L40s
- CUDA: 12.1
```
Wed Jun 5 16:27:21 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.14 …
```
-
Deployed with the v0.12.0 Docker image; the launch command is as follows:
sudo docker run -d -v /home/tskj/MOD/:/home/MOD/ -e XINFERENCE_HOME=/home/MOD -p 9997:9997 --gpus all xprobe/xinference:v0.12.0 xinference-local -H 0.0.0.0 --log-level de…
-
### Is there an existing issue / discussion for this?
- [X] I have searched the existing issues / discussions
### Is there an existing answer for this in the FAQ?