-
I first used the model-conversion script to convert llama2-7b from huggingface to megatron,
but training fails with a shape error:
```
Traceback (most recent call last):
File "/code/xx/LLM_mine/reference/Megatron-LLaMA/pretrain_llama.py", line 119, in <module>
…
-
**Describe the bug**
What the bug is and how to reproduce it, ideally with screenshots.
**Your hardware and system info**
Write your system info like CUDA version/system/GPU/torc…
-
Data generation fails with an error saying the model's context length was exceeded. I assume there is something wrong with my input data, but it's hard to tell because the error message doesn't give me any pointers.
…
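Absent a clearer error, one common workaround is to clip each sample to the context window before generation. A minimal sketch of that idea, assuming the token ids are already available (`truncate_ids`, `max_context`, and `reserve_for_output` are illustrative names, not part of any library's API):

```python
def truncate_ids(token_ids, max_context, reserve_for_output=0):
    """Clip a token-id sequence so prompt + generated tokens fit the context window."""
    budget = max_context - reserve_for_output
    if budget <= 0:
        raise ValueError("reserve_for_output leaves no room for the prompt")
    return token_ids[:budget]

# Example: a 10-token prompt against an 8-token context, reserving 3 tokens
# for the model's output, leaves a 5-token prompt.
clipped = truncate_ids(list(range(10)), max_context=8, reserve_for_output=3)
print(len(clipped))  # 5
```

Logging any samples that get clipped this way also makes it easy to spot which inputs were triggering the error.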
-
I am trying to finetune llama3-70B on a trn1.32xlarge instance using distributed training. It failed with the following error:
Container image: f"763104351884.dkr.ecr.{region}.amazonaws.com/pytorch-training-neur…
-
### Your current environment
We are working on accelerating RLHF algorithms and need to broadcast the weights of the DeepSpeed engine to the vLLM Ray worker. In v0.4.2, we were able to create an ad…
-
I think it would be helpful to be able to import/export templates. Then, when I load a new model with llamafile, I can simply point it to some template file definition that contains all the req…
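One possible shape for such a template file, purely as an illustration: the field names below are hypothetical, not an existing llamafile format, and only sketch what an import/export round trip could look like.

```python
import json

# Hypothetical template definition -- none of these keys are an existing
# llamafile format; they only illustrate a file the CLI could be pointed at.
template = {
    "name": "chatml-example",
    "system_prefix": "<|im_start|>system\n",
    "user_prefix": "<|im_start|>user\n",
    "assistant_prefix": "<|im_start|>assistant\n",
    "turn_suffix": "<|im_end|>\n",
    "stop": ["<|im_end|>"],
}

# Export to disk, then re-import, to show the round trip is lossless.
with open("template.json", "w") as f:
    json.dump(template, f, indent=2)

with open("template.json") as f:
    loaded = json.load(f)

print(loaded["name"])  # chatml-example
```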
-
What would it take for the project to add support for distributed inference?
-
I have been using the vllm integration from fastchat to host multiple vllm models. However, it does not offer vllm's full capability; e.g., it does not support beam search.
I would like to propose…
-
pip install deepspeed
Then I ran sh ds_all.sh directly,
but the following error appeared; I'd like to know what is going on:
```bash
zero_nlp-main/chinese_bloom$ sh ds_all.sh
[2023-06-03 12:26:34,143] [WARNING] [runner.py:191:fetch_hostfile] Unable to find hostfil…
-
$ python -m vllm.entrypoints.openai.api_server --host 0.0.0.0 --port 8001 --model Qwen1.5-14B-Chat-AWQ --tensor-parallel-size 2 --quantization awq --trust-remote-code --dtype half
INFO 02-26 1…
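Once the server above is up, it exposes an OpenAI-compatible API on port 8001, so a chat request can be built like this (the prompt text is illustrative; the model name must match the `--model` flag):

```python
import json

# Request body for the OpenAI-compatible /v1/chat/completions endpoint
# served by the command above. "model" must match the --model flag.
payload = {
    "model": "Qwen1.5-14B-Chat-AWQ",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 64,
}
body = json.dumps(payload)
# Send it with, e.g.:
#   curl http://localhost:8001/v1/chat/completions \
#        -H "Content-Type: application/json" -d "$body"
print(json.loads(body)["model"])  # Qwen1.5-14B-Chat-AWQ
```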