-
Looking forward to support for Qwen1.5 being added, including Qwen1.5-7B-Chat, Qwen1.5-7B-Chat-GPTQ-Int8, and so on.
Qwen1.5 is more powerful than Qwen.
Thank you.
-
Thank you for your contributions. I have a question regarding why the pretraining_length is 32384, while in https://huggingface.co/Qwen/Qwen1.5-14B-Chat/blob/main/config.json, the "max_position_embedd…
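For reference, a minimal sketch of reading that field straight from the published config, assuming the standard transformers `AutoConfig` API (not part of the original question):
```python
# Minimal sketch: print the context-length field shipped in the model's config.json.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("Qwen/Qwen1.5-14B-Chat")
print(config.max_position_embeddings)
```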
-
Training code:
```python
from datasets import Dataset
import pandas as pd
from transformers import AutoTokenizer, AutoModelForCausalLM, \
    DataCollatorForSeq2Seq, TrainingArguments, Trainer
import to…
```
-
For `XX` in [A2.7B-Chat, A2.7B]:
Check upon issue creation:
* [x] The model has not been evaluated yet and doesn't show up on the [CoT Leaderboard](https://huggingface.co/spaces/logikon/open_cot…
-
How to reproduce the Qwen1.5-7B-Chat results as reported here:
![image](https://github.com/QwenLM/Qwen1.5/assets/17668109/3c634665-4907-4d0b-b60c-b097a9a70981)
I got TOEFL = 30.198 using https://github.c…
-
I am currently verifying all the tasks under the `lm-evaluation-harness`. I will raise the issues I encounter one by one in this thread. Thank you for checking and responding! @haileysch…
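For context, a minimal sketch of running a single task through the harness's Python entry point (`lm_eval.simple_evaluate`, available in v0.4+ of lm-evaluation-harness); the model and task below are placeholders, not the exact setup used in this thread:
```python
# Minimal sketch: evaluate one illustrative task with the lm-evaluation-harness Python API.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=Qwen/Qwen1.5-7B-Chat,dtype=bfloat16",
    tasks=["gsm8k"],
    batch_size=8,
)
print(results["results"])
```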
-
I saved the Qwen1.5-4B and 7B Int4 models on my computer; when loading these models, there are some errors:
Some weights of the model checkpoint at ./models/qwen1.5-4b were not used when initializing Q…
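For comparison, a minimal sketch of how such a local Int4 (GPTQ) checkpoint is typically loaded with transformers (it needs a GPTQ backend such as `auto-gptq`/`optimum`, plus `accelerate`); the arguments here are assumptions about a common setup, not a confirmed fix for the warning above:
```python
# Minimal sketch: load the local Int4 checkpoint mentioned in the report above.
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "./models/qwen1.5-4b",
    device_map="auto",   # quantized kernels expect GPU placement; requires accelerate
    torch_dtype="auto",
)
tokenizer = AutoTokenizer.from_pretrained("./models/qwen1.5-4b")
```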
-
As the title says: starting from Qwen1.5-14B, I did continued pretraining on 95B tokens and then SFT, and found that beyond 32k the output degenerates into repetition, with decoding repeatedly generating certain strings.
Concretely: for 100k-length training I kept the RoPE base unchanged at 1M and set max_position_embeddings and the sequence length to 100k.
Is there anything wrong with doing it this way? At first glance it looks like the positional encoding wasn't learned well? Or is it a configuration problem?
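For reference, the setup described above corresponds roughly to these config fields (the field names assume the Qwen2 config class in transformers; the values are the ones stated in this report):
```python
# Minimal sketch: the long-context settings described above, expressed as config fields.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("Qwen/Qwen1.5-14B")
config.rope_theta = 1_000_000             # RoPE base left unchanged at 1M
config.max_position_embeddings = 102_400  # raised to ~100k for long-context training
print(config.rope_theta, config.max_position_embeddings)
```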
-
For `{XX}` in [0.5B, 1.8B, 4B, 7B, 14B, 32B and 72B]:
Check upon issue creation:
* [x] The model has not been evaluated yet and doesn't show up on the [CoT Leaderboard](https://huggingface.co/sp…
-
Following the steps on the English page, I modified the model, using qwen1.5-7b-chat, and ran it with `python simple_pipeline.py`; it errors out as follows:
OSError: Not enough disk space. Needed: Unknown size (download: Unknown size, generated: Unknown size, post-processed: Unknown…