-
### The following items must be checked before submission
- [X] Make sure you are using the latest code from the repository (git pull); some issues have already been resolved and fixed.
-
### Describe the question.
I am currently running inference with an internlm2-chat-7b model fine-tuned with QLoRA, calling the model.chat() interface one item at a time; a single inference takes about 5 s. Since I have a large amount of data that needs batch inference, this is inefficient. How can I speed up single-item inference, or is batch inference supported?
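A minimal sketch of the batching idea, independent of any particular backend: chunk the dataset and hand each chunk to a batch-capable generate call. The `generate_batch` callable here is an assumption standing in for whatever batched API is used (e.g. an lmdeploy `pipeline` object, which accepts a list of prompts, or a padded `model.generate` call in transformers).

```python
from typing import Callable, Iterable, List


def chunked(items: List[str], batch_size: int) -> Iterable[List[str]]:
    """Yield successive fixed-size batches from a list of prompts."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def run_batched(prompts: List[str],
                generate_batch: Callable[[List[str]], List[str]],
                batch_size: int = 8) -> List[str]:
    """Run inference over all prompts, batch_size items at a time.

    generate_batch is assumed to map a list of prompts to a list of
    completions of the same length (hypothetical; supply your backend's
    batched call here).
    """
    outputs: List[str] = []
    for batch in chunked(prompts, batch_size):
        outputs.extend(generate_batch(batch))
    return outputs
```

With lmdeploy specifically, `pipeline(model_path)` returns an object that already accepts a list of prompts in one call, which is usually much faster than looping over `model.chat()` one item at a time.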
-
Thanks to the authors for their contribution. I noticed on the leaderboard that the 14B model scores higher in overall capability than the 72B model, which I find confusing.
-
I want to add new tokens to expand the vocabulary and resize the embedding of an LLM. I wonder whether I can use Xtuner to fine-tune the embedding layer of the LLM?
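Conceptually, resizing the embedding for new tokens means growing the embedding matrix while keeping the existing rows and initializing the new ones (in HuggingFace transformers this is what `tokenizer.add_tokens(...)` followed by `model.resize_token_embeddings(len(tokenizer))` does). A pure-Python sketch of that resize step; mean-initializing the new rows is one common heuristic, not necessarily what any particular library uses:

```python
from typing import List

Vector = List[float]


def resize_embedding(table: List[Vector], new_vocab_size: int) -> List[Vector]:
    """Grow an embedding table to new_vocab_size rows.

    Existing rows are kept unchanged; each new row is initialized to the
    mean of the old rows (a common heuristic for newly added tokens).
    """
    if new_vocab_size <= len(table):
        return table[:new_vocab_size]
    dim = len(table[0])
    mean_row = [sum(row[i] for row in table) / len(table) for i in range(dim)]
    return table + [list(mean_row) for _ in range(new_vocab_size - len(table))]
```

After a resize like this, the newly added rows are the parameters that actually need training, which is why fine-tuning (or at least unfreezing) the embedding layer matters when the vocabulary is extended.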
-
### Checklist
- [ ] 1. I have searched related issues but cannot get the expected help.
- [ ] 2. The bug has not been fixed in the latest version.
### Describe the bug
Modified use_logn_attn in config.ini …
-
### Describe the bug
![image](https://github.com/InternLM/InternLM/assets/1728593/aae0bba6-f403-462c-a01c-9aa52cc8dafb)
### Environment
lmdeploy version 0.2.2
### Other information
_No response_
-
As in the title. The line that raises the error:
```python
from lmdeploy import pipeline, GenerationConfig, TurbomindEngineConfig
backend_config = TurbomindEngineConfig(rope_scaling_factor=2.0, session_len=160000)
model_path = "/outp…
```
-
Environment
datasets 2.17.1
transformers 4.37.1
xtuner 0.1.13
Model
internlm2-20b
Example link
https://github.com/I…
-
### Describe the bug
Running
[web_demo.py](https://github.com/InternLM/InternLM/blob/main/chat/web_demo.py)
on a server with an A100 (1/4) * 2 configuration at https://studio.intern-ai.org.cn/.
The displayed output:
![image](https://github.com/InternLM/…
-
**Feature description**
Outline the structure of an article.