-
Some weights of the model checkpoint at ./BAAI/bge-reranker-v2-minicpm-layerwise were not used when initializing LayerWiseMiniCPMForCausalLM: ['lm_head.0.linear_head.weight', 'lm_head.1.linear_head.we…
-
Hi guys, I have some questions about fine-tuning.
1. In `finetune_lora.sh`, q and k are selected as the `lora_target_modules`. My understanding is that, considering the efficiency of LoRA, q and v should b…
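For context on this question, here is a minimal, hedged sketch of a peft `LoraConfig` that targets the q and v projections instead. The module names `q_proj`/`v_proj` are an assumption and vary by architecture:

```python
from peft import LoraConfig

# Minimal sketch: target the query and value projections, as the original
# LoRA paper recommends. The module names below are an assumption; check
# the model's actual layer names (e.g. via model.named_modules()).
config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # q and v instead of q and k
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
```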
-
### Description
https://github.com/OpenBMB/LLMFarm-MiniCPM
After selecting the model per this document, in step 2 the template cannot be found in the iOS app LLM Farm, so the model cannot be used.
![image](https://github.com/OpenBMB/MiniCPM/assets/41416092/64e98d0b-0ea5-4588-…
-
- Make the training process faster
- All training samples should be padded to the same `max_length` (see the sketch below)
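A minimal sketch of the requested behavior, assuming a Hugging Face tokenizer (the checkpoint name is illustrative): padding every sample to a fixed `max_length` gives all batches one static shape.

```python
from transformers import AutoTokenizer

# Illustrative checkpoint; substitute the tokenizer you actually train with.
tokenizer = AutoTokenizer.from_pretrained("openbmb/MiniCPM-2B-sft-bf16", trust_remote_code=True)

batch = tokenizer(
    ["a short sample", "a somewhat longer training sample"],
    padding="max_length",  # pad everything to the same fixed length
    max_length=512,
    truncation=True,
    return_tensors="pt",
)
print(batch["input_ids"].shape)  # torch.Size([2, 512])
```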
-
As titled.
I see that swift has fine-tuning code for V1. Can it be used directly on V2.0, or does it need to be re-developed?
-
The online demo keeps failing on every request, and API calls report that the HTTP connection cannot be established. Could you check whether online use is no longer supported?
-
Hello,
This is great work! I have several questions:
1. In the technical report you mentioned
> We find that LoRA empirically leads to better performance than fully tuning across all c…
-
I use the following code for inference, and GPU memory usage exceeds 10 GB. Isn't that too much for a 2B model?
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
from peft import PeftModel
import json
torch.manual_seed(0)
lora_pat…
```
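Since the snippet is truncated, this is only an assumption, but ~10 GB for a 2B model suggests the weights are loaded in fp32. A minimal sketch of loading in bfloat16, which roughly halves the weight memory (the checkpoint path is illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Illustrative checkpoint path; substitute your own.
model_path = "openbmb/MiniCPM-2B-sft-bf16"
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# bfloat16 weights take ~2 bytes/parameter (~4 GB for 2B params) vs. ~8 GB in fp32.
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
).cuda()
```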
-
**Describe the bug**
RuntimeError: output with shape [1, 210, 446] doesn't match the broadcast shape [3, 210, 446]
**Your hardware and system info**
CUDA 11.8, PyTorch 2.2, A100-40G
**Addition…
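Since the report is truncated, the cause is only a guess, but this exact error typically comes from normalizing a 1-channel (grayscale) image with 3-channel mean/std: in-place ops cannot broadcast their output to a larger shape. A minimal repro:

```python
import torch

# A grayscale image (C=1) normalized with 3-channel statistics.
img = torch.rand(1, 210, 446)
mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)

# In-place subtraction cannot broadcast the output from 1 to 3 channels:
# RuntimeError: output with shape [1, 210, 446] doesn't match the broadcast shape [3, 210, 446]
img.sub_(mean).div_(std)

# A likely fix: convert the image to RGB first, e.g. image.convert("RGB")
# in PIL, or img = img.expand(3, -1, -1) on the tensor.
```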
-
### Custom dataset (dataset.jsonl) format
{"query": "question text", "response": "answer text", "history": [["past question 1", "past answer 1"], ["past question 2", "past answer 2"]], "images": ["image path"]}
Notes:
Each line of dataset.jsonl is one JSON object; the file has multiple lines, with one JSON line per image.
Since the dataset is temporar…
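A minimal sketch of producing one such line with Python's standard `json` module (all field values are placeholders):

```python
import json

# Placeholder values; the keys follow the format described above.
record = {
    "query": "What is shown in this picture?",
    "response": "A cat sitting on a sofa.",
    "history": [["past question 1", "past answer 1"]],
    "images": ["path/to/image.jpg"],
}

# One JSON object per line; ensure_ascii=False keeps non-ASCII text readable.
with open("dataset.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(record, ensure_ascii=False) + "\n")
```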