Open whysirier opened 3 months ago
No response
请问微调中使用本地数据时配置json格式的文件中的id参数每个样本可以设置为一样的吗?
1)id一样: [ { "id": "identity_0", "conversations": [ { "from": "user", "value": "你好" }, { "from": "assistant", "value": "我是VL。" } ] }, { "id": "identity_0", "conversations": [ { "from": "user", "value": "Picture 1: /mnt/data/images/cat217.jpg\n图中有几只猫?" } ] } ] 2)id不一样 [ { "id": "identity_0", "conversations": [ { "from": "user", "value": "你好" }, { "from": "assistant", "value": "我是VL。" } ] }, { "id": "identity_1", "conversations": [ { "from": "user", "value": "Picture 1: /mnt/data/images/cat217.jpg\n图中有几只猫?" } ] } ]
两种方式对训练效果有影响吗
应该不影响,之前考虑过这个问题,查找源代码的过程中并未发现该id被使用
起始日期 | Start Date
No response
实现PR | Implementation PR
No response
相关Issues | Reference Issues
No response
摘要 | Summary
请问微调中使用本地数据时配置json格式的文件中的id参数每个样本可以设置为一样的吗?
基本示例 | Basic Example
1)id一样: [ { "id": "identity_0", "conversations": [ { "from": "user", "value": "你好" }, { "from": "assistant", "value": "我是VL。" } ] }, { "id": "identity_0", "conversations": [ { "from": "user", "value": "Picture 1:
/mnt/data/images/cat217.jpg\n图中有几只猫?"
}
]
}
]
2)id不一样
[
{
"id": "identity_0",
"conversations": [
{
"from": "user",
"value": "你好"
},
{
"from": "assistant",
"value": "我是VL。"
}
]
},
{
"id": "identity_1",
"conversations": [
{
"from": "user",
"value": "Picture 1:
/mnt/data/images/cat217.jpg\n图中有几只猫?"
}
]
}
]
缺陷 | Drawbacks
两种方式对训练效果有影响吗
未解决问题 | Unresolved questions
No response