Also, the WeChat group QR code has expired; could you post a new one?
Found it: 512 should be the sequence-length limit of the model itself, so the other library is presumably truncating automatically (see the quick check after the traceback below). But when I try to switch to bge-m3 (which supports a sequence length of 8192), I get an error:
Batch size: 128
Start with seed: 666
Output dir: ./output/Test_mrl1792
Model_name_or_path: BAAI/bge-m3
Dataset: ../../../Data/Test.train.jsonl
mixed_precision: fp16
gradient_accumulation_steps: 1
temperature: 0.02
log_with: wandb
neg_nums: 15
query_max_len: 128
passage_max_len: 1024
use_mrl: True
mrl_dims: [128, 256, 512, 768, 1024, 1280, 1536, 1792]
/data/home/user/Test/user-Env/lib/python3.9/site-packages/torch/_utils.py:831: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
return self.fget.__get__(instance, owner)()
sentence_transformers model is not mrl model, init scaling_layer weight.
Traceback (most recent call last):
File "/data/home/user/Test/GitLibrary/RAG-Retrieval/rag_retrieval/train/embedding/train_embedding.py", line 190, in <module>
main()
File "/data/home/user/Test/GitLibrary/RAG-Retrieval/rag_retrieval/train/embedding/train_embedding.py", line 117, in main
model = accelerator.prepare(model)
File "/data/home/user/Test/user-Env/lib/python3.9/site-packages/accelerate/accelerator.py", line 1304, in prepare
result = tuple(
File "/data/home/user/Test/user-Env/lib/python3.9/site-packages/accelerate/accelerator.py", line 1305, in <genexpr>
self._prepare_one(obj, first_pass=True, device_placement=d) for obj, d in zip(args, device_placement)
File "/data/home/user/Test/user-Env/lib/python3.9/site-packages/accelerate/accelerator.py", line 1181, in _prepare_one
return self.prepare_model(obj, device_placement=device_placement)
File "/data/home/user/Test/user-Env/lib/python3.9/site-packages/accelerate/accelerator.py", line 1461, in prepare_model
self.state.fsdp_plugin.set_auto_wrap_policy(model)
File "/data/home/user/Test/user-Env/lib/python3.9/site-packages/accelerate/utils/dataclasses.py", line 1367, in set_auto_wrap_policy
raise Exception("Could not find the transformer layer class to wrap in the model.")
Exception: Could not find the transformer layer class to wrap in the model.
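As a quick sanity check of the two limits mentioned above (my own addition, not code from the training script), the position-embedding ceilings can be read straight from the model configs with the standard transformers API:

from transformers import AutoConfig

# Read the position-embedding limit of both backbones directly from their configs.
for name in ("BAAI/bge-base-zh-v1.5", "BAAI/bge-m3"):
    cfg = AutoConfig.from_pretrained(name)
    print(name, cfg.max_position_embeddings)

# bge-base-zh-v1.5 (BERT) reports 512; bge-m3 (XLM-RoBERTa based) should report
# roughly 8194, i.e. about 8192 usable tokens once the reserved positions are
# subtracted.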
Please refer to https://github.com/NLPJCL/RAG-Retrieval/issues/5 and modify the configuration file accordingly.
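For context (my own reading of the traceback, not necessarily the exact wording of issue #5): the exception is raised by accelerate's FSDP auto-wrap step, which needs the name of a transformer layer class that actually exists inside the model. bge-m3 is XLM-RoBERTa based, so its encoder blocks are XLMRobertaLayer rather than BertLayer. A small sketch for finding the class name to point the config at:

from transformers import AutoModel

# List the module class names of the backbone; the transformer block class
# (expected to include 'XLMRobertaLayer' for bge-m3) is the one the FSDP
# auto-wrap policy has to be told about.
model = AutoModel.from_pretrained("BAAI/bge-m3")
print({type(m).__name__ for m in model.modules() if type(m).__name__.endswith("Layer")})

If the repo's config follows accelerate's standard FSDP naming (an assumption on my part), that class name would go under the fsdp_transformer_layer_cls_to_wrap entry.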
Got it, thanks for the reply! I'll give it a try.
Training is running normally now, thank you very much.
The WeChat group QR code has been updated.
A question for the maintainer, as in the title: the script at https://github.com/NLPJCL/RAG-Retrieval/blob/master/rag_retrieval/train/embedding/train_embedding.py has a passage_max_len parameter. Taking BAAI/bge-base-zh-v1.5 as an example, setting it to anything above 512 raises an error.
However, the chunks used when generating the training data have a length of 1024. How is that handled here? Are they simply truncated to the first 512 tokens? By contrast, training with sentence-transformers as in https://github.com/percent4/embedding_model_exp/blob/main/src/finetune/ft_embedding.py never hits this problem; does the sentence-transformers library already preprocess the inputs? And can the 512 limit be extended manually?
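For anyone landing here later, a minimal sketch of the truncation behaviour being asked about (my own example, not code from either repository): a plain transformers tokenizer simply drops everything past max_length, and sentence-transformers does the same thing automatically through the model's max_seq_length, which is why the other training script never complains about 1024-length chunks. The 512 ceiling itself comes from the learned position embeddings of bge-base-zh-v1.5, so it cannot be raised just by passing a larger passage_max_len.

from transformers import AutoTokenizer
from sentence_transformers import SentenceTransformer

tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-base-zh-v1.5")
enc = tokenizer(
    "a deliberately over-long passage " * 400,  # far more than 512 tokens
    max_length=512,      # everything past this is cut off, keeping the first 512 tokens
    truncation=True,
    return_tensors="pt",
)
print(enc["input_ids"].shape)   # (1, 512)

# sentence-transformers truncates the same way behind the scenes:
st_model = SentenceTransformer("BAAI/bge-base-zh-v1.5")
print(st_model.max_seq_length)  # 512 for this model; longer inputs are silently truncated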