Open cxjtju opened 1 year ago
Is there an existing issue for this?
Current Behavior
When doing incremental (continued) pretraining on chatglm-6b, is each training sample limited to 2048 tokens?
Expected Behavior
No response
Steps To Reproduce
When a sample exceeds 2048 tokens, training fails with `ValueError: 130004 is not in list`. For reference, see this repo: https://github.com/shibing624/MedicalGPT
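A likely workaround (an assumption on my part, not something confirmed in this thread) is to pre-chunk each tokenized sample so that, together with the model's special tokens (token id 130004 is ChatGLM-6B's `gMASK`), it fits inside the 2048-token context window. A minimal sketch, with illustrative names (`chunk_token_ids`, `max_len`, `reserved`) that are not from any particular training repo:

```python
# Hedged sketch: split long tokenized documents into pieces that fit
# ChatGLM-6B's 2048-token context before incremental pretraining, so
# special tokens appended at the end (e.g. gMASK, id 130004) are not
# pushed past the window. All names here are illustrative assumptions.

def chunk_token_ids(token_ids, max_len=2048, reserved=2):
    """Split one tokenized sample into pieces that, together with
    `reserved` slots for special tokens, fit within max_len."""
    budget = max_len - reserved
    return [token_ids[i:i + budget] for i in range(0, len(token_ids), budget)]

sample = list(range(5000))        # stand-in for real token ids
chunks = chunk_token_ids(sample)
assert all(len(c) <= 2048 - 2 for c in chunks)
assert sum(len(c) for c in chunks) == len(sample)
```

An equivalent effect can often be had by passing `truncation=True, max_length=...` to a Hugging Face tokenizer, at the cost of discarding the overflow instead of keeping it as extra pretraining chunks.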
Environment
- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`):
Anything else?
No response