Open YoungChanYY opened 1 year ago
The error seems to occur at the predict step. When I disable eval during training, training runs normally.
Traceback (most recent call last):
File "train_bart_text2abc.py", line 180, in cublasSgemmStridedBatched( handle, opa, opb, m, n, k, &alpha, a, lda, stridea, b, ldb, strideb, &beta, c, ldc, stridec, num_batches)
../aten/src/ATen/native/cuda/Indexing.cu:650: indexSelectSmallIndex: block: [0,0,0], thread: [0,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
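I'll take a look at the evaluate logic.
Thanks a lot.
I noticed another spot that looks problematic: in the preprocess_data_bart(data) function in textgen/seq2seq/bart_seq2seq_utils.py, the handling of target_ids has the issue below, along with a suggested fix. Could you check whether this is right? Thanks!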
def preprocess_data_bart(data):
    input_text, target_text, tokenizer, args = data
    ......
    target_ids = tokenizer.batch_encode_plus(
        [target_text],
        max_length=args.max_length,  # suggested change
        padding="max_length",
        return_tensors="pt",
        truncation=True,
    )
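To illustrate what the suggested `max_length` / `padding="max_length"` / `truncation=True` combination is meant to achieve, here is a minimal plain-Python sketch. The helper name `pad_or_truncate` and the pad id `1` are hypothetical, chosen only for illustration; a real tokenizer also handles special tokens (for example, it keeps the end-of-sequence token when truncating), which this sketch ignores.

```python
def pad_or_truncate(token_ids, max_length, pad_id):
    """Cut token_ids to max_length, then right-pad to exactly max_length.

    Hypothetical helper: it mimics the *effect* of truncation plus
    fixed-length padding, not the tokenizer's actual implementation.
    """
    clipped = list(token_ids[:max_length])
    return clipped + [pad_id] * (max_length - len(clipped))


# A short target is padded, an overlong target is cut.
short_target = pad_or_truncate([5, 6, 7], max_length=5, pad_id=1)
long_target = pad_or_truncate(list(range(10)), max_length=5, pad_id=1)

print(short_target)  # [5, 6, 7, 1, 1]
print(long_target)   # [0, 1, 2, 3, 4]
```

With every target forced to the same bounded length, no target sequence can grow past whatever limit the model supports.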
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. (Due to prolonged inactivity, the bot has automatically closed this issue; feel free to ask again if needed.)
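I am using the BART training code. Each training example has an input text of about 1,000 characters and an output text of 30,000 to 50,000 characters. After a few epochs of training it fails with the error shown below. However, if I limit the input and output lengths, for example to around 100 characters each, training runs normally with no errors.
A question: does the BART model have any requirements on input and output length? I suspect an internal embedding dimension is going wrong. Thanks.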
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling
cublasSgemmStridedBatched( handle, opa, opb, m, n, k, &alpha, a, lda, stridea, b, ldb, strideb, &beta, c, ldc, stridec, num_batches)
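An assertion like `srcIndex < srcSelectDimSize` generally means an index used in a lookup (a token id or a position) fell outside the table being indexed, which would be consistent with sequences longer than the model's position limit. As a rough way to check this before training, here is a sketch that scans already-tokenized examples. The names `find_overlong` and `find_bad_ids` are hypothetical, and the limits `1024` and `50265` are assumptions; read the real values from your model's config (`max_position_embeddings`, `vocab_size`).

```python
MAX_POSITIONS = 1024  # assumption: take this from config.max_position_embeddings
VOCAB_SIZE = 50265    # assumption: take this from config.vocab_size


def find_overlong(examples, max_positions):
    """Return (index, length) for each token sequence longer than max_positions."""
    return [(i, len(ids)) for i, ids in enumerate(examples) if len(ids) > max_positions]


def find_bad_ids(examples, vocab_size):
    """Return (index, token_id) for each token id outside [0, vocab_size)."""
    return [
        (i, tok)
        for i, ids in enumerate(examples)
        for tok in ids
        if tok < 0 or tok >= vocab_size
    ]


# Toy data standing in for tokenized targets.
examples = [[0] * 10, [0] * 2000, [3, 4, 60000]]
overlong = find_overlong(examples, MAX_POSITIONS)
bad_ids = find_bad_ids(examples, VOCAB_SIZE)

print(overlong)  # [(1, 2000)]
print(bad_ids)   # [(2, 60000)]
```

If either list is non-empty, truncating those examples (or fixing the tokenizer/vocabulary mismatch) before they reach the GPU is worth trying. Running once on CPU, or with the environment variable `CUDA_LAUNCH_BLOCKING=1`, usually gives a clearer error location than the asynchronous cuBLAS failure.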