Closed HonLZL closed 3 months ago
tokenized_inputs = tokenizer(
    examples[text_column_name],
    padding=padding,
    truncation=True,
    return_overflowing_tokens=True,
    max_length=512,
    # We use this argument because the texts in our dataset are lists of words (with a label for each word).
    is_split_into_words=True,
)
In run_funsd.py, setting max_length=512 in the tokenizer call above works!
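For anyone else hitting this, here is a rough sketch of what truncation=True together with return_overflowing_tokens=True does to a long example, as far as I understand it. This is a simplified pure-Python stand-in, not the tokenizer's actual code: `split_into_windows` is a hypothetical helper, and it ignores special tokens, stride, and word-level label alignment.

```python
from typing import List


def split_into_windows(token_ids: List[int], max_length: int = 512) -> List[List[int]]:
    """Split one long token sequence into consecutive windows of at most max_length.

    Hypothetical helper for illustration only; the real tokenizer also adds
    special tokens and can overlap windows via a stride.
    """
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    return [token_ids[i:i + max_length] for i in range(0, len(token_ids), max_length)]


# A 625-token example (the size reported in this issue) becomes two windows.
ids = list(range(625))
windows = split_into_windows(ids, max_length=512)
print([len(w) for w in windows])  # [512, 113]
```

With max_length set, no single feature is longer than 512, and the overflow is kept as an extra feature instead of being dropped.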
FUNSD, lilt-roberta-en-base:
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: CUDA error: device-side assert triggered
  7%|▋ | 143/2000 [00:28<06:06, 5.06it/s]
While looking for the cause, I found that one feature has size 625: batch["input_ids"].shape = (1, 625). Could you tell me how to fix it? Thanks a lot!
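One possible way to find such features before they reach the GPU, where the device-side assert hides the real cause, is to check sequence lengths on the CPU side. The sketch below is only an illustration: `find_overlong` is a hypothetical helper, batches are represented as plain nested lists rather than tensors, and the limit of 512 is an assumption taken from the max_length value used in the tokenizer call.

```python
from typing import Dict, List, Tuple


def find_overlong(
    batches: List[Dict[str, List[List[int]]]],
    limit: int = 512,  # assumed limit, taken from the max_length in the tokenizer call
) -> List[Tuple[int, int, int]]:
    """Return (batch_index, row_index, length) for every input_ids row longer than limit."""
    offenders = []
    for batch_index, batch in enumerate(batches):
        for row_index, row in enumerate(batch["input_ids"]):
            if len(row) > limit:
                offenders.append((batch_index, row_index, len(row)))
    return offenders


# One batch with a 625-token row (as in this issue) and one batch that is fine.
batches = [
    {"input_ids": [[0] * 625]},
    {"input_ids": [[0] * 100]},
]
print(find_overlong(batches))  # [(0, 0, 625)]
```

If this reports any rows, the likely explanation is that an index ran past the end of an embedding table, which is consistent with the failure inside torch.embedding.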