Closed: yuvarajvc closed this issue 3 years ago
Can we assume that "token_type_ids" is not supported in Longformer?
Yes. Actually, a PR (#9152) was merged yesterday to update the docs, stating that Longformer does not support token type ids.
@NielsRogge Thank you
This issue has been automatically marked as stale and been closed because it has not had recent activity. Thank you for your contributions.
If you think this still needs to be addressed please comment on this thread.
transformers version: 3.0.0
Who can help
To reproduce
import torch
from transformers import LongformerModel, LongformerTokenizer

model = LongformerModel.from_pretrained('allenai/longformer-base-4096')
tokenizer = LongformerTokenizer.from_pretrained('roberta-base')

SAMPLE_TEXT = ' '.join(['Hello world! '] * 100)  # long input document
input_ids = torch.tensor(tokenizer.encode(SAMPLE_TEXT)).unsqueeze(0)  # batch of size 1
attention_mask = torch.ones(input_ids.shape, dtype=torch.long, device=input_ids.device)
global_attention_mask = torch.zeros(input_ids.shape, dtype=torch.long, device=input_ids.device)
segment_ids = torch.ones(input_ids.shape, dtype=torch.long, device=input_ids.device)
outputs = model(input_ids=input_ids, attention_mask=attention_mask,
                global_attention_mask=global_attention_mask, token_type_ids=segment_ids)
Error info
IndexError Traceback (most recent call last)
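For context, the IndexError comes from the token-type embedding lookup: Longformer inherits RoBERTa's configuration, whose type_vocab_size is 1, so the token-type embedding table has only one row (index 0 is the only valid segment id). The all-ones segment_ids in the reproduction index past that row. A minimal pure-Python sketch of the failure mode (the table size mirrors the roberta-base config; the tiny hidden size is a stand-in, and no transformers install is needed):

```python
# Sketch: a model's token-type embedding table has type_vocab_size rows.
# RoBERTa-derived models such as Longformer use type_vocab_size = 1, so only
# segment id 0 is a valid index; an all-ones token_type_ids tensor, as in the
# reported reproduction, reaches past the table and raises IndexError.
type_vocab_size = 1          # value in the roberta-base / Longformer config
hidden_size = 4              # tiny stand-in for the real hidden size
embedding_table = [[0.0] * hidden_size for _ in range(type_vocab_size)]

segment_ids = [1] * 8        # token_type_ids full of 1s, as in the repro

try:
    rows = [embedding_table[i] for i in segment_ids]  # embedding lookup
except IndexError:
    print("IndexError: segment id 1 is out of range for a 1-row table")
```

Omitting token_type_ids from the model call avoids the problem: the model then fills in all zeros internally, which is a valid index into the one-row table.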