Jermaine1996 closed this issue 12 months ago
@Jermaine1996 The tokenizer is initialized with padding_side='left' by default. You can set
tokenizer.padding_side = 'right'
to let the tokenizer append the padding tokens on the right side of the sequence,
or use a processor instead:
processor = ErnieLayoutProcessor(image_processor=feature_extractor, tokenizer=tokenizer)
encoding = processor(pil_image, context, boxes=layout, word_labels=labels, return_tensors="pt")
Thank you for pointing it out. I will update the examples to set tokenizer.padding_side = 'right' by default.
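For anyone landing here later, a minimal pure-Python sketch of what the two padding_side values mean. It does not call the ErnieLayout tokenizer; `pad_batch` and `PAD_ID = 0` are hypothetical names made up for illustration, and the real tokenizer handles padding internally with its own pad token id.

```python
from typing import List

PAD_ID = 0  # hypothetical pad token id, chosen only for illustration


def pad_batch(batch: List[List[int]], padding_side: str = "right") -> List[List[int]]:
    """Pad every sequence in the batch to the length of the longest one."""
    if padding_side not in ("left", "right"):
        raise ValueError(f"padding_side must be 'left' or 'right', got {padding_side!r}")
    max_len = max(len(seq) for seq in batch)
    padded = []
    for seq in batch:
        pad = [PAD_ID] * (max_len - len(seq))
        # 'left' puts the pad ids before the tokens, 'right' puts them after
        padded.append(pad + seq if padding_side == "left" else seq + pad)
    return padded


batch = [[5, 6, 7], [8, 9]]
left = pad_batch(batch, padding_side="left")    # [[5, 6, 7], [0, 8, 9]]
right = pad_batch(batch, padding_side="right")  # [[5, 6, 7], [8, 9, 0]]
```

Either way all sequences end up the same length; only the position of the pad ids changes, which is why the padded output looked "shifted" with the default setting.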
thanks for responding, it works. :)
Thanks for making this repo.
When I used ErnieLayoutTokenizerFast to tokenize the inputs, I found that the padding tokens were added at the beginning of the sequences, but I thought they should be added at the end to make all inputs the same length.
I ran the example code as follows:
Then I get:
Is there something wrong with the code?