thariq-nugrohotomo opened 8 months ago
Using the pretrained model, when I pass `cls` or `bos` as the initial decoder token, the output (the first decoded token) is rarely correct. But when I use `eos`, the output is correct, or at least similar to the output returned by `model.generate()`.
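For reproduction, here is a minimal setup sketch that the excerpt below assumes (the checkpoint name and image path are placeholders; any TrOCR checkpoint should show the same behavior):

```python
import torch
from PIL import Image
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

# "microsoft/trocr-base-handwritten" and "line.png" are placeholders.
processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-handwritten")
model = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-base-handwritten")

image = Image.open("line.png").convert("RGB")
pixel_values = processor(images=image, return_tensors="pt").pixel_values
```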
In the official code from Microsoft, the generator falls back to `eos` when the start token is not specified: https://github.com/microsoft/unilm/blob/6f60612e7cc86a2a1ae85c47231507a587ab4e01/trocr/generator.py#L84
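On the Hugging Face side, the same fallback can be checked by inspecting the config, since `generate()` seeds the decoder with `config.decoder_start_token_id`. A quick check (I'd expect it to match `eos_token_id` for the TrOCR checkpoints, but printing it confirms rather than assumes):

```python
# generate() seeds the decoder with config.decoder_start_token_id,
# so this shows which special token it actually starts from.
print("decoder_start_token_id:", model.config.decoder_start_token_id)
print("eos:", processor.tokenizer.eos_token_id,
      "bos:", processor.tokenizer.bos_token_id,
      "cls:", processor.tokenizer.cls_token_id)
```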
Code excerpt to inspect the first decoded token manually:
```python
# Use eos as the start token; swap in bos_token_id below to compare.
decoder_start_token_id = processor.tokenizer.eos_token_id  # or: processor.tokenizer.bos_token_id
outputs = model(pixel_values=pixel_values,
                decoder_input_ids=torch.tensor([[decoder_start_token_id]]))
predicted_ids = torch.argmax(outputs.logits, -1)
print(processor.tokenizer.batch_decode(predicted_ids))
```
Switch `eos_token_id` to `bos_token_id` and observe the difference in output.
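For reference, a sketch of the `model.generate()` comparison mentioned above; its first decoded token is what the manual forward pass should reproduce:

```python
# generate() uses the configured start token internally; its output is the
# reference the manual forward pass above should match on the first token.
generated_ids = model.generate(pixel_values)
print(processor.tokenizer.batch_decode(generated_ids, skip_special_tokens=True))
```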