kmkurn / pytorch-crf

(Linear-chain) Conditional random field in PyTorch.
https://pytorch-crf.readthedocs.io
MIT License

what pad symbol to use in tags tensor #52

Closed (briemadu closed this issue 4 years ago)

briemadu commented 4 years ago

Hi,

first of all, thanks for making this code available : )

I would like to check something: what padding symbol should we use in the tags tensor?

My tags go from 0 to 11, so I was using 12 as the pad symbol, but that throws an index error in _compute_score. It works if I replace 12 with, say, 11. Since 11 is a valid tag, though, I want to be sure that the mask really excludes these padded positions, or whether I should use a different value.

kmkurn commented 4 years ago

Hi, thanks for your kind words!

Your padding tag should lie between 0 and num_tags - 1 (inclusive), and the mask will take care of excluding the padded positions.

possible1402 commented 4 years ago

Hi, I'm confused: if the padding tag is between 0 and num_tags - 1, it must be the same as one of the real tags. How can the mask tell whether an element is a true tag or a padding tag?

kmkurn commented 4 years ago

Hi, a position is assumed to hold a true tag when its mask value is 1. Otherwise, it's treated as a padding tag, and the value stored there is ignored.
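To make the mechanism concrete, here is a simplified pure-Python sketch of masked scoring. It is not the library's actual code: it leaves out start/end transitions and the normalizer, and `path_score` and all the numbers are hypothetical. The point is that each step's contribution is multiplied by the mask, so a padded tag never changes the score, yet it is still used as an index and therefore has to be in range.

```python
def path_score(emissions, transitions, tags, mask):
    """Score one tag sequence, counting only steps where mask is 1.

    emissions[t][k]   : score of tag k at step t
    transitions[a][b] : score of moving from tag a to tag b
    """
    score = emissions[0][tags[0]]  # the first step is assumed to be real
    for t in range(1, len(tags)):
        # The lookup happens even for padded steps, so tags[t] must be valid.
        step = transitions[tags[t - 1]][tags[t]] + emissions[t][tags[t]]
        score += step * mask[t]    # padded steps contribute nothing
    return score

# Toy setup with 3 tags (0..2) and a length-4 sequence with 2 real tokens.
emissions = [[0.5, 1.0, -0.2], [0.1, 0.3, 0.9], [0.7, -0.4, 0.2], [0.0, 0.6, -0.1]]
transitions = [[0.2, -0.1, 0.4], [0.0, 0.3, -0.2], [0.1, 0.5, -0.3]]
mask = [1, 1, 0, 0]

score_pad0 = path_score(emissions, transitions, [1, 2, 0, 0], mask)
score_pad2 = path_score(emissions, transitions, [1, 2, 2, 2], mask)
print(score_pad0 == score_pad2)

# Padding with num_tags itself (3 here) fails at the table lookup,
# which mirrors the index error described in the original question.
try:
    path_score(emissions, transitions, [1, 2, 3, 3], mask)
    raised = False
except IndexError:
    raised = True
print(raised)
```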