Wendysigh opened this issue 2 months ago
Yes, you're right. The mask is never intended to be anything other than a padding mask. A random mask for a sequence tagging problem seems neither standard nor trivial, so I'm less inclined to implement it in the library. But I'm happy to be proven wrong! :-)
Thanks for the reply! I am new to sequence tagging and happened to want to mask less important tokens, so that the model focuses on the other, more important labels. That was the initial motivation for this issue. As you said, maybe it's not a standard approach lol
I encountered an issue with the CRF layer when using a random mask: the loss becomes negative after several training rounds. I found this is due to the definition in https://github.com/kmkurn/pytorch-crf/blob/623e3402d00a2728e99d6e8486010d67c754267b/torchcrf/__init__.py#L203.
The code works only when the mask is a padding mask. When the mask is a random mask, we may need a function defined as below:
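As a rough illustration of the idea (the tensor values, function names, and tag counts below are invented for the example, not taken from pytorch-crf), one way to handle a non-contiguous mask is to run the forward algorithm only over the unmasked timesteps, connecting transitions across masked gaps. A pure-Python sketch, checked against brute-force enumeration:

```python
import math
from itertools import product

# Toy linear-chain CRF with 2 tags and 4 timesteps. Position 1 is masked
# out by a *random* (non-contiguous) mask, which a padding-mask-only
# implementation cannot handle correctly.
NUM_TAGS = 2
emissions = [[0.5, 1.2], [2.0, -0.3], [0.1, 0.1], [-1.0, 0.7]]  # [step][tag]
transitions = [[0.2, -0.1], [0.4, 0.3]]                         # [from][to]
start = [0.1, -0.2]
end = [0.0, 0.3]
mask = [1, 0, 1, 1]

def unmasked_steps():
    return [i for i, m in enumerate(mask) if m]

def seq_score(tags):
    """Score one tag assignment over the unmasked positions only."""
    steps = unmasked_steps()
    s = start[tags[0]] + emissions[steps[0]][tags[0]]
    for k in range(1, len(tags)):
        # Transition bridges the masked gap between consecutive kept steps.
        s += transitions[tags[k - 1]][tags[k]] + emissions[steps[k]][tags[k]]
    return s + end[tags[-1]]

def log_normalizer():
    """Forward algorithm that simply skips masked timesteps."""
    steps = unmasked_steps()
    alpha = [start[t] + emissions[steps[0]][t] for t in range(NUM_TAGS)]
    for step in steps[1:]:
        alpha = [
            math.log(sum(math.exp(a + transitions[p][t])
                         for p, a in enumerate(alpha)))
            + emissions[step][t]
            for t in range(NUM_TAGS)
        ]
    return math.log(sum(math.exp(a + end[t]) for t, a in enumerate(alpha)))

# Brute force over all tag assignments on the kept positions agrees with
# the forward pass, so every sequence has log-likelihood <= 0 and the
# negative log-likelihood loss stays non-negative.
n_kept = len(unmasked_steps())
brute = math.log(sum(math.exp(seq_score(tags))
                     for tags in product(range(NUM_TAGS), repeat=n_kept)))
Z = log_normalizer()
```

The key point is that both the sequence score and the normalizer must agree on which timesteps count; when they disagree (as they can with a random mask and the linked line, which assumes the mask is a contiguous prefix of ones), the log-likelihood can exceed zero and the loss goes negative.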