Open Wendysigh opened 3 days ago
Yes, you're right. The mask was never intended to be anything other than a padding mask. A random mask for a sequence tagging problem seems neither standard nor trivial, so I'm less inclined to implement it in the library. But I'm happy to be proven wrong! :-)
I ran into an issue when using the CRF layer with a random mask: the loss becomes negative after several rounds. I traced this to the definition at https://github.com/kmkurn/pytorch-crf/blob/623e3402d00a2728e99d6e8486010d67c754267b/torchcrf/__init__.py#L203.
The code works only when the mask is a padding mask. When the mask is a random mask, maybe we need to use a function defined as below:
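What follows is a minimal sketch of one way to read the problem, not the code from the original post. It assumes the linked line derives the index of the last valid timestep as `mask.sum() - 1`, which holds only when all valid positions come first (a padding mask). The helper names `last_valid_index_padding`, `last_valid_index_any`, and `valid_transition_pairs` are hypothetical, and plain Python lists stand in for tensors to keep the illustration self-contained.

```python
def last_valid_index_padding(mask):
    """Last valid index, assuming a padding mask of the form 1...1 0...0."""
    return sum(mask) - 1


def last_valid_index_any(mask):
    """Last index where the mask is on, for an arbitrary (e.g. random) mask."""
    for i in range(len(mask) - 1, -1, -1):
        if mask[i]:
            return i
    raise ValueError("mask has no valid position")


def valid_transition_pairs(tags, mask):
    """Tag pairs between consecutive valid positions, skipping masked ones."""
    pairs = []
    prev = None
    for tag, m in zip(tags, mask):
        if not m:
            continue  # masked position: contributes neither tag nor transition
        if prev is not None:
            pairs.append((prev, tag))
        prev = tag
    return pairs


padding_mask = [1, 1, 1, 0, 0]
random_mask = [1, 0, 1, 0, 1]
tags = [3, 7, 1, 5, 2]

# Both helpers agree on a padding mask.
print(last_valid_index_padding(padding_mask), last_valid_index_any(padding_mask))
# They disagree on a random mask: the sum-based index lands mid-sequence.
print(last_valid_index_padding(random_mask), last_valid_index_any(random_mask))
# Transitions that link only the unmasked tags.
print(valid_transition_pairs(tags, random_mask))
```

If the gold-path score and the partition function treat masked positions differently (for example, in which end tag or which transitions they count), the log-likelihood is no longer guaranteed to be non-positive, which would be consistent with the negative loss reported above. Whether this fully accounts for the behavior in the library is an assumption here, not something verified against its code.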