speedcell4 closed 1 year ago
Yep, it's correct! MBR decoding uses marginals, which depend on global context, while Viterbi decoding uses local scores. As such, I think it is quite intuitive that MBR decoding does better than Viterbi decoding. This is a common finding in the parsing literature, e.g. in TreeCRF-based dependency parsing: https://aclanthology.org/2020.acl-main.302.pdf
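For anyone who wants to see the difference concretely, here is a minimal pure-Python sketch on a tiny linear-chain model. It is not the library's API: `max_decode`, `edge_marginals`, and the toy `log_potentials` values are hypothetical names and numbers chosen for illustration. The same max-semiring pass is run twice, once over the log-potentials (Viterbi) and once over the edge marginals (MBR), which picks the path with the highest expected number of correct edges under the model.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def max_decode(scores):
    """Max-semiring pass: label path maximizing the sum of edge scores.

    scores[t][i][j] is the score of label i at position t followed by
    label j at position t + 1.
    """
    num_labels = len(scores[0])
    best = [0.0] * num_labels
    backptrs = []
    for edge in scores:
        new_best, ptr = [], []
        for j in range(num_labels):
            i_star = max(range(num_labels), key=lambda i: best[i] + edge[i][j])
            ptr.append(i_star)
            new_best.append(best[i_star] + edge[i_star][j])
        best = new_best
        backptrs.append(ptr)
    path = [max(range(num_labels), key=lambda j: best[j])]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    return path[::-1]


def edge_marginals(log_potentials):
    """Forward-backward: posterior probability of every edge (t, i, j)."""
    k = len(log_potentials[0])
    alpha = [[0.0] * k]
    for edge in log_potentials:
        prev = alpha[-1]
        alpha.append([logsumexp([prev[i] + edge[i][j] for i in range(k)])
                      for j in range(k)])
    beta = [[0.0] * k]
    for edge in reversed(log_potentials):
        nxt = beta[0]
        beta.insert(0, [logsumexp([edge[i][j] + nxt[j] for j in range(k)])
                        for i in range(k)])
    log_z = logsumexp(alpha[-1])
    return [[[math.exp(alpha[t][i] + edge[i][j] + beta[t + 1][j] - log_z)
              for j in range(k)] for i in range(k)]
            for t, edge in enumerate(log_potentials)]


# Toy model (assumed values): 4 positions, 2 labels, so 3 edge matrices.
log_potentials = [
    [[1.0, 0.2], [0.1, 0.9]],
    [[0.3, 1.2], [0.8, 0.4]],
    [[0.5, 0.5], [1.5, 0.0]],
]

marginals = edge_marginals(log_potentials)
viterbi_path = max_decode(log_potentials)  # max over local log-potentials
mbr_path = max_decode(marginals)           # max over global marginals
print("Viterbi:", viterbi_path)
print("MBR:    ", mbr_path)
```

Each marginal already sums over every full path that passes through that edge, which is where the "global context" comes from. The two decoders can agree on small examples like this one; they only differ when the single best path and the posterior mass disagree.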
I got it, thank you~
To my understanding, it means replacing `MaxSemiring(log_potential)` with `MaxSemiring(marginal)`. Is this correct? And why does this work better?