Open pypae opened 5 years ago
The original script gained a new hack in a fairly recent change: https://github.com/moses-smt/mosesdecoder/pull/204
This difference isn't accounted for in sacremoses, and I'm really not sure whether it should be.
Why shouldn't sacremoses include this?
I could not yet figure out why, but in the original script, the dot in `p.m.` at the end of a sentence is not split off, while with this port it is. The original script even explicitly leaves out `p.m` from its nonbreaking prefixes, so I'd expect the behavior seen in the port.
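To make the two behaviors concrete, here is a minimal, self-contained sketch of how a trailing period could be handled. It is not the code of the original script or of sacremoses: the function name `split_trailing_period`, the flag `keep_dotted_abbreviations`, and the tiny prefix list are all hypothetical. It only shows that a rule of the form "keep the period when the prefix itself contains a dot and a letter" would leave `p.m.` intact even though `p.m` is not a listed prefix. Whether this is the rule behind the linked pull request is not confirmed in this thread.

```python
import re

# Hypothetical, tiny prefix list for illustration only. "p.m" is deliberately
# absent, mirroring the observation that the original script leaves it out.
NONBREAKING_PREFIXES = {"Mr", "Dr", "e.g"}


def split_trailing_period(tokens, keep_dotted_abbreviations):
    """Split a trailing "." off each token unless a rule protects it.

    keep_dotted_abbreviations is an assumed switch: when True, a prefix that
    already contains a dot and at least one letter (e.g. "p.m") keeps its
    final period.
    """
    out = []
    for token in tokens:
        match = re.match(r"^(\S+)\.$", token)
        if not match:
            # No trailing period (or the token is just "."): keep as is.
            out.append(token)
            continue
        prefix = match.group(1)
        has_inner_dot = "." in prefix and any(ch.isalpha() for ch in prefix)
        if prefix in NONBREAKING_PREFIXES or (
            keep_dotted_abbreviations and has_inner_dot
        ):
            out.append(token)  # protected: period stays attached
        else:
            out.extend([prefix, "."])  # period becomes its own token
    return out


tokens = "We meet at 5 p.m.".split()
print(split_trailing_period(tokens, keep_dotted_abbreviations=True))
print(split_trailing_period(tokens, keep_dotted_abbreviations=False))
```

With the flag on, `p.m.` stays a single token, which matches what is reported for the original script. With the flag off, the result is `p.m` followed by `.`, which matches what is reported for the port.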