Closed: chrishokamp closed this issue 9 years ago
We can just join multi-token alignments with whitespace to form the string representation of an alignment. This lets us avoid the multi-label binarizer and preserves the token order information.
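A minimal sketch of the idea, not the actual marmot code: the function name `alignment_feature`, the `__unaligned__` placeholder, and the example sentence and alignments are all hypothetical, chosen only for illustration. It assumes alignments are given as a mapping from each target position to a list of source positions.

```python
def alignment_feature(source_tokens, aligned_indices, unaligned="__unaligned__"):
    """Join the source tokens aligned to one target token into a single string.

    Hypothetical helper: the name and the placeholder value are assumptions.
    """
    if not aligned_indices:
        return unaligned
    # Sort the indices so the joined string follows source-sentence order.
    return " ".join(source_tokens[i] for i in sorted(aligned_indices))


source = ["das", "ist", "ein", "kleines", "Haus"]
# Hypothetical alignments: target position -> list of source positions.
alignments = {0: [0], 1: [1], 2: [2, 3], 3: [4], 4: []}

# One string-valued feature per target token.
features = [alignment_feature(source, alignments[t]) for t in range(len(alignments))]
```

Because each target token ends up with exactly one string, the feature can be treated as a single categorical value rather than a set of labels, and the order of the aligned source tokens is kept inside the string.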
We have to use the multi-label binarizer anyway, for the cases where one word is aligned to two or more words.
Fixed in https://github.com/qe-team/marmot/commit/4b88fc1e0349990481682694fda12f49a2db1bc9