unfoldingWord / translationCore

Repository for the desktop application translationCore
https://www.translationcore.com

WordMAP design sometimes causes suggestions to appear in wrong occurrence order #6538

Open cckozie opened 4 years ago

cckozie commented 4 years ago

2.1.0 (d45bc64)
6237.zip

This is working by design, but it does not conform to the requirement in #6237 (see that issue for details on the design). This example was observed with only the two attached projects in tC.

(image)

da1nerd commented 4 years ago

To elaborate on my comment in https://github.com/unfoldingWord/translationCore/issues/6237#issuecomment-552699174: if we use the first "did" in the example above, we would also have to discard the first "they", since wordMAP only supports n-grams with contiguous tokens.

For example, the phrase they(1) did(1) is invalid because the two tokens are not adjacent in the target text:

...servants did(1) what Yahweh commanded But they(1) did(2) it...

However, using did(1) on its own would be valid, because a single token cannot be non-contiguous.

This is all forced by the current design of wordMAP, which only allows n-grams of contiguous tokens. Supporting non-contiguous tokens would open up more possibilities, but would also be more complex to implement.
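The contiguity rule described above can be illustrated with a small sketch. This is not wordMAP's actual code or API: the `Tok` shape and the `tokenize`, `pick`, and `isContiguous` helpers are hypothetical names made up for this example, and the sentence is the fragment quoted above with the ellipses dropped.

```typescript
// Hypothetical token shape for illustration; not wordMAP's real Token class.
interface Tok {
  text: string;
  position: number;   // zero-based index in the target sentence
  occurrence: number; // 1-based count of this word so far, e.g. did(2)
}

// Split on whitespace and record each word's position and occurrence.
function tokenize(sentence: string): Tok[] {
  const seen: { [word: string]: number } = Object.create(null);
  return sentence
    .split(/\s+/)
    .filter((w) => w.length > 0)
    .map((text, position) => {
      const key = text.toLowerCase();
      const occurrence = (seen[key] || 0) + 1;
      seen[key] = occurrence;
      return { text, position, occurrence };
    });
}

// Look up a specific occurrence of a word, e.g. pick(tokens, "did", 2).
function pick(tokens: Tok[], text: string, occurrence: number): Tok {
  const found = tokens.filter(
    (t) => t.text.toLowerCase() === text.toLowerCase() && t.occurrence === occurrence
  )[0];
  if (found === undefined) {
    throw new Error("no such token: " + text + "(" + occurrence + ")");
  }
  return found;
}

// An n-gram is contiguous when each token sits immediately after the previous one.
function isContiguous(ngram: Tok[]): boolean {
  return ngram.every((t, i) => i === 0 || t.position === ngram[i - 1].position + 1);
}

const target = tokenize("servants did what Yahweh commanded But they did it");

// they(1) did(1): "did" precedes "they" by several tokens, so this is rejected.
console.log(isContiguous([pick(target, "they", 1), pick(target, "did", 1)])); // false
// they(1) did(2): adjacent in the target text, so this is accepted.
console.log(isContiguous([pick(target, "they", 1), pick(target, "did", 2)])); // true
// did(1) alone: a single token always passes.
console.log(isContiguous([pick(target, "did", 1)])); // true
```

Under this rule the only two-token phrase available for "they did" uses did(2), which is why the suggestion can appear out of occurrence order.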