moses-smt / mosesdecoder

Moses, the machine translation system
http://www.statmt.org/moses
GNU Lesser General Public License v2.1

Added some for sv #221

Closed HjalmarrSv closed 4 years ago

HjalmarrSv commented 4 years ago

Update nonbreaking_prefix.sv

Added Å, Ä, and Ö, which are not unusual as initials in names, e.g. Åke/Åsa, Ärling, Östen/Öyvind. Added some new entries, though most are variations on existing ones. Both a dot after each letter (or pair of letters) and a dot only after the last letter are accepted forms. A couple of decades ago, there had to be a space after the dot, which explains the third form. The file for sv is much more useful with these few additions, although it is still far from complete. Removed: G (occurred twice). In this list there is one item that is also a word, even when case is kept: tom. If all words are lowercased, then tex, mao, and tom (again) may be confused with names, and iaf, etc. with named entities.
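To illustrate the three written forms described above, here is a minimal Python sketch. It is not part of the patch or of Moses itself: the function names `written_forms` and `collides_with_word`, the letter groups passed in, and the small word set are all hypothetical, chosen only to show how one abbreviation can appear in text and why the compact lowercase form can clash with an ordinary word.

```python
def written_forms(parts):
    """Return the three written forms of an abbreviation.

    `parts` holds the letter groups of the abbreviation, e.g. ["t", "ex"].
    (Hypothetical helper, for illustration only.)
    """
    dotted = ".".join(parts) + "."    # dot after each letter or pair
    compact = "".join(parts) + "."    # dot only after the last letter
    spaced = ". ".join(parts) + "."   # older style: space after each dot
    return [dotted, compact, spaced]


def collides_with_word(parts, ordinary_words):
    """True if the compact form, without its dot, is also an ordinary word."""
    return "".join(parts).lower() in ordinary_words


# Illustrative word set (an assumption, not taken from the prefix file).
ORDINARY_WORDS = {"tom"}

print(written_forms(["t", "ex"]))   # ['t.ex.', 'tex.', 't. ex.']
print(collides_with_word(["t", "o", "m"], ORDINARY_WORDS))  # True
print(collides_with_word(["t", "ex"], ORDINARY_WORDS))      # False
```

A check like this only flags exact matches against the word list it is given, so it says nothing about the lowercase confusions with names or named entities mentioned above.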

hieuhoang commented 4 years ago

thanks! merged