"kø" in text-to-lexemes extracts only "k"

fnielsen / ordia

Wikidata lexemes presentations

https://ordia.toolforge.org

Apache License 2.0


Closed fnielsen closed 3 years ago

fnielsen commented 4 years ago

"kø" in text-to-lexemes extracts only "k"

jhsoby commented 3 years ago

This is fixed already, isn't it? Works for me at least.

fnielsen commented 3 years ago

jhsoby commented 3 years ago

Aha. It only happens for that mode (lowercase first sentence letters), and it happens for all two-letter words, so it's not a Unicode issue. I looked at the code, and I think the culprit is the 2 in this line: https://github.com/fnielsen/ordia/blob/90b1b91344e42bf0e44d949fddd471d131e38df3/ordia/text.py#L45
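For reference, here is a minimal sketch of the behaviour I would expect from that mode, which could also serve as a regression check. It is not the code in `ordia/text.py`: the names `lowercase_sentence_starts` and `extract_words` and both regular expressions are made up for illustration, and I am assuming that sentences end in `.`, `!` or `?` followed by whitespace.

```python
import re

# Hypothetical pattern, not taken from ordia: a sentence start is either the
# beginning of the text or sentence-ending punctuation followed by whitespace,
# and the next word character is the letter to lowercase.
_SENTENCE_START = re.compile(r"(^|[.!?]\s+)(\w)")


def lowercase_sentence_starts(text):
    """Lowercase only the first letter of each sentence.

    Exactly one character is rewritten per sentence, so the length of the
    first word does not matter and no characters are dropped.
    """
    return _SENTENCE_START.sub(
        lambda match: match.group(1) + match.group(2).lower(), text
    )


def extract_words(text):
    """Split text into words; in Python 3, \\w also matches letters like ø."""
    return re.findall(r"\w+", text)


if __name__ == "__main__":
    print(extract_words(lowercase_sentence_starts("Kø er lang. Vi venter.")))
```

With something like this, a two-letter first word such as "Kø" should come out as "kø" rather than "k", and the same should hold for plain ASCII two-letter words like "Vi".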

fnielsen commented 3 years ago