gustafl / lexeme

A new take on language learning.

Design the separation of lexemes, compounds and expressions #116

Open gustafl opened 8 years ago

gustafl commented 8 years ago

From the very beginning, I've separated lexemes from their inflections. I imagined that this was all we needed: that everything of importance in a text is either a lexeme or an inflection of one.

And we could adopt this view, at least as a start. Not only would it make the implementation easier; it would also simplify the user interface. But it's clear that there are other meaningful bits of text that are almost as important as lexemes. I've identified compounds and idiomatic expressions as two, but perhaps there are more.

Compounds

Compound words make up a large percentage of the text in the languages I'm acquainted with. For Lexeme to be truly useful as a tool for making sense of text, it really needs to support compounds. So a few questions arise. How am I going to separate compounds from lexemes? Assuming I want to link compounds to their lexemes

Separating compounds from lexemes

Even if I don't solve the issue now, I want to have a plan for how to solve it.
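
One possible starting point, sketched in Python purely for illustration: treat a word as a compound when it is not itself a known lemma but can be segmented completely into known lemmas, and store the compound with links to those lexemes. The names `Lexeme`, `Compound`, `segment`, `classify` and `LEXICON` are all hypothetical, not part of the project, and the sketch ignores inflected parts and linking elements between parts.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Lexeme:
    lemma: str
    inflections: tuple = ()  # inflected forms, e.g. ("books",)


@dataclass(frozen=True)
class Compound:
    form: str
    parts: tuple  # the Lexeme objects this compound links to


def segment(word: str, lexicon: dict) -> Optional[list]:
    """Return lexemes whose lemmas concatenate to `word`, or None."""
    if word == "":
        return []
    # Try the longest known prefix first, then recurse on the remainder.
    for end in range(len(word), 0, -1):
        head = word[:end]
        if head in lexicon:
            rest = segment(word[end:], lexicon)
            if rest is not None:
                return [lexicon[head]] + rest
    return None


def classify(word: str, lexicon: dict) -> Union[Lexeme, Compound, None]:
    """A known lemma is a lexeme; a full split into 2+ lemmas is a compound."""
    if word in lexicon:
        return lexicon[word]
    parts = segment(word, lexicon)
    if parts is not None and len(parts) >= 2:
        return Compound(word, tuple(parts))
    return None


# Hypothetical toy lexicon, keyed by lemma.
LEXICON = {
    "book": Lexeme("book", ("books",)),
    "shelf": Lexeme("shelf", ("shelves",)),
}
```

With this lexicon, `classify("bookshelf", LEXICON)` yields a `Compound` whose `parts` are the `book` and `shelf` lexemes, while `classify("book", LEXICON)` returns the lexeme itself. Checking the lexicon first means a word that has its own entry is never split, which is one way to answer the "compound or lexeme?" question.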

gustafl commented 8 years ago

On second thought, I feel that we could actually drop idiomatic expressions and focus on vocabulary only, accepting that we can't capture every meaningful bit in the text. Without expressions, there is much less risk of <span> overlaps. We would only struggle with compound words.
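
To make the overlap concern concrete, here is a minimal Python sketch that classifies how two annotated ranges of text relate. The `Span` type and `relation` function are hypothetical names used only for this illustration, and ranges are assumed to be half-open character offsets. Nested ranges can still be written as one element inside another, but crossing ranges cannot be expressed with properly nested <span> elements.

```python
from typing import NamedTuple


class Span(NamedTuple):
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    label: str


def relation(a: Span, b: Span) -> str:
    """Classify two spans as 'disjoint', 'nested' or 'crossing'."""
    if a.end <= b.start or b.end <= a.start:
        return "disjoint"
    a_contains_b = a.start <= b.start and b.end <= a.end
    b_contains_a = b.start <= a.start and a.end <= b.end
    if a_contains_b or b_contains_a:
        return "nested"
    return "crossing"


text = "kick the bucket"
expression = Span(0, 15, "kick the bucket")  # idiomatic expression
kick = Span(0, 4, "kick")
the = Span(5, 8, "the")
bucket = Span(9, 15, "bucket")
```

Here `relation(kick, the)` is `"disjoint"` and `relation(expression, bucket)` is `"nested"`. A compound and its parts would also be nested, which is why compounds remain the harder case once expressions are dropped.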