jzohrab / lute

DEPRECATED: LUTE (Learning Using Texts) is a self-hosted web app for learning language through reading, based on Learning with Texts (LWT)
The Unlicense

Change how Lute displays overlapping Terms, so that words/characters aren't written twice #52

Closed jzohrab closed 1 year ago

jzohrab commented 1 year ago

Terms that partially overlap are both displayed. For example, suppose you defined terms "apple ball" and "ball cat". Given the imported text "apple ball cat dog", Lute will show this as "[apple ball][ball cat][dog]". The word "ball" is shown twice, because Lute cannot decide which term should really be shown ... only you know that.
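To make the behavior concrete, here is a minimal sketch of how the doubled display can arise. It is not Lute's actual code: the function names `find_matches` and `render_with_overlaps` are hypothetical, and it assumes space-separated tokens and exact term matches.

```python
def find_matches(tokens, terms):
    """Return sorted (start, end) token spans for every term found in tokens."""
    spans = []
    for term in terms:
        words = term.split()
        n = len(words)
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == words:
                spans.append((i, i + n))
    return sorted(spans)


def render_with_overlaps(tokens, terms):
    """Render every matched term in full, even where two terms share a token."""
    spans = find_matches(tokens, terms)
    covered = set()
    for start, end in spans:
        covered.update(range(start, end))
    # Tokens that belong to no term are shown as single-token chunks.
    singles = [(i, i + 1) for i in range(len(tokens)) if i not in covered]
    chunks = sorted(spans + singles)
    return "".join("[" + " ".join(tokens[s:e]) + "]" for s, e in chunks)


tokens = "apple ball cat dog".split()
print(render_with_overlaps(tokens, ["apple ball", "ball cat"]))
# prints: [apple ball][ball cat][dog]
```

Because each match is rendered independently, the shared token "ball" appears once per term that contains it.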

Now, I know that this looks off, but it was the best solution I could come up with! For me studying Spanish, this has only occurred a few times while reading ... e.g. I have the terms "llegar a", and "a ver", which are both common constructs, and very occasionally while reading this has been rendered as "[llegar a][a ver]". It has not been enough of a bother for me to come up with an alternate solution -- after a cursory think, I believe that a good solution to this could be quite complicated, but I'd have to spend time investigating to be sure.

Issue reported by user @alguien in Discord, for the sentence "开始新生活吧,好吗?" (note that "生" is rendered twice):

image

Possible solutions, sketches only:

1. "Mouseover reveal overlap" ... maybe something like ... "show the first term completely, and show the second one in such a way that the user knows that it's partially overlapped by the first; and on mouseover of the second term show it fully, and hide part of the first term." Tricky!

2. "Mouseover popup shows full" - "show the un-overlapped portion of the second term, but on mouseover the pop-up shows the full term".

Neither of these solutions changes the fact that Lute only shows the first term fully, and that perhaps the second term is really the right one in context, but at least you wouldn't see weird repeats. Solution 2 is easier, with fewer moving parts.
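A rough sketch of solution 2, under stated assumptions: matched spans are already available as `(start, end)` token indexes, and the earlier term always wins. The function name `render_first_wins` and the `(visible, popup)` pair format are hypothetical, not anything Lute defines.

```python
def render_first_wins(tokens, spans, sep=" "):
    """Return (visible_text, popup_text) pairs for display.

    The first term is shown in full. A later overlapping term shows only
    its un-overlapped tokens, but keeps its full text for the hover pop-up.
    """
    out = []
    pos = 0  # index of the first token not yet displayed
    for start, end in sorted(spans):
        if end <= pos:
            continue  # entirely hidden behind an earlier term
        # Plain tokens between the previous chunk and this term.
        for i in range(pos, start):
            out.append((tokens[i], tokens[i]))
        visible_start = max(start, pos)
        out.append((sep.join(tokens[visible_start:end]), sep.join(tokens[start:end])))
        pos = end
    # Trailing tokens after the last term.
    for i in range(pos, len(tokens)):
        out.append((tokens[i], tokens[i]))
    return out


tokens = "apple ball cat dog".split()
print(render_first_wins(tokens, [(0, 2), (1, 3)]))
# prints: [('apple ball', 'apple ball'), ('cat', 'ball cat'), ('dog', 'dog')]
```

For languages written without spaces, passing `sep=""` with one character per token would give the same behavior.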

Comment from user @alguien in Discord:

I see. The system as it is sounds like a decent compromise for Spanish, but in Chinese I don't think it's as good a solution, because sometimes the overall meaning will get through (不知知道), but other times the combination means something else entirely and the meaning of the text gets distorted (新生生活), so it's far from ideal. It's possible that this happens less with material not aimed at beginners, but it's still going to happen from time to time regardless. If it's not so much a coding error as an unintended consequence of the algorithm, I understand it will be problematic to fix just for one language, so I won't expect a fix anytime soon. I think that giving priority to the first term would be okay here; not sure about other languages with similar issues, but I think that'd parse well with Chinese.

3. "Unhighlighted text mode" - Another possible solution, but a big change for the UI/user experience: when reading, add a "render white page" mode or something similar, in which Lute doesn't show the terms as color-coded "chunks" on a page and instead shows the text pretty much as-is (white page), but for each character/word, on hover, pops up every possible phrase that it's a part of. Then, when "render white page" is unchecked, all of the color-coded terms show up again.
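The hover lookup in solution 3 could be sketched as below. This is only an illustration: the function name `terms_covering_each_char` is hypothetical, it treats every character as its own token (as for Chinese), and it assumes "新生" and "生活" are the defined terms.

```python
def terms_covering_each_char(text, terms):
    """For each character position, list every term occurrence that covers it."""
    covering = [[] for _ in text]
    for term in terms:
        if not term:
            continue  # ignore empty terms
        start = text.find(term)
        while start != -1:
            for i in range(start, start + len(term)):
                covering[i].append(term)
            # Search again one position later so overlapping repeats are found.
            start = text.find(term, start + 1)
    return covering


text = "新生活"
for char, found in zip(text, terms_covering_each_char(text, ["新生", "生活"])):
    print(char, found)
# prints:
# 新 ['新生']
# 生 ['新生', '生活']
# 活 ['生活']
```

With a table like this, the page can show the plain text and the pop-up for "生" can list both candidate terms, leaving the choice to the reader.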