Open GoogleCodeExporter opened 9 years ago
Will try to fix this w/ the next release. :) Funny since my alphabet also has
non-ASCII letters (æøå), but I never tried the analysis/generator with them.
Original comment by tristesse
on 10 Jan 2009 at 3:27
I have the same problem.
Actually, in this case the program generates two statistics entries: one
containing the part of the word before the umlaut and/or the sharp s
ligature (eszett) “ß” or “ẞ”, and another containing the part after it.
So, for Zweckmäßigkeit you get “Zweckm” and “igkeit”. This is really
annoying in autogenerated lessons, as neither entry is an actual German word.
I’ll try words like “Überfälle”, “tränenüberströmt”,
“Gehäusegröße” and “Ölüberschussländer” to check how many parts
are produced.
I expect the output to be:
berf, lle
tr, nen, berstr, mt
Geh, usegr, e
and
l, berschussl, nder
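
The fragments described above are exactly what an ASCII-only word pattern would produce. The program's actual tokenizer is not shown in this thread, so the following Python sketch is only a guess at the mechanism; the helper name `ascii_words` and the pattern `[A-Za-z]+` are assumptions made for illustration.

```python
import re

# Assumption: the word analysis matches words with an ASCII-only pattern.
# This is a hypothetical reconstruction, not code taken from the program.
ASCII_WORD = re.compile(r"[A-Za-z]+")

def ascii_words(text):
    """Return every run of ASCII letters; any other character ends a word."""
    return ASCII_WORD.findall(text)

for word in ["Zweckmäßigkeit", "Überfälle", "tränenüberströmt", "Ölüberschussländer"]:
    # Each umlaut or ß acts as a separator, so one word yields several fragments.
    print(word, "->", ascii_words(word))
```

If the real code behaves like this, every umlaut or eszett acts as a word boundary, which would explain the extra statistics entries.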
Original comment by albedosh...@gmail.com
on 12 Aug 2011 at 9:24
Now, this is odd.
The program doesn’t have any problem recognizing the special characters in
the trigram analysis.
My test file contains these words, including some non-German words, because I
had the suspicion that the lesson generator doesn’t see non-ASCII characters
at all:
Überfälle tränenüberströmt Gehäusegröße Ölüberschussländer
Ölüberschußländer Geräteüberhöhung Gefäßüberdehnung Löß süß
FAẞBIER Øresund Ælfwine Cœr C&A
So, this is what it looks like in the typer (see image “01-Typer.png”).
I mistyped every word in the lesson, to be sure all words are used in the
lesson generator. But the lesson generator produces this lesson (see image
“02-Generated lesson.png”).
Now check the word analysis (see image “03-Analysis.png”).
The funny part is that the trigram analysis works perfectly fine (see image
“04-Trigrams.png”).
Suspicion confirmed ;)
So the word analysis seems to ignore any non-ASCII character in the text, which
obviously leads to erroneous auto-generated lessons and faulty word analyses,
whereas the letter and trigram analyses work as intended.
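
For comparison, here is a minimal sketch of a Unicode-aware word splitter in Python 3. It is not the program's code: `unicode_words` is a hypothetical helper, and whether the real fix would look like this is an assumption. The pattern `[^\W\d_]+` matches runs of word characters that are neither digits nor underscores, which for `str` patterns includes letters such as ä, ß, Ø and œ. The input is normalized to NFC first so that a decomposed "a" plus combining diaeresis does not split a word.

```python
import re
import unicodedata

# Hypothetical helper for illustration only.
# [^\W\d_] = "word character, but not a digit and not an underscore".
UNICODE_WORD = re.compile(r"[^\W\d_]+")

def unicode_words(text):
    """Split text into words made of Unicode letters."""
    # Compose base letter + combining mark into one code point first.
    composed = unicodedata.normalize("NFC", text)
    return UNICODE_WORD.findall(composed)

sample = "Überfälle tränenüberströmt FAẞBIER Øresund Ælfwine Cœr C&A"
print(unicode_words(sample))
```

With this pattern the German, Danish and Old English test words stay in one piece, while "C&A" is still split at the ampersand because "&" is not a letter.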
Original comment by albedosh...@gmail.com
on 12 Aug 2011 at 11:05
Original issue reported on code.google.com by
quietdeath@gmail.com
on 6 Jan 2009 at 10:39