Closed BeckyDTP closed 5 years ago
There are also specialist dictionaries for biology, astronomy, and other fields. So the raw dictionary name need not have a language code embedded.
Also please generate human readable unified diffs (use -u) if at all possible. Thanks
Sure. patch-u.txt
The second condition definitely needs to change.
I checked LibreOffice dictionaries. Result (6 not perfect): hunspell-names-v1.pdf
Better (though still not perfect):
name = lang->GetLanguageName(Utility::Substring(0, 2, fix_dict)) + " - " + Utility::Substring(3, fix_dict.length(), fix_dict);
Result (one not perfect – gug) hunspell-names-v2.pdf
Idea:
I hope that three-letter abbreviations (e.g. gug) will then be displayed without the full name, because there is no underscore in the name.
gug is not an official iso639, 2 character language name.
yes, it's an official iso639-3. 2-char language names are iso639-1.
But as I said, it is not an official iso639-1 (2 character) language name. Yes, there are 3-letter codes and additional iso specs, but we do not use them. Nor, it seems, does any other hunspell dictionary.
After thinking, I think there are two options:
or
QString name = dict
Then it will be fair - the list will show the names that the user himself gave to the dictionaries.
Currently, Spanish and French are treated better than other dictionaries. There are too many naming options, so instead of multiplying the conditions, the names of the dictionary files will appear in the list.
I think I actually like the last version of the patch with the fallback to just using the dict name. If dictionary makers ignore iso639-1 when naming their dictionary, then defaulting to the dict name makes sense.
I will consider applying this for a future release when things calm down a bit.
Okay, I modified your patch a bit and added the proper gug 3 letter code (from the iso639-3 list). I pushed it to master. I have not tested it at all.
Please do a pull from today's master and check your list again.
Thanks
I tried this method yesterday. It's not perfect, because for these files it repeats the name of the main language:
ca -> Catalan
ca-valencia -> Catalan
de_DE -> German
de_AT_frami -> German
de_CH_frami -> German
de_DE_frami -> German
es -> Spanish
es_ANY -> Spanish
sr -> Serbian
sr-Latn -> Serbian
So you feel it would be better to just use the raw dictionary name?
Perhaps we do your replacement of "_" with "-", then split the resulting string on "-".
If the result is one part, simply pass it to the language code lookup; if the lookup comes back empty, go with the raw dict name.
If the result has two or more parts, put the first two back together and look that up. If nothing is found, try with just the first part, and if you get something, append the remaining unused parts to the name returned. If still nothing, go with the raw dict name.
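The steps above can be sketched roughly as follows. This is only an illustration in plain standard C++, not the code that was pushed to master: `LookupLanguageName`, its small table, `SplitOnHyphen`, `DisplayNameForDictionary`, and the `sep` parameter are all hypothetical names made up for the example. It also assumes that leftover parts are appended whether the two-part or the one-part lookup succeeded.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the real language-code lookup.
// Returns an empty string when the code is not known.
static std::string LookupLanguageName(const std::string &code)
{
    static const std::unordered_map<std::string, std::string> table = {
        {"ca", "Catalan"},
        {"cs", "Czech"},
        {"de", "German"},
        {"de-AT", "German (Austria)"},
        {"es", "Spanish"},
        {"sr", "Serbian"},
    };
    auto it = table.find(code);
    return it == table.end() ? std::string() : it->second;
}

// Split a string on '-' (always returns at least one element).
static std::vector<std::string> SplitOnHyphen(const std::string &s)
{
    std::vector<std::string> parts;
    std::string::size_type start = 0;
    while (true) {
        std::string::size_type pos = s.find('-', start);
        if (pos == std::string::npos) {
            parts.push_back(s.substr(start));
            break;
        }
        parts.push_back(s.substr(start, pos - start));
        start = pos + 1;
    }
    return parts;
}

// Build the name shown in the dictionary list from a dictionary file name.
// 'sep' is the text placed between the language name and leftover parts.
std::string DisplayNameForDictionary(const std::string &dict,
                                     const std::string &sep = "-")
{
    // Hunspell file names use '_', the lookup uses '-'.
    std::string fixed = dict;
    for (char &c : fixed) {
        if (c == '_') c = '-';
    }
    std::vector<std::string> parts = SplitOnHyphen(fixed);

    // One part: look it up directly, else fall back to the raw name.
    if (parts.size() == 1) {
        std::string name = LookupLanguageName(parts[0]);
        return name.empty() ? dict : name;
    }

    // Two or more parts: try the first two joined, then the first alone.
    std::string name = LookupLanguageName(parts[0] + "-" + parts[1]);
    std::size_t used = 2;
    if (name.empty()) {
        name = LookupLanguageName(parts[0]);
        used = 1;
    }
    if (name.empty()) {
        return dict;  // nothing matched: show the raw dictionary name
    }

    // Append whatever parts the successful lookup did not consume.
    for (std::size_t i = used; i < parts.size(); ++i) {
        name += sep + parts[i];
    }
    return name;
}
```

With this toy table, `de_DE` and `de_AT_frami` no longer both collapse to plain "German", and a name with no matching code such as `kmr_Latn` is returned unchanged.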
It can work. That's exactly what I meant, but it was hard for me to put into words. I think that this will support the vast majority of existing dictionaries, and for others the name of the dictionary file will be displayed.
I will code that up tomorrow and push it to master and let you know so you can test it.
Had a few moments this weekend and pushed what we talked about above. Please give it a good test and let me know what changes are still needed.
Possibly add spaces around the hyphen (" - ") in name.append; it looks nicer in the list (Czech-CZ -> Czech - CZ).
It is OK for me. Of 64 dictionaries, only kmr_Latn (Northern Kurdish) is displayed as a file name, but there is no need to add the entire 3-letter list of languages, since 99.9% of such dictionaries do not even exist.
Will make that change and close the issue tonight, after work.
Hunspell dictionary file names use underscores: xx_XX. GetLanguage uses names with hyphens: xx-XX.
Before
The proposed fix may not be perfect but works. The second condition is for dictionaries such as Polish, which do not have the standard two-letter version (pl), but only the extended one (pl_PL).
patch.txt
After