aikuma / aikuma-ng

Speech annotation web app for regular folk

Improve language selector directive #46

Closed Lingomat closed 8 years ago

Lingomat commented 8 years ago

If you enter a custom language of 'taa', it is added to the UI as a 'chip' labelled Taa. When you navigate to some other view and then return, the chip instead shows the full name of the language that has 'taa' as its ISO code. This is undesirable.

Also consider that autocomplete should work when the user types the ISO language code.

There probably also needs to be a more robust approach to language name overrides. We let the user type 中文 to select Chinese, but if they have typed Chinese, they still see 中文. So the best approach will be to implement a language override format, ideally in an external data file loaded along with the ISO list, which specifies alternative names keyed by the localisation code.

Netesten commented 8 years ago

In case it is useful: I referred to https://en.wikipedia.org/wiki/N%C7%81ng_language, which mentions the ISO 639-3 code ngh. I am just wondering whether an Aikuma user would know that level of detail.

Lingomat commented 8 years ago

It turns out fixing this is more difficult because our data structure doesn't match what was originally intended. It needs to be fixed, but I'm sticking it on the backburner for now.

Lingomat commented 8 years ago

A short spell on the backburner :) This is the complete desired behavior for ng-language-selector.

The language selector autocomplete should match on the full language name text and on language ISO codes. It should highlight the search text in the autocomplete options (there is a flag for this).

Localised language name overrides. This feature is desirable so that users can use their language to search for languages that are most common for them. It will be necessary to get a list of all aliases to build the autocomplete search string.

We need two fields for a language: an optional ISO code and a full name string. There is no getting around this. This is justified because using a particular name for a language is often perceived as a political act. We will run into trouble if we tell the user what they must call a language.

Example: The ISO code cmn for Chinese Mandarin covers multiple countries. The names for Chinese include 中文, 漢語, 國語 (in Taiwan), and 普通話 (the technical full term for Mandarin). The user should be able to type any of these and see autocomplete entries which all have (cmn) next to them. When they select one, WE SAVE THE ONE THEY HAVE CHOSEN. I think it's easiest to simply record the actual name string as well as the ISO code.
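A minimal sketch of the two-field record described above, in TypeScript. The names `LanguageChoice` and `chipLabel` are hypothetical and not taken from the aikuma-ng code; the point is only that the chosen name is stored verbatim and the ISO code is optional.

```typescript
// Hypothetical shape: the name the user picked plus an optional ISO 639-3 code.
interface LanguageChoice {
  name: string; // exactly what the user selected, e.g. "國語"
  iso?: string; // e.g. "cmn"; absent for custom languages
}

// Label for a chip: "國語 (cmn)" when an ISO code exists, the bare name otherwise.
function chipLabel(lang: LanguageChoice): string {
  return lang.iso ? `${lang.name} (${lang.iso})` : lang.name;
}

const saved: LanguageChoice = { name: "國語", iso: "cmn" };
const custom: LanguageChoice = { name: "Taa" };

console.log(chipLabel(saved)); // 國語 (cmn)
console.log(chipLabel(custom)); // Taa
```

Because the name is stored rather than looked up from the code, returning to the view shows the same string the user chose.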

It is desirable to be able to type these aliases in any language. So it seems to me the easiest way to approach this is just to have a single file of aliases similar to the ISO code list and stick that in ext-data/ as well, perhaps as language_aliases.txt. Now we can read this at the same time as reading the ISO csv (in aikumaService). We just stick the data in exactly the same array.

Custom languages. When the user types a language which does not exist and hits return, the language is saved in the user data in the same way as tags. Autocomplete will recognise this custom language name from then on. While this is being done, we should also store, in the user data, a list of all the languages that the user has ever selected. This will be used for a language quick selector to be implemented later (incorporated into the language selector as a small button that pops up a list of languages the user has ever used, to be selected by mouse click instead of typing). When a custom language is displayed as a chip, we do not display the brackets and ISO code because they are irrelevant.

Catch the edge case of the user typing an ISO code or a full language name (or alias) and hitting return rather than using autocomplete. The selector should put the full appropriate chip in; then, if there is a conflict, the user will see it from the expansion of the chip.

Prefer custom languages ahead of other results. This should be a matter of placing the custom languages at the top of the array used by autocomplete.
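One way the user-data bookkeeping could look, as a sketch. The `UserData` fields `customLanguages` and `languageHistory` are assumed names for illustration; the real user data layout is not shown in this thread.

```typescript
interface LanguageChoice {
  name: string;
  iso?: string; // absent for custom languages
}

// Hypothetical slice of the user data.
interface UserData {
  customLanguages: LanguageChoice[]; // feeds autocomplete
  languageHistory: LanguageChoice[]; // feeds the future quick selector
}

function sameLanguage(a: LanguageChoice, b: LanguageChoice): boolean {
  return a.name === b.name && a.iso === b.iso;
}

function addIfMissing(list: LanguageChoice[], choice: LanguageChoice): void {
  if (!list.some((l) => sameLanguage(l, choice))) list.push(choice);
}

// Call on every selection; entries without an ISO code count as custom.
function recordSelection(user: UserData, choice: LanguageChoice): void {
  if (choice.iso === undefined) addIfMissing(user.customLanguages, choice);
  addIfMissing(user.languageHistory, choice);
}

const user: UserData = { customLanguages: [], languageHistory: [] };
recordSelection(user, { name: "Taa" });
recordSelection(user, { name: "中文", iso: "cmn" });
recordSelection(user, { name: "Taa" }); // repeat selection, stored once
```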
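The return-key case could be handled roughly as follows. This is a sketch under assumptions: `resolveTyped` is a hypothetical helper, matching is exact and case-insensitive, and a name match is tried before an ISO code match.

```typescript
interface LanguageEntry {
  name: string;
  iso?: string;
}

// Turn raw typed text into the entry that should become the chip.
function resolveTyped(text: string, list: LanguageEntry[]): LanguageEntry {
  const q = text.trim().toLowerCase();
  const byName = list.filter((l) => l.name.toLowerCase() === q);
  if (byName.length > 0) return byName[0];
  const byIso = list.filter(
    (l) => l.iso !== undefined && l.iso.toLowerCase() === q
  );
  if (byIso.length > 0) return byIso[0];
  return { name: text.trim() }; // nothing matched: treat as a custom language
}

// Illustrative data only.
const known: LanguageEntry[] = [
  { name: "Mandarin Chinese", iso: "cmn" },
  { name: "中文", iso: "cmn" },
];

console.log(resolveTyped("cmn", known)); // the "Mandarin Chinese" entry
console.log(resolveTyped("中文", known)); // the "中文" entry
console.log(resolveTyped("Taa", known)); // { name: "Taa" }, a custom language
```

If the typed text happens to be an ISO code the user did not intend, the chip expands to the full name with the code, which is how the user sees the conflict.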
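As a sketch of the ordering idea, the autocomplete array can be built with custom languages first and then filtered on name or ISO code. The helper names and the sample rows are illustrative assumptions, not the project's code.

```typescript
interface LanguageEntry {
  name: string;
  iso?: string;
}

// Custom languages go first so they appear at the top of the suggestions.
function buildAutocompleteList(
  custom: LanguageEntry[],
  standard: LanguageEntry[]
): LanguageEntry[] {
  return custom.concat(standard);
}

// Case-insensitive match: substring of the name, or prefix of the ISO code.
function searchLanguages(list: LanguageEntry[], query: string): LanguageEntry[] {
  const q = query.trim().toLowerCase();
  if (!q) return [];
  return list.filter(
    (l) =>
      l.name.toLowerCase().indexOf(q) !== -1 ||
      (l.iso !== undefined && l.iso.toLowerCase().indexOf(q) === 0)
  );
}

const all = buildAutocompleteList(
  [{ name: "Taa" }],
  [
    { name: "Mandarin Chinese", iso: "cmn" },
    { name: "中文", iso: "cmn" },
    { name: "Lower Tanana", iso: "taa" },
  ]
);

// The custom "Taa" is listed ahead of the ISO entry with code "taa".
console.log(searchLanguages(all, "taa").map((l) => l.name));
```

Since `filter` preserves order, putting custom entries at the front of the array is enough to rank them first.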

Lingomat commented 8 years ago

Note: This is going to change the data structure and is going to break existing data for users.

This is problematic since Chrome Apps auto update. How do we address this? It seems like we might need a general data format upgrade scheme in case we wish to do this in future.
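One common shape for such a scheme is a version number on the stored data plus an ordered list of migrations, run at load time. The sketch below assumes, purely for illustration, that version 1 stored languages as bare strings and version 2 stores `{name, iso?}` objects; the actual old format is not described in this thread.

```typescript
// Loosely typed on purpose: stored data from older versions has unknown shape.
type Stored = { version?: number; [key: string]: any };
type Migration = (data: Stored) => Stored;

// migrations[i] upgrades data from version i + 1 to version i + 2.
const migrations: Migration[] = [
  // v1 -> v2 (hypothetical): bare language strings become objects with a name.
  (data) => ({
    ...data,
    languages: (data.languages || []).map((name: string) => ({ name })),
  }),
];

const CURRENT_VERSION = migrations.length + 1;

// Apply every migration between the stored version and the current one.
function upgrade(data: Stored): Stored {
  let version = data.version || 1; // data saved before versioning counts as v1
  let result = data;
  while (version < CURRENT_VERSION) {
    result = migrations[version - 1](result);
    version += 1;
    result.version = version;
  }
  return result;
}

const upgraded = upgrade({ languages: ["Taa"] });
console.log(upgraded.version, upgraded.languages);
```

Running `upgrade` on data that is already current returns it unchanged, so it is safe to call on every load after an auto update.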

Lingomat commented 8 years ago

What is the status of this?

lisaslyis commented 8 years ago

For the localized name strings of a language, a JSON file will be created, placed in the same folder as the ISO language list, and loaded by the app to construct the multiple string options in the language-selector autocomplete input: Chinese(cmn), 中文(cmn), 漢語(cmn), 國語(cmn)
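A rough sketch of how such a file could be turned into autocomplete options. The JSON layout (ISO code mapped to a list of names) and the helper name `aliasEntries` are assumptions for illustration; the file format was not specified in this thread.

```typescript
interface LanguageEntry {
  name: string;
  iso: string;
}

// Hypothetical file contents: ISO code -> localized names for that language.
const aliasJson = '{"cmn": ["Chinese", "中文", "漢語", "國語"]}';

// Flatten the JSON into one entry per (name, iso) pair.
function aliasEntries(json: string): LanguageEntry[] {
  const byIso: { [iso: string]: string[] } = JSON.parse(json);
  const entries: LanguageEntry[] = [];
  Object.keys(byIso).forEach((iso) => {
    byIso[iso].forEach((name) => {
      entries.push({ name, iso });
    });
  });
  return entries;
}

// Stand-in for the entries already loaded from the ISO list.
const isoList: LanguageEntry[] = [{ name: "Mandarin Chinese", iso: "cmn" }];

// Aliases are appended to the same array that autocomplete searches.
const merged = isoList.concat(aliasEntries(aliasJson));
const labels = merged.map((e) => `${e.name} (${e.iso})`);
console.log(labels.join(", "));
// Mandarin Chinese (cmn), Chinese (cmn), 中文 (cmn), 漢語 (cmn), 國語 (cmn)
```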

Lingomat commented 8 years ago

Looks good!