adding family classification for doculects

thiagochacon commented 9 years ago

Working with the Huber data, I realized that it would be useful if Edictor could also provide genealogical information for each doculect, at least at a higher level such as family classification, but also subgroups if at all feasible.

This is super important for testing hypotheses of genetic relationships between possibly distantly related languages, but also for identifying homologs across different language families.

In the spirit of Edictor, this could be an additional column, which then could be "switched off" for some particular kinds of analysis.

LinguList commented 9 years ago

Yes, this is a good idea, and it is definitely possible. Here are three possible solutions:

1. A simple solution (for me) is based on an interface that opens the database in a specific "view". Here's an example for the Sino-Tibetan languages we are currently working on: http://dighl.github.io/sinotibetan/ In this example, you can click on the languages you want to work on or search for them (the subgroup is given in brackets, so searching for and selecting a single subgroup is straightforward), and then work only on this specific sample.
2. We could add a similar subgrouping option to the "DOCULECTS" selector of the EDICTOR. The doculects would then be displayed with their affiliation (if given), and selecting a single group would be straightforward via text search in the field.
3. We could also (this won't happen soon, since it is a bit complicated) add a tree viewer for browsing the genealogical tree of the group, perhaps even clicking on a given node and then on the languages one wants to work on. This is basically possible with tools I have worked on in the past (see, e.g., the Tree-Explorer at http://dighl.github.io/TREX). It will, however, take some time to implement.
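
To illustrate the text search over doculect labels described above, here is a minimal Python sketch. It is not EDICTOR's actual code (EDICTOR runs in the browser); the function names `label` and `search_doculects` and the placeholder doculect and subgroup names are hypothetical and only show the idea of putting the subgroup in brackets and matching against the full label.

```python
def label(doculect, subgroups):
    """Return 'Name (Subgroup)' if a subgroup is known, else just the name."""
    group = subgroups.get(doculect)
    return f"{doculect} ({group})" if group else doculect


def search_doculects(query, doculects, subgroups):
    """Return doculects whose display label contains the query (case-insensitive)."""
    q = query.lower()
    return [d for d in doculects if q in label(d, subgroups).lower()]


# Placeholder data (hypothetical names, for illustration only).
subgroups = {"DoculectA": "GroupX", "DoculectB": "GroupX", "DoculectC": "GroupY"}
doculects = ["DoculectA", "DoculectB", "DoculectC", "DoculectD"]

# Searching for the bracketed subgroup selects exactly that subgroup's doculects.
selected = search_doculects("(groupx)", doculects, subgroups)
```

Because the subgroup is part of the label, no separate filter control is needed: a query such as `(groupx)` picks out one subgroup, while a plain name still finds a single doculect.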

For the time being, solution 1 seems the easiest to implement. Disadvantage: switching between subgroups requires loading them in separate pages or reloading the data each time. Advantage: since the interface does not pull all data from the database, the initial load is much faster, and it also allows convenient testing of a single language (for error correction by experts, for example).

LinguList commented 9 years ago

Ah, forgot that: what I would need is a list of subgroups for each doculect, in the form of:

DoculectName   SubGroupName
...
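
A list in this form could be read roughly as follows. This is only a sketch under assumptions: the thread does not state the delimiter, so a tab between the two columns is assumed here, and the helper names `read_subgroups` and `by_subgroup` as well as the sample rows are hypothetical.

```python
import csv
import io
from collections import defaultdict


def read_subgroups(handle):
    """Parse 'DoculectName<TAB>SubGroupName' lines into a dict (tab is an assumption)."""
    subgroups = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2 or not row[0].strip():
            continue  # skip blank or incomplete lines
        subgroups[row[0].strip()] = row[1].strip()
    return subgroups


def by_subgroup(subgroups):
    """Invert the mapping: subgroup -> sorted list of its doculects."""
    groups = defaultdict(list)
    for doculect, group in subgroups.items():
        groups[group].append(doculect)
    return {g: sorted(ds) for g, ds in groups.items()}


# Placeholder rows (hypothetical names); a real file would be opened with open(path).
sample = io.StringIO("DoculectA\tGroupX\nDoculectB\tGroupX\n\nDoculectC\tGroupY\n")
subgroups = read_subgroups(sample)
groups = by_subgroup(subgroups)
```

The inverted mapping makes it easy to load or display one subgroup at a time, which matches the per-subgroup workflow discussed earlier in the thread.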

LinguList commented 8 years ago

I'm closing this issue now, since family classification is essentially something to be handled in the settings menu of the EDICTOR.

digling / edictor

adding family classification for doculects #53