Signbank / Global-signbank

An online sign dictionary and sign database management system for research purposes. Developed originally by Steve Cassidy. This repo is a fork for the Dutch version, previously called 'NGT-Signbank'.
http://signbank.cls.ru.nl
BSD 3-Clause "New" or "Revised" License

Functionality to suggest lemmas for existing glosses. #601

Open Woseseltops opened 4 years ago

Woseseltops commented 4 years ago

Idea coming out of a conversation with @susanodd: much cooler results will come out of #354 if all glosses that belong to one lemma are also identified as such in Signbank. Perhaps we can help the sign linguists and create functionality that:

  1. Lists all glosses that are very likely to be part of one lemma based on their name (for example: BAL-A and BAL-B).
  2. For each item in the list, offers a button that automatically creates this lemma and rewires the existing glosses (so one lemma 'BAL', with 'BAL-A' and 'BAL-B' both getting it as their lemma).
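
A minimal sketch of step 1, assuming variants are named as a base followed by a single capital-letter suffix (as in BAL-A and BAL-B). The function name `suggest_lemma_groups` and the plain list of gloss names are hypothetical and not taken from the Signbank code base; a real version would read the names from the gloss annotations.

```python
import re
from collections import defaultdict

# Assumed naming convention: "<base>-<single capital letter>", e.g. "BAL-A".
VARIANT_SUFFIX = re.compile(r"^(?P<base>.+)-(?P<suffix>[A-Z])$")


def suggest_lemma_groups(gloss_names):
    """Group gloss names that share a base name (hypothetical helper).

    Returns a dict mapping each candidate lemma name to the sorted list of
    glosses that would be rewired to it. Bases with a single gloss are
    dropped, since there is nothing to group.
    """
    groups = defaultdict(list)
    for name in gloss_names:
        match = VARIANT_SUFFIX.match(name)
        if match:
            groups[match.group("base")].append(name)
    return {
        base: sorted(names)
        for base, names in groups.items()
        if len(names) >= 2
    }


# Example with made-up gloss names.
candidates = suggest_lemma_groups(["BAL-A", "BAL-B", "HUIS", "FIETS-A"])
```

Each key in `candidates` would correspond to one row in the list, with the button from step 2 attached to it.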

@ocrasborn what do you think?

ocrasborn commented 4 years ago

I wish it were this simple! In most cases, the -A -B -C variants are synonyms that stem from dialectal variation. Hopefully we can find evidence for this in the regional frequencies. But I do like the idea; we'd have to think about an algorithm that takes this regional variation into account. It seems we now mostly group items in a lemma if they're phonologically related. So perhaps like this: suggest a lemma relation iff no more than two phonology fields differ and iff there is no strong evidence for dialectal variation in the regional frequencies info. Would that be doable?
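
A rough sketch of that rule, under stated assumptions: each gloss is a plain dict with a `phonology` mapping and a `frequencies` mapping (region to count), and "strong evidence for dialectal variation" is read as the two glosses having almost no regional overlap. The field names, the overlap measure, and the `min_overlap` threshold are all hypothetical choices for illustration, not how Signbank stores or evaluates this.

```python
def differing_fields(phon_a, phon_b, fields):
    """Return the phonology fields whose values differ between two glosses."""
    return [f for f in fields if phon_a.get(f) != phon_b.get(f)]


def region_overlap(freq_a, freq_b):
    """Overlap of two regional distributions, or None without data.

    Counts are normalised to shares per gloss; the overlap is the sum of the
    smaller share per region (0.0 = disjoint regions, 1.0 = identical spread).
    """
    total_a = sum(freq_a.values())
    total_b = sum(freq_b.values())
    if total_a == 0 or total_b == 0:
        return None
    regions = set(freq_a) | set(freq_b)
    return sum(
        min(freq_a.get(r, 0) / total_a, freq_b.get(r, 0) / total_b)
        for r in regions
    )


def suggest_lemma_relation(gloss_a, gloss_b, fields, max_diff=2, min_overlap=0.2):
    """Suggest a lemma relation iff both conditions from the comment hold."""
    diff = differing_fields(gloss_a["phonology"], gloss_b["phonology"], fields)
    if len(diff) > max_diff:
        return False
    overlap = region_overlap(gloss_a["frequencies"], gloss_b["frequencies"])
    # Assumption: missing frequency data does not count as evidence
    # for dialectal variation, so the suggestion is still made.
    return overlap is None or overlap >= min_overlap
```

The threshold would need tuning against real frequency data; a pair that is phonologically close but used in disjoint regions is rejected here and is more likely a dialectal synonym than a lemma candidate.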

Woseseltops commented 4 years ago

Aha, okay, apparently a 'lemma' is not what I thought it was.

> So perhaps like this: suggest a lemma relation iff there are no more than two phonology fields different and iff there is not strong evidence for dialectal variation from the regional frequencies info. Would that be doable?

Yes, doable, but then we still don't have what we need for #354: a way to find interesting cases where gloss a and gloss b mean the same thing, but region 1 uses gloss a and region 2 uses gloss b. Apparently, it's not just lemmas but mainly synonyms that we are interested in.

susanodd commented 4 years ago

I added variants to the Frequency View tab. They are shown whenever a gloss has variants: either relations saved by the user or, when there is no other relation between the glosses, pattern variants. Issue #602 needs to be done first to fill the tables.

susanodd commented 4 years ago

It still won't make suggestions, but if there are variants, they will show up so the user doesn't need to keep searching for them.