digling / intelligibility

MIT License

Tasks that we can test #5

Closed LinguList closed 10 months ago

LinguList commented 1 year ago

@justalingwist, with the data we have assembled, we can check the following tasks, if I am not mistaken:

  1. learn to predict a word in one language from the corresponding word in the other language, for cognates and borrowing-cognates
  2. learn to predict the meaning of a word one "hears"
  3. learn to predict how a word one wants to express "sounds"

For task 1, I think you can use LDL to predict a Dutch word from its German cognate, right? If not, we should test this with machine learning (seq2seq / transformers) to see how well we can predict the phonetic transformations.
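To make task 1 concrete, here is a minimal sketch of how such a prediction could be scored. It is neither LDL nor a seq2seq model, just a naive baseline that maps each source segment to its most frequent target segment and compares the prediction to the attested form by normalized edit distance. The function names (`train`, `predict`, `score`) and the toy German/Dutch pairs are assumptions for illustration only; the transcriptions are simplified and not taken from the data in this repository.

```python
from collections import Counter, defaultdict


def edit_distance(a, b):
    """Levenshtein distance between two sequences of segments."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (x != y)))  # substitution
        prev = curr
    return prev[-1]


def train(pairs):
    """Learn the most frequent target segment for each source segment.

    Assumption: only pairs of equal length are used, so that segments
    can be matched by position without a real alignment step.
    """
    counts = defaultdict(Counter)
    for src, tgt in pairs:
        if len(src) != len(tgt):
            continue
        for s, t in zip(src, tgt):
            counts[s][t] += 1
    return {s: c.most_common(1)[0][0] for s, c in counts.items()}


def predict(model, src):
    """Map each source segment; unseen segments are copied unchanged."""
    return [model.get(s, s) for s in src]


def score(model, pairs):
    """Mean normalized edit distance between prediction and target."""
    total = 0.0
    for src, tgt in pairs:
        pred = predict(model, src)
        total += edit_distance(pred, tgt) / max(len(pred), len(tgt))
    return total / len(pairs)


# Toy German -> Dutch cognate pairs (illustrative, simplified segments).
TRAIN = [("h au s".split(), "h œy s".split()),
         ("m au s".split(), "m œy s".split())]
TEST = [("l au s".split(), "l œy s".split())]

model = train(TRAIN)
print(predict(model, TEST[0][0]), score(model, TEST))
```

The same `score` function could be applied to the output of LDL or of a seq2seq model, so that all approaches are compared on the same held-out cognate pairs.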

For the other tasks, only the tests would differ, not the training, since the setting is identical for all tasks (learn one language, see whether you capture the other).

Is that more or less correct?

LinguList commented 1 year ago

BTW, @justalingwist, I have now updated the data; you will find sound classes in the files for celex in data.