dictionaria / kalamang

Kalamang dictionary by Eline Visser
https://dictionaria.clld.org/contributions/kalamang
Creative Commons Attribution 4.0 International

Can we add an orthography profile? #1

Open LinguList opened 3 years ago

LinguList commented 3 years ago

I checked the IDS data, and the orthography profile is easy to create from the phonetics description in the grammar. I will propose a PR to IDS for this resource, so one could probably also add the profile here. Furthermore, as IDS is Concepticon-mapped, maybe the Concepticon links should be fed from that IDS dataset? Or is this already done?

elinevisser23 commented 3 years ago

@LinguList I just stumbled over this by chance... Can I help with anything?

LinguList commented 3 years ago

Ah, it's the author, very nice that you are on GitHub!

LinguList commented 3 years ago

So I checked your Kalamang data on IDS and added an orthography profile, which basically converts your orthography to normalized IPA, which is useful for standardization. I found three cases (probably loans?) where the data has sequences of more than two vowels (which your grammar says would not occur, if I read it right?).
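To illustrate what such a profile does, here is a minimal sketch in plain Python. The grapheme-to-IPA table is invented for illustration and is not the actual Kalamang profile; a real workflow would more likely load a profile file with a dedicated tool, whereas this sketch only uses a greedy longest-match over a hypothetical mapping. The helper `longest_vowel_run` shows one way the vowel-sequence cases could be flagged.

```python
# Hypothetical grapheme -> IPA table (an assumption, NOT the real Kalamang profile).
PROFILE = {
    "ng": "ŋ", "n": "n", "g": "g", "k": "k", "t": "t", "p": "p",
    "m": "m", "l": "l", "r": "r", "s": "s", "j": "dʒ",
    "a": "a", "e": "e", "i": "i", "o": "o", "u": "u",
}
VOWELS = {"a", "e", "i", "o", "u"}


def segment(word, profile=PROFILE):
    """Greedy longest-match segmentation; unknown characters become '<?x>'."""
    max_len = max(len(grapheme) for grapheme in profile)
    segments = []
    i = 0
    while i < len(word):
        # Try the longest candidate grapheme first, then shorter ones.
        for size in range(min(max_len, len(word) - i), 0, -1):
            chunk = word[i:i + size]
            if chunk in profile:
                segments.append(profile[chunk])
                i += size
                break
        else:
            # No grapheme matched: flag the character instead of dropping it.
            segments.append(f"<?{word[i]}>")
            i += 1
    return segments


def longest_vowel_run(segments, vowels=VOWELS):
    """Length of the longest run of consecutive vowel segments."""
    longest = run = 0
    for seg in segments:
        run = run + 1 if seg in vowels else 0
        longest = max(longest, run)
    return longest
```

Forms for which `longest_vowel_run(segment(form))` exceeds 2 would be the ones to check by hand; the strings passed in here are up to the user and the sketch makes no claim about actual Kalamang words.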

In any case, since you have a Dictionaria dataset and an IDS dataset, it would be easiest to combine both datasets, so that one has a link from one to the other.

But our experiments so far have left us a bit helpless, since we could not identify a direct link between the IDS dataset and the Dictionaria dataset (by comparing word forms) in all cases. After inspecting the IDS translations for individual concepts, I'd also assume that the IDS meaning descriptions are so loose that they unfortunately force authors to provide a lot of synonyms for concepts which we have specified much more clearly in the Concepticon project.
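For context, linking by word form can be sketched roughly as follows. This is not the script we actually used: the function names, the row layout `(id, form)`, and the normalization rules are assumptions made for illustration. It shows why such matching stays incomplete, since any form that differs after normalization ends up in the unmatched list.

```python
import unicodedata


def normalize(form):
    """Lowercase, NFC-normalize, and drop anything that is not a letter or digit."""
    form = unicodedata.normalize("NFC", form).lower().strip()
    return "".join(ch for ch in form if ch.isalnum())


def link_by_form(ids_rows, dict_rows):
    """Match IDS rows to dictionary entries by normalized form.

    Both arguments are iterables of (identifier, form) pairs (hypothetical layout).
    Returns (links, unmatched): links maps an IDS id to a list of entry ids,
    unmatched lists IDS ids with no identical normalized form.
    """
    index = {}
    for entry_id, form in dict_rows:
        index.setdefault(normalize(form), []).append(entry_id)

    links, unmatched = {}, []
    for ids_id, form in ids_rows:
        hits = index.get(normalize(form), [])
        if hits:
            links[ids_id] = hits
        else:
            unmatched.append(ids_id)
    return links, unmatched
```

Exact matching on normalized forms is deliberately strict; a fuzzy comparison would find more candidates but would also need manual checking.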

LinguList commented 3 years ago

One direct question here would be: did you have links from your dictionary to the IDS dataset when you prepared the latter?

LinguList commented 3 years ago

You can btw also just write by email, and we can include the dictionaria editors.

elinevisser23 commented 3 years ago

I am not sure who the Dictionaria editors are (my contact was Iren Hartmann, but she quit a while ago), so I continue here.

No, I did not have links from the dictionary to the IDS dataset. I asked the editors about that, but unfortunately it wasn't possible. I remember I got an Excel file with some data pulled from LexiRumah, though.

The 2+ vowel sequences are either sloppy orthography on my side, or multimorphemic words.

elinevisser23 commented 3 years ago

It was Robert Forkel who made that Excel file.