lexibank / abvd

CLDF dataset derived from Greenhill et al.'s "Austronesian Basic Vocabulary Database" from 2020.
https://abvd.eva.mpg.de
Creative Commons Attribution 4.0 International

subcognacy question #16

Closed HedvigS closed 1 year ago

HedvigS commented 1 year ago

I have a question about the interpretation of the data in cases where more than one cognacy class is listed.

I thought that if a word had more than one cognacy class listed, then one of them (probably the second) represents a subcognacy class. Like "wahine" in Hawai'ian being "1, 116, 106", which would mean that all forms that get 116 also get 1 (but not all that have 1 get 116).

For water, I found that some words have cognacy "1,2", some have "1", and some have only "2".

Does my original assumption hold and these are a type of error, or is my assumption wrong?
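The nesting assumption described above can be tested mechanically. The following is a minimal sketch, not part of the ABVD or lexibank tooling: it assumes the first listed code is the parent class and any later code is a candidate subclass, and it reports forms that carry a candidate subclass without its parent. The function names and the `lang_a`/`lang_b`/`lang_c` keys are hypothetical, and the toy input only mirrors the "water" pattern from this comment.

```python
def parse_codes(raw):
    """Split a raw cognacy string such as "1, 116, 106" into a list of codes."""
    return [code.strip() for code in raw.split(",") if code.strip()]


def nesting_violations(cognacy_by_form):
    """Find forms that break the assumed parent/subclass nesting.

    Assumption (not confirmed by the dataset): in a multi-coded entry the
    first code is the parent and each later code is a subclass of it.
    Returns {(parent, sub): [forms that have `sub` but lack `parent`]}.
    """
    coded = {form: parse_codes(raw) for form, raw in cognacy_by_form.items()}

    # Collect every (parent, sub) pair suggested by a multi-coded entry.
    candidates = set()
    for codes in coded.values():
        for sub in codes[1:]:
            candidates.add((codes[0], sub))

    # A violation is a form that carries the subclass without the parent.
    violations = {}
    for parent, sub in sorted(candidates):
        offenders = sorted(
            form for form, codes in coded.items()
            if sub in codes and parent not in codes
        )
        if offenders:
            violations[(parent, sub)] = offenders
    return violations


# Toy data mirroring the "water" case: "1,2", "1", and only "2".
water = {"lang_a": "1,2", "lang_b": "1", "lang_c": "2"}
print(nesting_violations(water))
```

An empty result would be consistent with the subcognacy reading for that concept; a non-empty one only shows that the nesting does not hold, not why.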

LinguList commented 1 year ago

It is not always clear what people used the double-classes for.

LinguList commented 1 year ago

In my opinion, they should not be used, and as far as I understand, they are trying to fix this now.

HedvigS commented 1 year ago

If the original assumption I had holds, I think it makes a lot of sense to have subcognacy classes. I find them helpful, and I can ignore them if I need.

LinguList commented 1 year ago

The assignment of multiple cognate sets is not systematic, judging from what I have seen. So even ignoring is difficult (@maryewal was checking this for Polynesian, when working on revised judgments, right?)

maryewal commented 1 year ago

Multiple codes might be sub-cognacy or might be partial cognacy (I gave a talk on this at a CoOL meeting). Indeed, it is not always consistent how this has been used, but it is not necessarily an error. I would have to look at the exact item you are referring to in order to speak to this better. As I mentioned in our recent email exchange on this, as we go through and refine coding for certain phylogenies, we are correcting coding. In some cases, and after thorough checks, we have made judgments to exclude 3rd/4th levels of codes.

HedvigS commented 1 year ago

Okay, thank you @maryewal .

I'd be very grateful for general advice on this, especially on whether the right approach is to select the first one, select a random one, or use customised methods depending on whether it's sub-cognacy or partial cognacy (which may be detectable by patterns of cognacy values). If there is a chance to include this in the meta-information in the CLDF release, that would be awesome.

LinguList commented 1 year ago

I guess the question is: what do you NEED this for? If you want to work with lexical data, you would be better off using the CLDF data from abvdoceanic, which comes with the cognates in a table, where the problem was actively sorted out, so you don't need to bother with it. You also have full phonetic transcriptions there.

HedvigS commented 1 year ago

@LinguList I thought lexibank/ABVDoceanic was just a subset of lexibank/abvd.

LinguList commented 1 year ago

Yes, so you need all of ABVD for your analysis?

HedvigS commented 1 year ago

Yes, for what I want I would like all of ABVD available.

Is the cognacy information for the same concept, form and language different in lexibank/abvd and lexibank/abvdoceanic?
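One way to answer this empirically is to line up the two datasets on a shared key and compare the code sets. The sketch below is only an illustration: it assumes each dataset has already been loaded into a dictionary keyed by (language, concept, form) with the raw cognacy string as the value, and that both datasets use the same code labels, which may not be the case. All names and the toy entries are hypothetical and are not taken from either dataset.

```python
def normalise(raw):
    """Turn a raw cognacy string into an order-insensitive set of codes."""
    return frozenset(code.strip() for code in raw.split(",") if code.strip())


def cognacy_differences(abvd, oceanic):
    """Return entries present in both datasets whose code sets differ.

    Both arguments map (language, concept, form) -> raw cognacy string.
    The result maps each differing key to (abvd value, oceanic value).
    """
    shared = abvd.keys() & oceanic.keys()
    return {
        key: (abvd[key], oceanic[key])
        for key in sorted(shared)
        if normalise(abvd[key]) != normalise(oceanic[key])
    }


# Invented toy entries, for illustration only.
abvd = {
    ("lang_a", "water", "form_1"): "1,2",
    ("lang_b", "water", "form_2"): "1",
}
oceanic = {
    ("lang_a", "water", "form_1"): "1",
    ("lang_b", "water", "form_2"): "1",
}
print(cognacy_differences(abvd, oceanic))
```

Entries that appear in only one dataset are ignored here, so the comparison says nothing about coverage.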

maryewal commented 1 year ago

> I'd be very grateful for general advice on this, especially if what should be done is select the first one, select a random one or have customised methods depending on whether it's sub-cognacy or partial cognacy (which may be detectable by patterns of cognacy values). If there is a chance to include this in the meta-information in the CLDF-release, that would be awesome.

There isn't a catch-all rule for this (except that it is never choosing at random!). Refinements to coding have to be addressed group-by-group and the expert linguists are consulted.

maryewal commented 1 year ago

> Yes, for what I want I would like all of ABVD available.

Parts of ABVD have not yet been cognate-coded extensively by experts so it would be good to discuss this - what groups you'd like to use and the aim of your project, etc.

SimonGreenhill commented 1 year ago

Can we do this via email and include me? This lexibank repository is a one-way dump of ABVD.

HedvigS commented 1 year ago

I can do this over email too, yes no worries Simon.

I tend to leave comments in github issues because that's what I assume is easiest for project maintainers. You can tag 'em, tag people, attach to code etc. I prefer that personally, I find doing this more difficult over email, but I can switch to that. I understand we don't all have the same preferences of course :)