digling / burmish

LingPy plugin for handling a specific dataset
GNU General Public License v2.0

cognacy in Nishi's database #91

Closed LinguList closed 7 years ago

LinguList commented 7 years ago

[screenshot: screenshot_2017-02-16_16-42-39]

This is the cognate set "be cooked". Apparently, this is an error, since there is no evidence of cognacy between Achang and Xiandao and the rest (no further regularity I can see).

But Nishi has a strange annotation practice, marking things in bold font for no apparent reason. Is this related to cognate judgments?

And how do we treat these obvious errors in annotation? Do we keep them as they are, or do we additionally correct them?

If we just ignore them, please close this issue. But if there is an obvious reason for the bold font, please indicate in the Nishi readme (which I already updated) that bold font occurs, but that its purpose is unclear.

nh36 commented 7 years ago

The use of bold is already explained in the readme file, in the section "Source Information". Effectively, these are irregular correspondences that he has no way to explain:

"The others may be regarded as misprints or errors (in bold) in original data."

On Thu, Feb 16, 2017 at 3:45 PM, Johann-Mattis List notifications@github.com wrote:

Assigned #91 to @nh36.


LinguList commented 7 years ago

But isn't it "abenteuerlich" (far-fetched) to say that the words are cognate? I mean, this is worse than edit distance, as there is NO evidence in the data for such a pattern.

On the other hand, it's good for the QPA, since it means it will be easy to show the inconsistency (unless there are more cognate sets like this, which I doubt).

nh36 commented 7 years ago

It is just a guess, but I think what he had in mind was that we have two cognate sets here. Since he only had one concept row, he formatted the forms as if all were cognate and put the irregular ones in bold.

LinguList commented 7 years ago

We'll probably find out when looking into the patterns and comparing them with our other data. But for this, we need the original concepts, as mentioned in #90. If we have those for Burling as well as for Nishi, we can extract all overlaps with our versions of Huang, Nishi, Mann, and Burling, and see what's going on.

If you find time, it would be good if you checked the data we already have in the edictor apps for Mann and Nishi. Since I consider the ortho-profiles done, it is important to check them independently and see how well the conversions worked. I'll prepare the initial concepticon mapping for Burling tomorrow, and for Nishi immediately, once I have the original concepts.
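As a first step, the overlap extraction described above amounts to intersecting the concept inventories of the four sources. A minimal sketch in Python, using invented toy data (the real concept lists would come from the wordlist files once the original concepts are in place):

```python
# Toy concept inventories per source (invented for illustration only).
sources = {
    "huang":   {"be cooked", "dog", "fire", "water"},
    "nishi":   {"be cooked", "dog", "fire"},
    "mann":    {"dog", "fire", "water"},
    "burling": {"dog", "fire"},
}

# Concepts attested in every source: intersect all concept sets.
# Cognate judgments can then be compared side by side on this overlap.
shared = set.intersection(*sources.values())
print(sorted(shared))
```

With real data, the same intersection over Concepticon-mapped concept IDs (rather than raw glosses) would make the overlap robust against differing gloss spellings across sources.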