IndianaTones opened 6 years ago
Which algorithm detects the BIRD thing? If it's LexStat, this would surprise me... I'd just say they aren't cognate.
No, I agree, they can't be cognate, but why does the algorithm think they are? There are many equally crazy (to me) examples like this - surely it detects things I cannot see...
It's important to know which method produces this strange output. LexStat agrees with your analysis on 99 % of the true positives, which means it is highly unlikely to fail on cases like the one you mention. But also: which code and which threshold did you use?
Yes, I re-did everything in LexStat, it's very strict but no weird issues.
The compare script is also excellent @LinguList, and I do see that my judgments were too loose. I can put this file into Edictor and see the cognacy judgments side by side; I think this should really be an integral part of the workflow.

Dogon-Dataset_lexstat.txt
Dogon-Dataset_SCA.txt

Yet my question now is (and I know you are working on it): some of the automatically detected cognates seem crazy to me, and yet the final result (as viewed in SplitsTree) remains about the same. So, before I go and change my own judgments, I would like to know what signal the algorithm is seeing. For instance, how does it detect a match for 'BIRD' between [nǐ:m] in Bankan_Tey and [kɔ̀nɔ́] in Bambara, AutoCogId 283?! Not only are these supposedly unrelated languages, I can't see how these two forms correlate. Is there a way of teasing apart the script to see what it is doing in the background?
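To get an intuition for what the algorithm "sees", it can help to reproduce the general mechanism by hand: LexStat-style methods first map IPA segments to coarse sound classes and then align the class strings, so two short words can pick up chance similarity at the class level even when the surface forms look nothing alike. The sketch below is a toy illustration only, not LingPy's actual code: the class mapping and the plain Needleman-Wunsch scorer are my own simplifications (real LexStat derives language-pair-specific scores from a permutation test), and the segmentations of [nǐ:m] and [kɔ̀nɔ́] are assumptions for the example.

```python
# Toy sketch of a sound-class comparison, NOT LingPy's implementation.
# Hypothetical coarse class mapping, loosely inspired by SCA classes:
SOUND_CLASSES = {
    'n': 'N', 'm': 'M', 'k': 'K', 'b': 'P',
    'i': 'I', 'o': 'O', 'a': 'A', 'e': 'E', 'u': 'U',
}

def to_classes(segments):
    """Map IPA segments to class symbols, ignoring tone and length
    marks for this sketch (only the first character is looked up)."""
    return [SOUND_CLASSES.get(s[0], '?') for s in segments]

def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Plain global alignment score over two class-symbol lists."""
    rows, cols = len(a) + 1, len(b) + 1
    dp = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dp[i][0] = dp[i - 1][0] + gap
    for j in range(1, cols):
        dp[0][j] = dp[0][j - 1] + gap
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            dp[i][j] = max(dp[i - 1][j - 1] + s,
                           dp[i - 1][j] + gap,
                           dp[i][j - 1] + gap)
    return dp[rows - 1][cols - 1]

# Assumed segmentations of the two BIRD forms:
bankan_tey = to_classes(['n', 'iː', 'm'])       # [nǐ:m]
bambara = to_classes(['k', 'ɔ', 'n', 'ɔ'])      # [kɔ̀nɔ́]
print(bankan_tey, bambara)
print(needleman_wunsch(bankan_tey, bambara))
```

Printing the class strings next to the alignment score is the quickest way to see which segments the scorer treats as equivalent; with LingPy itself, running the analysis at a lower level and inspecting the pairwise alignments for a given cognate ID serves the same purpose.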