A Pair that should be a cognate but are not: prefix problem, I think.

sillsdev / cog

Cog is a tool for comparing languages using lexicostatistics and comparative linguistics techniques.

http://sillsdev.github.io/cog/

MIT License


A Pair that should be a cognate but are not: prefix problem, I think. #54

Steve-Miller closed 8 years ago

Steve-Miller commented 8 years ago

See the screen shot:

[Screenshot: cog-non-cognate_cognate]

I have here a word that is identical to another in the cognate list, but it is nevertheless listed as a non-cognate. I think the issue is that for Uni (Ramo), Cog still thinks "ʔa" is a prefix.

I did run the stemmer earlier in the project, and it came up with "ʔa" as a prefix. I disagreed, and have since removed "ʔa" as a prefix. However, Cog is persistent and has kept the prefix in the data, which I don't think it should. In the .cogx file, I find this line:

    <Word meaning="lap">ˈʔa.wo.to</Word>

Then further down, I find this line:

    <Word meaning="lap" stemIndex="3" stemLength="6">ˈʔa.wo.to</Word>

As far as I can tell, this last line became incorrect once I removed the prefix.
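To make the effect of those two attributes concrete, here is a minimal Python sketch of how they would carve up the word. It assumes that `stemIndex` is a zero-based character offset and `stemLength` a character count, which the thread does not confirm. The helper name `split_stem` is made up for illustration and is not part of Cog.

```python
import xml.etree.ElementTree as ET


def split_stem(word_xml: str) -> tuple[str, str, str]:
    """Split a <Word> element's text into (prefix, stem, suffix).

    Assumption: stemIndex is a zero-based character offset and stemLength
    is a character count. With no attributes, the whole word is the stem.
    """
    elem = ET.fromstring(word_xml)
    text = elem.text or ""
    start = int(elem.get("stemIndex", 0))
    length = int(elem.get("stemLength", len(text) - start))
    end = start + length
    return text[:start], text[start:end], text[end:]


# The two lines quoted from the .cogx file above.
plain = '<Word meaning="lap">ˈʔa.wo.to</Word>'
stemmed = '<Word meaning="lap" stemIndex="3" stemLength="6">ˈʔa.wo.to</Word>'

print(split_stem(plain))
print(split_stem(stemmed))
```

Under that assumption, the second line still treats "ˈʔa" as a prefix and only ".wo.to" as the stem, which would explain why the two identical-looking words are not compared as the same form.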

This is the same .cogx that has been through a couple of major crashes, so I don't know if that has something to do with it or not.

ddaspit commented 8 years ago

If you make any changes to the affixes for a variety, you will need to run the stemmer again. You will want to use the "Only use already specified affixes" option when you run it again.

Steve-Miller commented 8 years ago

It makes sense that I would have to run the stemmer again, once you said it. It's just that something like a week passed between working on affixes and looking at the multiple word alignment, and I didn't make the connection between the two.

Running the stemmer again seems to have cured the problem I was looking at. Suggestion: In the Input / Varieties / Affixes area, light up an option to rerun the stemmer whenever an affix is added, edited, or deleted.

Steve-Miller commented 8 years ago

So I reran the stemmer and went back to Multiple Word Alignment. I looked up the gloss 'lap', the same as I reported above. Nothing there. Just a blank screen. That was a shock. I wanted to yell at Cog, "What did you do to my data?" Another gloss offered up a blank screen, too.

So I took a deep breath and tried to think through what might have caused that. I thought, hmm, I wonder what would happen if I ran "Compare all variety pairs" again? I did, and that appears to have cured my blank screen.

I imagine there are various ways to avoid more linguist high blood pressure. Maybe a yes/no dialog could come up whenever the stemmer finishes: "Would you like to compare all variety pairs again?"

Maybe another thing to do is avoid a blank screen on Multiple Word Alignment. Maybe something like: "If your screen is blank, you might want to compare all variety pairs."

I dunno. You know the software better than I do. I'm just trying to avoid having someone else go through the same surprise I did.

Steve-Miller commented 8 years ago

One more note: I was somewhat surprised to see the Bouni (Sumo) word come up as a cognate the first time. The second time, it didn't stay a cognate. I think this makes sense, but I bring it up in case it needs to be checked.

ddaspit commented 8 years ago

I added text to the multiple word alignment view to prompt the user to compare all variety pairs if a comparison is needed.
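The dependency that came up in this thread (affix edits invalidate the stems, and a stemmer run invalidates the comparison) can be pictured as a pair of staleness flags. The following is a hypothetical Python sketch of that idea only. It is not Cog's actual implementation, and the class, method names, and message strings are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Project:
    """Hypothetical model of the affixes -> stemmer -> comparison chain."""

    stems_stale: bool = False
    comparison_stale: bool = False

    def edit_affixes(self) -> None:
        # Adding, editing, or deleting an affix invalidates the stored stems
        # and anything computed from them.
        self.stems_stale = True
        self.comparison_stale = True

    def run_stemmer(self) -> None:
        # Stems are fresh again, but the comparison was built on the old ones.
        self.stems_stale = False
        self.comparison_stale = True

    def compare_all_variety_pairs(self) -> None:
        self.comparison_stale = False

    def alignment_view_message(self) -> str:
        # Text to show in place of a blank alignment view (wording invented).
        if self.stems_stale:
            return "Affixes changed: rerun the stemmer."
        if self.comparison_stale:
            return "Compare all variety pairs to see alignments."
        return ""


project = Project()
project.edit_affixes()
print(project.alignment_view_message())
project.run_stemmer()
print(project.alignment_view_message())
project.compare_all_variety_pairs()
print(repr(project.alignment_view_message()))
```

The point of the sketch is that each step marks what it makes out of date, so a view can explain why it is empty instead of showing a blank screen.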