lingpy / pybor

A Python library for borrowing detection based on lexical language models

Apache License 2.0

Fake words corrected/refined and cross-validation upgraded. #26

Closed fractaldragonflies closed 4 years ago

fractaldragonflies commented 4 years ago

Refactored the fake words code: `brate` now uses the provided interval without change, the default is Tokens, and I verified that borrowed words are dropped from the table before fake words are added.
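As a rough illustration of the flow described above (drop borrowed rows first, then add fake words at a fixed interval), here is a minimal sketch. It is not pybor's actual code: the function name `add_fake_words`, the `(word_id, tokens, borrowed)` row layout, and the reading of `brate` as "one fake word after every `brate` native words" are all assumptions made for the example, as is drawing fake segments uniformly at random.

```python
import random


def add_fake_words(table, brate, seed=0):
    """Hypothetical helper: drop borrowed rows, then add fake words.

    table: list of (word_id, tokens, borrowed) rows, tokens being a list
    of segments. brate is assumed to be a positive integer interval.
    """
    rng = random.Random(seed)

    # Drop rows already marked as borrowed so only native words remain.
    native = [row for row in table if row[2] == 0]

    # Segment inventory of the native words; fake words are drawn from it.
    segments = sorted({seg for _, tokens, _ in native for seg in tokens})

    result = []
    for i, (word_id, tokens, _) in enumerate(native, start=1):
        result.append((word_id, tokens, 0))
        # After every `brate` native words, add one fake word of the same
        # length as the current word, flagged as borrowed.
        if i % brate == 0:
            fake = [rng.choice(segments) for _ in range(len(tokens))]
            result.append((f"fake-{i}", fake, 1))
    return result


if __name__ == "__main__":
    demo = [
        ("w1", ["k", "a", "t"], 0),
        ("w2", ["t", "a", "k", "o"], 1),  # real borrowing, gets dropped
        ("w3", ["m", "a", "n", "o"], 0),
        ("w4", ["p", "i", "t"], 0),
    ]
    for row in add_fake_words(demo, brate=3):
        print(row)
```

Filtering before seeding matters because otherwise real borrowings would be mixed in with the fake ones and the effective borrowing rate would no longer match the requested interval.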

Cross-validation now handles both k-fold and holdout-n methods. I haven't demonstrated holdout-1 across the entire WOLD yet, but I have for individual languages.
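For readers unfamiliar with the two schemes, the splitting logic can be sketched as below. This is a generic illustration, not the code in this PR: the function names `k_fold_splits` and `holdout_n_splits` are hypothetical, and "holdout-n" is assumed to mean holding out `n` items per split, so that `n=1` is leave-one-out.

```python
import random


def k_fold_splits(items, k, seed=0):
    """Yield (train, test) pairs for k-fold cross-validation."""
    items = list(items)
    random.Random(seed).shuffle(items)
    # Deal the shuffled items into k folds of near-equal size.
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test


def holdout_n_splits(items, n, seed=0):
    """Yield (train, test) pairs, holding out n items at a time.

    Assumed reading of "holdout-n"; with n=1 this is leave-one-out.
    The last split may hold out fewer than n items.
    """
    items = list(items)
    random.Random(seed).shuffle(items)
    for start in range(0, len(items), n):
        test = items[start:start + n]
        train = items[:start] + items[start + n:]
        yield train, test


if __name__ == "__main__":
    words = [f"word{i}" for i in range(10)]
    for train, test in k_fold_splits(words, k=5):
        print(len(train), len(test))
    print(sum(1 for _ in holdout_n_splits(words, n=1)))
```

Under this reading, holdout-1 trains one model per word, which would explain why a run across the whole of WOLD is much more expensive than a run on a single language.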

Removed earlier test reports and added most recent test reports.

LinguList commented 4 years ago

Nice. I don't have time to look into the details now, but we discussed my errors in the fake borrowings, so please just merge, and we'll all look at the code again before submitting it! Thanks a lot! And make sure you also have a weekend (ours started 12 hours ago ;)

fractaldragonflies commented 4 years ago

Adding the fake words table to paper.

Thinking about incorporating the finding on the usefulness of bag of words into the neural model. I'll probably drop Attention until I can see a benefit from it (I have some ideas). But the idea of adding a bag of words to the neural model... this I will look into.

tresoldi commented 4 years ago

Results are very high, good. :)