Open LinguList opened 6 years ago
I just noticed that computing self.frequencies in Concepticon takes an extremely long time, because it reads every list, and this is hampering our automatic lookup. I would suggest either storing the frequencies explicitly in a text file, maybe in pyconcepticon/data/, and recomputing them once in a while, or dropping them completely (although frequencies are useful).
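To make the first option concrete, here is a minimal sketch of a file-backed cache, assuming the frequencies can be represented as a mapping from concept set IDs to counts. The function name `load_or_compute_frequencies`, the `compute` callback and the JSON layout are hypothetical and not part of pyconcepticon; the callback stands in for whatever slow code currently reads every list.

```python
import json
from pathlib import Path


def load_or_compute_frequencies(cache_path, compute, refresh=False):
    """Return frequencies from a JSON text file, recomputing only when needed.

    `compute` is a hypothetical zero-argument callable that performs the
    slow pass over all lists and returns a mapping {concept_id: count}.
    """
    cache_path = Path(cache_path)
    if cache_path.exists() and not refresh:
        # Fast path: read the stored counts instead of touching every list.
        with cache_path.open(encoding="utf-8") as handle:
            return json.load(handle)

    # Slow path: recompute, normalising keys to str so that the result has
    # the same shape as what json.load returns on later calls.
    frequencies = {str(key): int(value) for key, value in compute().items()}
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    with cache_path.open("w", encoding="utf-8") as handle:
        json.dump(frequencies, handle, indent=1, sort_keys=True)
    return frequencies
```

Calling it with `refresh=True` would be the "recompute once in a while" step, for example as part of a release routine, while normal lookups only pay the cost of reading one small file.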
We can argue that the pysem library now offers a more consistent mapping. We would only need to add command-line functionality.
The following is unexpected:
Our rule says: if there is no pos-information, penalize this, but top score is only obtained upon identity:
The scores need better logic, and we should have a convincing scoring scheme...