Open mshahriarinia opened 11 years ago
Computing word co-occurrence is something we could do. And we could add it as part of our training set.
c(word1, word2)
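As a sketch of what counting c(word1, word2) could look like, here is a minimal sliding-window co-occurrence counter. The `window` size and the toy corpus are assumptions for illustration, not anything fixed in this thread.

```python
from collections import Counter

def cooccurrence_counts(sentences, window=5):
    """Count c(word1, word2) for word pairs appearing within `window`
    tokens of each other in the same sentence.

    `sentences` is a list of token lists; pairs are stored in sorted
    order so c(a, b) == c(b, a).
    """
    counts = Counter()
    for tokens in sentences:
        for i, w1 in enumerate(tokens):
            # only look ahead, so each unordered pair is counted once
            for w2 in tokens[i + 1:i + window]:
                counts[tuple(sorted((w1, w2)))] += 1
    return counts

# tiny illustrative corpus (hypothetical)
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
c = cooccurrence_counts(corpus)
```

These raw counts could then be added to the training set as-is, or normalized into co-occurrence probabilities later.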
It is typical to use things like the links you listed. Don't let these word counts stop you from proceeding. There will be some inaccuracy to begin with; we just need to record these assumptions and continue.
We need a measure of word-meaning frequency/counts for proper assignment to entities or slots. For example, words with multiple meanings under the same POS have a certain probability of being used with each meaning, and we need to capture this to rank our matches. WordNet's
lemma.tagcount
has something like this, but it is incomplete and full of zeros, so we need another measure. Refer to: http://stackoverflow.com/questions/12943193/how-to-measure-wordnet-term-frequency-values-and-cooccurence-value-programatical http://stackoverflow.com/questions/5928704/how-do-i-find-the-frequency-count-of-a-word-in-english-using-wordnet
We might be able to use:
http://corpus2.byu.edu/coca/100k_data.asp?query=1
http://americannationalcorpus.org/OANC/index.html#download
http://googleresearch.blogspot.com/2006/08/all-our-n-gram-are-belong-to-you.html