mkabbasi / cleartk

Automatically exported from code.google.com/p/cleartk

handle unknown words in cosine similarity function #416

Closed. GoogleCodeExporter closed this issue 8 years ago

GoogleCodeExporter commented 8 years ago
In vector representations, unknown words are sometimes modeled by mapping all 
low-frequency words to a placeholder string like "unk" during training. Right now 
the cosine similarity function handles unknown words by always returning a 
similarity of 0. If the map passed in contains an "unk" entry, that entry's vector 
should be used whenever a word passed in is not in the map.
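
A rough sketch of the requested behavior, for illustration only. This is not the ClearTK code from the closing revision: the class name `UnkCosineSketch`, the `lookup` and `similarity` methods, and the `Map<String, double[]>` representation are all assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of ClearTK: cosine similarity over a
// word -> vector map with an optional "unk" fallback entry.
class UnkCosineSketch {
    // Assumed placeholder key; the issue only says 'a string like "unk"'.
    static final String UNK = "unk";

    // Returns the word's vector, or the "unk" vector if the word is missing.
    // Returns null when the word is missing and the map has no "unk" entry.
    static double[] lookup(Map<String, double[]> vectors, String word) {
        double[] v = vectors.get(word);
        return (v != null) ? v : vectors.get(UNK);
    }

    // Cosine similarity between the vectors of two words.
    // Assumes all vectors in the map have the same length.
    static double similarity(Map<String, double[]> vectors, String w1, String w2) {
        double[] a = lookup(vectors, w1);
        double[] b = lookup(vectors, w2);
        if (a == null || b == null) {
            return 0.0; // no "unk" entry: keep the old behavior of returning 0
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // avoid dividing by zero for all-zero vectors
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

With a map containing "cat" -> [1, 0] and "unk" -> [1, 1], `similarity(map, "cat", "zebra")` compares the "cat" vector against the "unk" vector instead of returning 0. With no "unk" entry in the map, the same call still returns 0, so existing callers see no change.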

Original issue reported on code.google.com by tim.mil...@gmail.com on 3 Feb 2015 at 10:22

GoogleCodeExporter commented 8 years ago
This issue was closed by revision 25c9287cfafd.

Original comment by tim.mil...@gmail.com on 3 Feb 2015 at 10:24