bjerva / cwi18

Repository for https://www.aclweb.org/anthology/W18-0518/
Apache License 2.0

Similarity between target and sentence #5

Closed jbingel closed 6 years ago

jbingel commented 6 years ago

On first inspection of the data, one intuition was that the similarity between a target word/phrase and the rest of its containing sentence could be somewhat predictive, essentially modeling the surprisal of the word in context.
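
A minimal sketch of one way such a feature could be computed, assuming pre-trained word embeddings are available as a plain dict from token to vector. The function names, the mean-pooling of vectors, and the use of cosine similarity are illustrative assumptions, not necessarily what the linked commit does.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def target_context_similarity(tokens, start, end, embeddings):
    """Similarity between the target span tokens[start:end] and the rest
    of the sentence (hypothetical helper).

    Both sides are mean-pooled over the tokens that have an embedding.
    Returns None when either side has no known token.
    """
    target = [embeddings[t] for t in tokens[start:end] if t in embeddings]
    context = [
        embeddings[t]
        for i, t in enumerate(tokens)
        if not (start <= i < end) and t in embeddings
    ]
    if not target or not context:
        return None
    return cosine(mean_vector(target), mean_vector(context))
```

A low value would then suggest that the target is unusual for its context. Returning None for fully out-of-vocabulary targets keeps that case distinguishable from a genuinely low similarity.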

jbingel commented 6 years ago

Check the OOV rate for targets. If it is high, perhaps use BPE embeddings. Text can be converted to BPE units in Python along these lines: https://stackoverflow.com/questions/8870261/how-to-split-text-without-spaces-into-list-of-words
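
The OOV check itself could look roughly like this. It is a sketch that assumes targets are whitespace-tokenized strings and that the embedding vocabulary is lowercased; both the function name and those conventions are assumptions for illustration.

```python
def oov_rate(targets, vocabulary):
    """Fraction of target tokens that are missing from the vocabulary.

    Assumes whitespace tokenization and a lowercased vocabulary.
    Returns 0.0 when there are no tokens at all.
    """
    tokens = [tok for target in targets for tok in target.lower().split()]
    if not tokens:
        return 0.0
    missing = sum(1 for tok in tokens if tok not in vocabulary)
    return missing / len(tokens)
```

Counting over tokens rather than whole targets means a multi-word target with one unknown word only contributes partially to the rate.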

jbingel commented 6 years ago

Implemented this in https://github.com/bjerva/cwi18/commit/fcdb090bdfefe4921f9c16327c4f3e15e4bdb4ca, but it takes forever to run. Perhaps the similarities should be computed offline and loaded from a text file.
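
One possible shape for the offline route, sketched under the assumption that each similarity can be keyed by a string (for example an instance ID) and stored as one tab-separated line. The file format and helper names are hypothetical and not taken from the repository.

```python
import os


def load_cache(path):
    """Read key<TAB>value lines into a dict; empty dict if the file is missing."""
    cache = {}
    if os.path.exists(path):
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                key, value = line.rstrip("\n").split("\t")
                cache[key] = float(value)
    return cache


def save_cache(cache, path):
    """Write the cache as one key<TAB>value line per entry.

    Keys must not contain tabs or newlines. repr() keeps full float precision.
    """
    with open(path, "w", encoding="utf-8") as handle:
        for key, value in cache.items():
            handle.write(f"{key}\t{value!r}\n")


def cached_feature(key, compute, cache):
    """Return cache[key], calling the slow compute(key) only on a miss."""
    if key not in cache:
        cache[key] = compute(key)
    return cache[key]
```

With this layout the expensive pass runs once, writes the file with `save_cache`, and later training runs only call `load_cache`.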