orenmel / lexsub

Apache License 2.0

No context in vocabulary when setting a positive bow_size #5

Open MichaelZhouwang opened 5 years ago

MichaelZhouwang commented 5 years ago

Hi, when I set a positive window_size (bow_size), the contexts are extracted by the method 'getneighbors' and the words returned are plain words, without a context prefix such as 'acompI'. This results in no contexts being found in the vocabulary, because the entries in the context vocabulary are all prefixed. Is the model only built for syntactic dependency contexts? Or could I simply average the context embeddings over all syntactic prefixes to obtain an 'average context embedding'? Looking forward to your reply. Thanks!

orenmel commented 5 years ago

When using bow>0, the code expects standard word embeddings, i.e. without syntactic prefixes. You could try averaging over the prefixes, or simply train a standard word2vec model and dump the target and context embeddings into two separate files. I think you can find instructions on how to do that on the web.
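Both suggestions can be sketched in a few lines. The snippet below is a minimal illustration, not code from this repo: it assumes embeddings are held as plain Python dicts, that prefixed context keys look like 'acomp_dog' (the dependency labels here are made-up examples), and it writes files in the standard word2vec text format (a 'vocab_size dim' header, then one 'word v1 v2 ...' line per word).

```python
# Hypothetical sketch: averaging prefixed context vectors per base word,
# and dumping target/context embeddings into two separate text files.
from collections import defaultdict

def average_prefixed_contexts(context_vecs):
    """Average all prefixed context vectors that share a base word,
    e.g. 'acomp_dog' and 'nsubj_dog' -> one averaged vector for 'dog'."""
    sums = {}
    counts = defaultdict(int)
    for key, vec in context_vecs.items():
        base = key.split("_", 1)[1] if "_" in key else key
        if base not in sums:
            sums[base] = list(vec)
        else:
            sums[base] = [a + b for a, b in zip(sums[base], vec)]
        counts[base] += 1
    return {w: [x / counts[w] for x in s] for w, s in sums.items()}

def dump_word2vec_text(path, vectors):
    """Write embeddings in the plain word2vec text format:
    a 'vocab_size dim' header, then one 'word v1 v2 ...' line per word."""
    dim = len(next(iter(vectors.values())))
    with open(path, "w") as f:
        f.write(f"{len(vectors)} {dim}\n")
        for word, vec in vectors.items():
            f.write(word + " " + " ".join(f"{x:.6f}" for x in vec) + "\n")

# Toy prefixed context embeddings (dimension 2 for brevity).
ctx = {"acomp_dog": [1.0, 0.0], "nsubj_dog": [0.0, 1.0], "dobj_cat": [2.0, 2.0]}
avg = average_prefixed_contexts(ctx)
print(avg["dog"])  # -> [0.5, 0.5]

# Dump target and (averaged) context embeddings into two separate files.
targets = {"dog": [0.5, 0.5], "cat": [1.5, 1.5]}
dump_word2vec_text("targets.txt", targets)
dump_word2vec_text("contexts.txt", avg)
```

With a real trained word2vec model you would substitute the actual input (target) and output (context) matrices for the toy dicts; the two-file layout then matches what the bow>0 code path expects to load.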

MichaelZhouwang commented 5 years ago

> When using bow>0, the code expects to get standard word embeddings, i.e. without syntactic prefixes. You could try to do averaging over prefixes or simply learn a standard word2vec model and dump the target and context embeddings into two separate files. I think you can find instructions on how to do that on the web.

Thank you very much!