stanfordnlp / GloVe

Software in C and data files for the popular GloVe model for distributed word representations, a.k.a. word vectors or embeddings
Apache License 2.0

Initialize model with pretrained word vectors #62

Open Threynaud opened 7 years ago

Threynaud commented 7 years ago

Hi!

I was wondering whether it is possible to initialize a GloVe model with one of the pre-trained embeddings before training it on a dataset of my own. I don't see how it can be done.

Thank you!

ghost commented 7 years ago

Yeah, this is not going to work well due to the optimization setup. But what you can do is train GloVe vectors on your own corpus and then concatenate those with the pretrained GloVe vectors for use in your end application.
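A minimal sketch of the concatenation suggested above, assuming both vector sets are in the plain-text format (one word per line, followed by space-separated floats). The function names `parse_glove_lines` and `concatenate` are hypothetical helpers, not part of the GloVe repository, and zero-padding words that appear in only one set is an assumption of this sketch rather than a recommendation from the thread.

```python
from typing import Dict, Iterable, List

Vectors = Dict[str, List[float]]


def parse_glove_lines(lines: Iterable[str]) -> Vectors:
    """Parse lines of the form 'word v1 v2 ... vN' into a dict."""
    vectors: Vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def concatenate(pretrained: Vectors, custom: Vectors) -> Vectors:
    """Join the two vectors of each word end to end.

    Assumption: a word missing from one set gets zeros for that half.
    """
    dim_pretrained = len(next(iter(pretrained.values())))
    dim_custom = len(next(iter(custom.values())))
    combined: Vectors = {}
    for word in pretrained.keys() | custom.keys():
        left = pretrained.get(word, [0.0] * dim_pretrained)
        right = custom.get(word, [0.0] * dim_custom)
        combined[word] = left + right
    return combined


# Tiny in-memory example; with real files, pass an open file object
# to parse_glove_lines instead of a list of strings.
pretrained = parse_glove_lines(["cat 1.0 2.0", "dog 3.0 4.0"])
custom = parse_glove_lines(["cat 0.5", "gene 0.25"])
merged = concatenate(pretrained, custom)
```

The two halves of each merged vector still live in separate vector spaces, so they are not directly comparable with each other; concatenation only hands the downstream model both sets of features.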

Threynaud commented 7 years ago

Thanks for that quick reply! I will try that out!

niuhuakang commented 6 years ago

Hi, I would like to know how to "concatenate those with the pretrained GloVe vectors for use in your end application". Are the newly trained vectors comparable to the pretrained ones? @Russell91

JesseTG commented 6 years ago

But then wouldn't you lose dependencies between words in the old model and words in the new one?

sashaostr commented 5 years ago

Same questions here. Does anyone have answers, please?

stevenwernercs commented 5 years ago

Am I correct in thinking that GloVe (Global Vectors) is not meant to be extended after training, since it is based on the overall word co-occurrence statistics of a single corpus, known only at initial training time?

alexandrefelipemuller commented 4 years ago

Hi, I couldn't find any solution in this thread or in #67 and #77. Does anyone have a source or an example?

ghost commented 4 years ago

Bummer
