Open loretoparisi opened 6 years ago
Also, which fastText command corresponds to positional weighting?
I don't think this is implemented in the publicly available codebase.
@matanster you are right, it is not, at least not yet, even though the two most recent papers present models that were trained with fastText using position-dependent weights.
Do you know if there are any updates on positional weighting in the fastText codebase?
@omriFdna so far I'm not aware of a version that implements positional weights for CBOW, but I will have a look; maybe someone did...
@loretoparisi any luck?
@omriFdna not yet.
Pre-trained word vectors trained using CBOW with position weights have recently been released for Wikipedia and Common Crawl: https://fasttext.cc/docs/en/crawl-vectors.html
The model parameters were:
| DIM | NGRAM | WS | NEG |
|---|---|---|---|
| 300 | 5 | 5 | 10 |
It would be great to have this as training option as well.
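For reference, the parameters in the table above map onto existing fastText command-line flags, apart from the position weights themselves, which (as noted in this thread) have no public flag. The sketch below only builds the argument list and does not run anything. The helper name `build_cbow_command` and the file names are hypothetical, and reading NGRAM = 5 as `-minn 5 -maxn 5` is my assumption.

```python
from typing import List


def build_cbow_command(input_path: str, output_prefix: str) -> List[str]:
    """Build a plain (non-position-weighted) fastText CBOW training command.

    The values mirror the table: DIM=300, NGRAM=5, WS=5, NEG=10.
    Assumption: NGRAM=5 means character n-grams of exactly length 5.
    """
    return [
        "fasttext", "cbow",
        "-input", input_path,      # training corpus, one text per line
        "-output", output_prefix,  # writes <prefix>.bin and <prefix>.vec
        "-dim", "300",             # vector dimension
        "-minn", "5",              # shortest character n-gram
        "-maxn", "5",              # longest character n-gram
        "-ws", "5",                # context window size
        "-neg", "10",              # number of negative samples
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(build_cbow_command("data.txt", "model")))
```

Running this command would train ordinary CBOW vectors, so the result would not match the released position-weighted models.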
I agree. They also report in their papers that position weights improve performance; I wish it was part of the training options.
+1
I am attempting to add this feature to Gensim, see https://github.com/RaRe-Technologies/gensim/pull/2905.
The work from https://github.com/RaRe-Technologies/gensim/pull/2905 has been accepted as a journal paper that is to appear in J.UCS 28:2. To conduct experiments for the paper, I have produced the high-level PInE library, which uses a fork of Gensim 3.8.3 for model training. Perhaps the PInE library can serve as an inspiration for an implementation in Facebook's fastText and in current Gensim. 💡
In the recent paper "Learning Word Vectors for 157 Languages", the CBOW model is used with position-dependent weights, and the new pre-trained models were produced that way. Is it possible to train an unsupervised CBOW model in this version of fastText using the same approach with position weights?
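To make clear what is being asked for: as I understand the formulation in the fastText papers, position-dependent weighting gives each relative position p in the window its own learned vector d_p, and the context vector is the sum over positions of d_p multiplied elementwise with the word vector at that position. Below is a minimal pure-Python sketch of that step only. It is not fastText code, and the function and variable names are hypothetical.

```python
from typing import Dict, List

Vector = List[float]


def positional_context_vector(
    context: Dict[int, Vector],
    position_weights: Dict[int, Vector],
) -> Vector:
    """Sum the context word vectors, each reweighted elementwise by the
    vector associated with its relative position.

    context:          relative offset (e.g. -1, +1) -> word vector
    position_weights: relative offset -> position vector of the same size
    """
    if not context:
        raise ValueError("context must contain at least one word vector")
    dim = len(next(iter(context.values())))
    total = [0.0] * dim
    for offset, word_vec in context.items():
        pos_vec = position_weights[offset]
        for i in range(dim):
            total[i] += pos_vec[i] * word_vec[i]
    return total


if __name__ == "__main__":
    # Toy 2-dimensional example with a window of one word on each side.
    context = {-1: [1.0, 2.0], 1: [3.0, 4.0]}
    weights = {-1: [0.5, 1.0], 1: [1.0, 0.0]}
    print(positional_context_vector(context, weights))  # [3.5, 2.0]

    # With all-ones position vectors this reduces to the plain CBOW sum.
    ones = {-1: [1.0, 1.0], 1: [1.0, 1.0]}
    print(positional_context_vector(context, ones))  # [4.0, 6.0]
```

In a real implementation the position vectors would be trained jointly with the word vectors, and the sum might be normalized by the number of context words; neither is shown here.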