Closed: HossamAmer12 closed this issue 11 months ago.
Because `gensim.models.fasttext.FastTextKeyedVectors.load` works only with full models, not with compressed ones. To load a compressed model, please use `compress_fasttext.models.CompressedFastTextKeyedVectors.load` instead.
Thanks, @avidale. Can you please share the steps to reproduce this model?
ft_cc.en.300_freqprune_50K_5K_pq_100.bin
A couple more questions, please:
1. What should I do if I want to get the full embedding matrix from the CompressedFTKeys?
2. Is there any difference between your compressed implementation and the normal fastText implementation (no gensim), in terms of getting the word vector or anything else? [Referring to this link]
> 1- What to do if I want to get the full embedding matrix from the CompressedFTKeys?
I am not sure that I understand what a "full embedding matrix" is. There is a matrix of embeddings of individual n-grams, but each of them is meaningless on its own; it carries meaning only as part of a word. There can also be a matrix of word embeddings, but it is necessarily incomplete, because the number of possible words is infinite, and for most of them the embeddings are computed on the fly.
And anyway, this question doesn't seem to be relevant to the issue topic.
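As a rough illustration of what "completed on the fly" means (this is a toy sketch, not the library's API; the function names and the random n-gram table are invented here): fastText builds a vector for any word, including out-of-vocabulary ones, by averaging the vectors of its character n-grams.

```python
import numpy as np

def char_ngrams(word, minn=3, maxn=6):
    # fastText wraps each word in angle brackets before extracting n-grams,
    # so "cat" becomes "<cat>" and yields "<ca", "cat", "at>", "<cat", ...
    w = f"<{word}>"
    return [w[i:i + n] for n in range(minn, maxn + 1) for i in range(len(w) - n + 1)]

# toy n-gram embedding table; a real model stores a (hashed) n-gram matrix instead
rng = np.random.default_rng(0)
DIM = 4
ngram_table = {}

def ngram_vector(ngram):
    if ngram not in ngram_table:
        ngram_table[ngram] = rng.normal(size=DIM)
    return ngram_table[ngram]

def word_vector(word):
    # a word vector is computed on the fly as the mean of its n-gram vectors,
    # which is why no finite "full matrix" of word embeddings can exist
    vectors = [ngram_vector(ng) for ng in char_ngrams(word)]
    return np.mean(vectors, axis=0)

print(char_ngrams("cat"))        # 6 n-grams extracted from "<cat>"
print(word_vector("cat").shape)  # (4,)
```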
> 2- Any difference between your Compressed implementation and normal fast text implementation (no gensim)? In terms of getting the word vector or anything else?
I have no idea how the Facebook fasttext implementation (i.e. the one with "no gensim") works, and I don't guarantee any compatibility with it.
```python
model_path = "./models/en/ft_cc.en.300_freqprune_50K_5K_pq_100.bin"
big_model = gensim.models.fasttext.FastTextKeyedVectors.load(model_path)
small_model = compress_fasttext.prune_ft_freq(big_model, pq=True)
```
Why does compress-fasttext not work on the already-uploaded models? It gives this error: