saharghannay opened this issue 4 years ago (status: Open)
There's no explicit support for any particular 'fine-tuning' operation. The .intersect_word2vec_format() method was an experimental offering, once available on Word2Vec (and thus inherited by some other classes), which a prior refactoring confined to Word2Vec only. Whether it still has any use, or could be adapted to other classes, is something a user would have to decide for themselves after looking at the source code.
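For anyone weighing whether the idea is worth adapting, here is a rough sketch of what an "intersect" step does conceptually: words already in the model's vocabulary adopt the pretrained vectors, words missing from the vocabulary are ignored, and each merged row gets a lock factor (0.0 freezes it during later training, 1.0 leaves it fully trainable). This is plain NumPy on toy data, not gensim's implementation; the function intersect_vectors and all of its parameters are hypothetical names made up for illustration. It also ignores the subword n-gram vectors that FastText trains, which a plain word2vec-format file does not contain.

```python
import numpy as np


def intersect_vectors(word_to_index, vectors, lock_factors, pretrained, lockf=0.0):
    """Overwrite rows of `vectors` for words shared with `pretrained`.

    Hypothetical helper for illustration only; not part of gensim.
    Returns the number of words whose vectors were replaced.
    """
    merged = 0
    for word, vec in pretrained.items():
        idx = word_to_index.get(word)
        if idx is None:
            # Words absent from the model's vocabulary are skipped, never added.
            continue
        vec = np.asarray(vec, dtype=vectors.dtype)
        if vec.shape != vectors[idx].shape:
            raise ValueError(f"dimension mismatch for {word!r}: {vec.shape}")
        vectors[idx] = vec          # adopt the pretrained weights
        lock_factors[idx] = lockf   # 0.0 = frozen, 1.0 = fully trainable
        merged += 1
    return merged


# Toy model state: three known words, 2-dimensional vectors, all trainable.
word_to_index = {"cat": 0, "dog": 1, "fish": 2}
vectors = np.zeros((3, 2), dtype=np.float32)
lock_factors = np.ones(3, dtype=np.float32)

# Toy "pretrained" vectors; "bird" is not in the model's vocabulary.
pretrained = {"cat": [1.0, 2.0], "bird": [9.0, 9.0], "fish": [3.0, 4.0]}

merged = intersect_vectors(word_to_index, vectors, lock_factors, pretrained, lockf=0.0)
```

Passing lockf=1.0, as in the code from the question, would instead leave the merged rows open to further updates during training.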
If you were pursuing a specific, well-documented fine-tuning approach, and had some specific feature need to support that, that could be a legitimate feature request. (It's unlikely the original .intersect_word2vec_format() would be just right for any fine-tuning approach.) But we'd need a clearer, implementable description of what was needed.
Problem description
I would like to fine-tune a fastText embedding model, pretrained on wiki data, on new in-domain data. I was using this code:
Steps/code/corpus to reproduce
model = KeyedVectors.load_word2vec_format(args.pretrained_model, binary=False)
model_Fasttext_cbow = FastText(size=args.vector_size, window=args.window, min_count=args.min_count, workers=8, sg=0)
model_Fasttext_cbow.build_vocab(sentences)
total_examples = model_Fasttext_cbow.corpus_count
model_Fasttext_cbow.build_vocab([list(model.wv.vocab.keys())], update=True)
model_Fasttext_cbow.intersect_word2vec_format(args.pretrained_model, binary=False, lockf=1.0)
model_Fasttext_cbow.train(sentences, total_examples=total_examples, epochs=5)
But I got this error: AttributeError: 'FastText' object has no attribute 'intersect_word2vec_format'
How can I fix this problem?
Versions
Please provide the output of: