Open ceaysenur opened 2 years ago
It looks like your file is compressed. Try uncompressing it first (the expected file extension is .vec).
After that, check whether you can run the following script successfully. WordEmbsAug uses gensim to load the pre-trained fastText model, so if you can load your file with this gensim script, you should be able to initialize naw.WordEmbsAug().
from gensim.models import KeyedVectors
KeyedVectors.load_word2vec_format(file_path)
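For the uncompress step, here is a minimal sketch using only the Python standard library. The helper name `decompress_gz` is hypothetical (it is not part of nlpaug or gensim), and the path in the usage comment is the one mentioned in this thread.

```python
import gzip
import shutil
from pathlib import Path


def decompress_gz(src_path, dst_path=None):
    """Decompress a .gz file and return the path of the uncompressed copy.

    Hypothetical helper for illustration; not part of nlpaug or gensim.
    """
    src = Path(src_path)
    # Default target: same name without the trailing ".gz"
    # (e.g. cc.tr.300.vec.gz -> cc.tr.300.vec)
    dst = Path(dst_path) if dst_path is not None else src.with_suffix("")
    # Stream the data so a large embedding file is never held fully in memory
    with gzip.open(src, "rb") as f_in, open(dst, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
    return dst


# Usage (path taken from this thread; adjust to your own setup):
# vec_path = decompress_gz("/content/drive/MyDrive/cc.tr.300.vec.gz")
# from gensim.models import KeyedVectors
# KeyedVectors.load_word2vec_format(str(vec_path))
```

If the gensim call in the usage comment loads the file without errors, the file itself is probably fine and any remaining problem lies elsewhere.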
Thank you for the reply. Actually, I had already used this part before:
from gensim.models import KeyedVectors
KeyedVectors.load_word2vec_format(file_path)
Now I tried it after uncompressing the file, like this:
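from gensim.models import KeyedVectors
import nlpaug.augmenter.word as naw

# Sanity check: load the uncompressed vectors directly with gensim
KeyedVectors.load_word2vec_format('/content/drive/MyDrive/cc.tr.300.vec')

# Turkish for: "Can sentences similar to this one be generated?"
text = "Bu cümleye benzer cümleler üretilebilir mi?"
aug = naw.WordEmbsAug(
    model_type='fasttext', model_path=('/content/drive/MyDrive/cc.tr.300.vec'),
    action="substitute")
augmented_text = aug.augment(text)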
I still get the following error. Should I add another line to connect the gensim loading step to the augmenter?
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-8-a94c7ffe3d91> in <module>()
7 aug = naw.WordEmbsAug(
8 model_type='fasttext', model_path=('/content/drive/MyDrive/cc.tr.300.vec'),
----> 9 action="substitute")
10 augmented_text = aug.augment(text)
4 frames
/usr/local/lib/python3.7/dist-packages/nlpaug/model/word_embs/word_embeddings.py in _read(self)
14
15 def _read(self):
---> 16 self.words = [self.model.index_to_key[i] for i in range(len(self.model.index_to_key))]
17 self.emb_size = self.model[self.model.key_to_index[self.model.index_to_key[0]]]
18 self.vocab_size = len(self.words)
AttributeError: 'Word2VecKeyedVectors' object has no attribute 'index_to_key'
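A likely cause, though not confirmed in this thread: `index_to_key` and `key_to_index` were introduced in gensim 4.0, and the class name `Word2VecKeyedVectors` in the error message comes from the gensim 3.x series. So the environment probably has an older gensim than this nlpaug version expects. The sketch below checks the installed version. The helper names `has_gensim4_api` and `installed_gensim_version` are hypothetical, and the upgrade hint is a suggestion under that assumption.

```python
from importlib import metadata


def has_gensim4_api(version):
    """Return True if a gensim version string is 4.x or newer.

    gensim 4.0 introduced KeyedVectors.index_to_key / key_to_index.
    """
    major = int(version.split(".")[0])
    return major >= 4


def installed_gensim_version():
    """Return the installed gensim version string, or None if gensim is absent."""
    try:
        return metadata.version("gensim")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_gensim_version()
    if version is None:
        print("gensim is not installed")
    elif has_gensim4_api(version):
        print(f"gensim {version}: KeyedVectors should expose index_to_key")
    else:
        # Assumption: upgrading gensim resolves the AttributeError above
        print(f"gensim {version}: pre-4.0 API; try `pip install --upgrade gensim`")
```

On Colab, the runtime usually needs to be restarted after upgrading a package so that the new version is actually imported.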
Hello,
I am trying to apply nlpaug to a dataset. It worked perfectly with BERT/DistilBERT, and it is a great way to augment data. However, when I try to use it with fastText like this:
aug = naw.WordEmbsAug(
    model_type='fasttext', model_path=(the_path+'cc.tr.300.vec.gz'),
    action="substitute")
augmented_text = aug.augment(text)
I get the error:
AttributeError Traceback (most recent call last) in ()
2 aug = naw.WordEmbsAug(
3 model_type='fasttext', model_path=("/content/drive/MyDrive/"+'cc.tr.300.vec.gz'),
----> 4 action="substitute")
5 augmented_text = aug.augment(text)
4 frames
/usr/local/lib/python3.7/dist-packages/nlpaug/model/word_embs/word_embeddings.py in _read(self)
14
15 def _read(self):
---> 16 self.words = [self.model.index_to_key[i] for i in range(len(self.model.index_to_key))]
17 self.emb_size = self.model[self.model.key_to_index[self.model.index_to_key[0]]]
18 self.vocab_size = len(self.words)
AttributeError: 'Word2VecKeyedVectors' object has no attribute 'index_to_key'
I would like to know what is going wrong here. I am using Google Colab.