Can you please help me fix this error? I get it while creating the sentence vectors:
from finbert_embedding.embedding import FinbertEmbedding
finbert = FinbertEmbedding()
sentence_embedding = finbert.sentence_vector(df['Clean_text'])  # this is the origin of the error
in ()
2 from finbert_embedding.embedding import FinbertEmbedding
3 finbert = FinbertEmbedding()
----> 4 #sentence_embedding = finbert.sentence_vector(df['Clean_text'].values)
5 # Create topic model
4 frames
/usr/local/lib/python3.7/dist-packages/pytorch_pretrained_bert/tokenization.py in _clean_text(self, text)
306 output = []
307 for char in text:
--> 308 cp = ord(char)
309 if cp == 0 or cp == 0xfffd or _is_control(char):
310 continue
TypeError: ord() expected a character, but string of length 69 found
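A likely reading of this traceback, assuming `sentence_vector` expects one string per call: the tokenizer loops over its input with `for char in text`, so when a whole column is passed, each "char" is an entire sentence and `ord()` fails on a 69-character string. The sketch below reproduces that with a plain list and no FinBERT at all; `clean_text` and `embed_one` are hypothetical stand-ins, not the library's functions.

```python
def clean_text(text):
    # Mimics the loop in the traceback: ord() is applied to every item of `text`.
    return "".join(ch for ch in text if ord(ch) not in (0, 0xFFFD))


def embed_one(sentence):
    # Hypothetical stand-in for finbert.sentence_vector(sentence):
    # it expects ONE string and returns a (fake) one-dimensional vector.
    return [float(len(clean_text(sentence)))]


sentences = ["stocks rallied after the earnings call", "bond yields fell"]

# Passing the whole collection: each item is a full sentence, so ord() raises.
try:
    embed_one(sentences)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Calling once per sentence: each item seen by ord() is a single character.
embeddings = [embed_one(s) for s in sentences]

print(error_message)
print(embeddings)
```

If that assumption holds for the real library, the same pattern would be one `finbert.sentence_vector(text)` call per row of `df['Clean_text']`, with the results stacked into a single 2D NumPy array before they are handed to BERTopic as precomputed embeddings.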
# Create topic model
topic_model = BERTopic(verbose=True, top_n_words=20)
topics, probs = topic_model.fit_transform(df['Clean_text'], sentence_embedding)
output:
TypeError Traceback (most recent call last)