I have computed text embeddings with an AzureOpenAI model and want to pass them to a BERTopic object as pre-computed embeddings. Fitting the model fails at runtime with the following error:
  File "/data/home/ilanit.sobol/***/code_files/semantic_modeling/analysis/timeseries_analysis/utils/semantic_modeling_utils.py", line 615, in run_pipeline_on_multiple_datasets
    topic_model, topics = self.fit_transform(topic_model, output_dict, self.config.get("optimized"))
  File "/data/home/ilanit.sobol/***/code_files/semantic_modeling/analysis/timeseries_analysis/utils/semantic_modeling_utils.py", line 560, in fit_transform
    topic_model = topic_model.fit(output_dict.get("texts"),
  File "/data/home/ilanit.sobol/anaconda3/envs/llms_env/lib/python3.9/site-packages/pyAudioAnalysis/../bertopic/_bertopic.py", line 316, in fit
    self.fit_transform(documents=documents, embeddings=embeddings, y=y, images=images)
  File "/data/home/ilanit.sobol/anaconda3/envs/llms_env/lib/python3.9/site-packages/pyAudioAnalysis/../bertopic/_bertopic.py", line 433, in fit_transform
    self._extract_topics(documents, embeddings=embeddings, verbose=self.verbose)
  File "/data/home/ilanit.sobol/anaconda3/envs/llms_env/lib/python3.9/site-packages/pyAudioAnalysis/../bertopic/_bertopic.py", line 3787, in _extract_topics
    self.topic_representations_ = self._extract_words_per_topic(words, documents)
  File "/data/home/ilanit.sobol/anaconda3/envs/llms_env/lib/python3.9/site-packages/pyAudioAnalysis/../bertopic/_bertopic.py", line 4087, in _extract_words_per_topic
    self.topic_aspects_[aspect] = aspect_model.extract_topics(self, documents, c_tf_idf, aspects)
  File "/data/home/ilanit.sobol/anaconda3/envs/llms_env/lib/python3.9/site-packages/pyAudioAnalysis/../bertopic/representation/_keybert.py", line 91, in extract_topics
    sim_matrix, words = self._extract_embeddings(topic_model, topics, representative_docs, repr_doc_indices)
  File "/data/home/ilanit.sobol/anaconda3/envs/llms_env/lib/python3.9/site-packages/pyAudioAnalysis/../bertopic/representation/_keybert.py", line 163, in _extract_embeddings
    repr_embeddings = topic_model._extract_embeddings(representative_docs, method="document", verbose=False)
  File "/data/home/ilanit.sobol/anaconda3/envs/llms_env/lib/python3.9/site-packages/pyAudioAnalysis/../bertopic/_bertopic.py", line 3410, in _extract_embeddings
    embeddings = self.embedding_model.embed_documents(documents, verbose=verbose)
  File "/data/home/ilanit.sobol/anaconda3/envs/llms_env/lib/python3.9/site-packages/pyAudioAnalysis/../bertopic/backend/_base.py", line 69, in embed_documents
    return self.embed(document, verbose)
  File "/data/home/ilanit.sobol/anaconda3/envs/llms_env/lib/python3.9/site-packages/pyAudioAnalysis/../bertopic/backend/_openai.py", line 73, in embed
    response = self.client.embeddings.create(input=batch, **self.generator_kwargs)
AttributeError: 'str' object has no attribute 'embeddings'
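As far as I can tell from the last frame, `self.client` in the OpenAI backend is a plain string rather than a client object by the time `embed` runs. The sketch below reproduces the same message in isolation. `StubBackend` is a hypothetical stand-in written for illustration, not BERTopic's actual class, and the model name is only an example value.

```python
class StubBackend:
    """Hypothetical stand-in for a backend that stores a client and calls it."""

    def __init__(self, client):
        # Whatever is passed here is used later as the API client.
        self.client = client

    def embed(self, batch):
        # Same call shape as the failing line in the traceback.
        return self.client.embeddings.create(input=batch)


# Assumption: a string (for example a model name) ends up where the client belongs.
backend = StubBackend("text-embedding-ada-002")

try:
    backend.embed(["an example document"])
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
```

The traceback also shows the call arriving through the KeyBERT representation step (`_keybert.py`), which re-embeds representative documents via `topic_model._extract_embeddings`. So the embedding model is still invoked even though pre-computed embeddings are passed to `fit`.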
Additionally, I have previously used the same code with pre-computed embeddings from Sentence-BERT and specified embedding_model=SentenceTransformer() without encountering this issue.
Could you please provide guidance on how to resolve this error?
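To narrow this down before fitting, a small check along these lines could confirm whether the object handed to the backend exposes `embeddings.create`. This is only a sketch: `has_embeddings_api` is a hypothetical helper name, and the `SimpleNamespace` stub merely imitates the attribute layout of a client rather than being a real AzureOpenAI object.

```python
from types import SimpleNamespace


def has_embeddings_api(client) -> bool:
    """Return True if `client` exposes a callable `embeddings.create`."""
    embeddings = getattr(client, "embeddings", None)
    return callable(getattr(embeddings, "create", None))


# Stub that imitates the attribute layout the backend calls into.
stub_client = SimpleNamespace(
    embeddings=SimpleNamespace(create=lambda input, **kwargs: {"data": []})
)

# A str has no `.embeddings`, so the check fails for it.
string_ok = has_embeddings_api("text-embedding-ada-002")
stub_ok = has_embeddings_api(stub_client)
print(string_ok, stub_ok)
```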
Thank you.