Open manas007 opened 4 months ago
Have you tried installing the latest version of openai?
@MaartenGr is it required?
@MaartenGr can you explain why installing openai is required for bertopic? The documentation does not mention that you need to install it as part of bertopic.
As per the docs (https://maartengr.github.io/BERTopic/index.html), installation, with sentence-transformers, can be done from PyPI using pip install bertopic
@manas007 Sure! Installing BERTopic installs the packages needed for the base functionality. Since BERTopic is a highly modular package, there are many extensions that you can use that require additional packages. Installing them all at once would clutter the dependencies and likely result in a bunch of dependency conflicts.
This means that whenever you use certain extensions, like the OpenAI offering, the documentation will state that you additionally need to install that specific package.
This also relates to production settings, where installing dozens of packages is not helpful, so providing a relatively minimal installation is generally preferred. Adding packages with pip is easy; removing them cleanly is not.
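The optional-dependency approach described above can be sketched as a guarded import. This is only a minimal illustration of the idea, not BERTopic's actual code; the helper name optional_import and the made-up package name are assumptions.

```python
import importlib


def optional_import(name):
    """Return the imported module, or None if it is not installed.

    Hypothetical helper for illustration; not part of BERTopic.
    """
    try:
        return importlib.import_module(name)
    except ModuleNotFoundError:
        # The extra package is absent, so the matching extension is skipped.
        return None


# A standard-library module is always available.
json_module = optional_import("json")
# A made-up package name stands in for an extra that was never installed.
missing = optional_import("some_extra_package_that_is_not_installed")
```

With a guard like this, the base install stays small and an extension only becomes usable once its extra package has been added with pip.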
@MaartenGr do you mean that I am "required" to install openai along with bertopic, even if I have no intention of using the language model?
Please note, even with pip install openai, the error does not go away, so this is independent of the openai install.
Can you please advise?
do you mean that I am "required" to install openai along with bertopic, even if I have no intention of using the language model?
No, that's definitely not the case. You can use BERTopic without needing to install openai. Looking at your code and error, it must be a problem with your environment. I just installed BERTopic a couple of times with pip install bertopic in fresh environments and I do not get this issue. Could you try installing BERTopic from a completely new and empty environment?
thanks, will try that.
Hi! I'm trying to use the following line to load BERTopic (version 0.16.2 installed from PyPI) to run some code in a GPU environment:
from bertopic import BERTopic
But got the below error message:
AttributeError: module 'openai' has no attribute 'OpenAI'
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
File <command-3593022780386937>, line 1
----> 1 from bertopic import BERTopic
File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/bertopic/__init__.py:1
----> 1 from bertopic._bertopic import BERTopic
3 __version__ = "0.16.2"
5 __all__ = [
6 "BERTopic",
7 ]
File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/bertopic/_bertopic.py:48
46 from bertopic import plotting
47 from bertopic.cluster import BaseCluster
---> 48 from bertopic.backend import BaseEmbedder
49 from bertopic.representation._mmr import mmr
50 from bertopic.backend._utils import select_backend
File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/bertopic/backend/__init__.py:8
6 # OpenAI Embeddings
7 try:
----> 8 from bertopic.backend._openai import OpenAIBackend
9 except ModuleNotFoundError:
10 msg = "`pip install openai` \n\n"
File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/bertopic/backend/_openai.py:9
5 from typing import List, Mapping, Any
6 from bertopic.backend import BaseEmbedder
----> 9 class OpenAIBackend(BaseEmbedder):
10 """ OpenAI Embedding Model
11
12 Arguments:
(...)
32 ```
33 """
34 def __init__(self,
35 client: openai.OpenAI,
36 embedding_model: str = "text-embedding-ada-002",
37 delay_in_seconds: float = None,
38 batch_size: int = None,
39 generator_kwargs: Mapping[str, Any] = {}):
File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/bertopic/backend/_openai.py:35, in OpenAIBackend()
9 class OpenAIBackend(BaseEmbedder):
10 """ OpenAI Embedding Model
11
12 Arguments:
(...)
32 ```
33 """
34 def __init__(self,
---> 35 client: openai.OpenAI,
36 embedding_model: str = "text-embedding-ada-002",
37 delay_in_seconds: float = None,
38 batch_size: int = None,
39 generator_kwargs: Mapping[str, Any] = {}):
40 super().__init__()
41 self.client = client
AttributeError: module 'openai' has no attribute 'OpenAI'
Is the openai module required for loading BERTopic in a GPU environment? I also have a CPU environment in which I can load BERTopic without any issues, and I don't have the openai module installed in either environment. Thank you!
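One possible reading of the traceback above: the guard in bertopic/backend/__init__.py catches only ModuleNotFoundError, so an openai package that imports successfully but has no OpenAI attribute raises an AttributeError that the guard does not handle. The sketch below reproduces that behaviour with a stand-in module. The names fake_openai and load_backend are hypothetical, and none of this is BERTopic's or OpenAI's real code.

```python
import types

# Stand-in for an `openai` install that imports fine but has no `OpenAI`
# attribute. (Hypothetical object, not the real package.)
fake_openai = types.ModuleType("openai")


def load_backend(openai_module):
    """Mimic a guard that only expects the package to be missing."""
    try:
        # Looking up the attribute is the step that fails in the traceback.
        client_type = openai_module.OpenAI
    except ModuleNotFoundError:
        return "openai missing: backend skipped"
    return f"backend ready with {client_type!r}"


try:
    result = load_backend(fake_openai)
except AttributeError as exc:
    # AttributeError is not a ModuleNotFoundError, so it escapes the guard.
    result = f"uncaught by the guard: {exc}"

print(result)
```

Under this assumption, the import would fail whenever some openai module is importable but does not expose OpenAI, regardless of whether the user ever intends to use it.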
@MaartenGr looks like others are facing the same issue.
@csq-dr You mentioned you installed BERTopic using pip. Did you also install openai using pip install openai? You need to have that installed before you can use OpenAI's offering.
@manas007 Thank you. Did you try installing openai using pip install openai and then restarting the notebook?
@MaartenGr yes. i tried that. no luck yet. will keep you posted
@MaartenGr Thanks for the reply. I don't plan to use openai due to project restrictions and plan to use embedding model(s) from HF. I'm wondering if there is a way I can use BERTopic without installing the openai module?
@manas007 @csq-dr
I just created a new environment as follows:
conda create -n bertopic_env python=3.10
conda activate bertopic_env
Then installed BERTopic as follows:
pip install bertopic
Which installed BERTopic v0.16.2. Then, I tried to run the code you both mentioned which gave me no problems:
from bertopic import BERTopic
In other words, have you both tried creating a completely new environment with a fresh install of BERTopic? BERTopic does not require openai and it should be possible to run it without it. However, you might have dependency conflicts or pre-existing installations of openai (or even an openai.py file somewhere) that might cause some problems. What typically works best is to simply hit the refresh button and start with a new environment.
@MaartenGr Thank you! I tried the code in a brand new environment with 0.16.2, which gave the error message. I've just tried version 0.16.0, which installed and loaded smoothly without any error messages. By the way, I also find that version 0.16.2 doesn't work even with the openai module installed. Could it be an issue caused by openai.OpenAI() being a class method not a class attribute?
@csq-dr Could you go through, step by step, how you created a new environment and how you installed BERTopic? As you can see in my response above, I can't seem to reproduce the issue following those steps.
Could it be an issue caused by openai.OpenAI() being a class method not a class attribute?
I don't think so, since you can use classes as type hints. What is happening here is that openai is imported first with import openai. After that, it attempts to access openai.OpenAI.
Since you started from a fresh environment, how can it be that it does not give an error when it runs import openai? I think that either an old or different version of openai is installed or that you have a file called openai.py somewhere near your working directory.
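A quick way to test the suggestion above is to ask Python where it would load a module from, without importing it. This is a minimal sketch using the standard library; the helper name locate_module is hypothetical.

```python
import importlib.util


def locate_module(name):
    """Return the file a top-level module would be loaded from, or None.

    Hypothetical helper for illustration; it does not import the module.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


# Demonstrated on a standard-library package so the sketch runs anywhere.
json_path = locate_module("json")
# A made-up name shows the "nothing importable" case.
nothing = locate_module("no_such_module_abcxyz")

# In the affected environment one would inspect the result for "openai".
print(locate_module("openai"))
```

If the printed path points into the working directory rather than site-packages, a local openai.py is likely shadowing the real package; a site-packages path suggests an installed distribution instead.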
I got the same thing with Databricks version 14.3 LTS (Python 3.10.12). I changed to Databricks 15.1 (Python 3.11.0) and that fixed it.
@MaartenGr I'm using Databricks, same as @amitca71, but with version 13.3 LTS (Python 3.10) for both clusters. My GPU cluster had openai preinstalled but my CPU cluster didn't have it. I'll try to load bertopic with Python 3.11 to see if the issue gets solved.
Edit: updating cluster to 15.1 solves the loading issue. Thank you!
Glad to hear that a different environment solved the issue. It seems that since openai was pre-installed, it might have been a pre-1.0 version that is not compatible with BERTopic, as that version was deprecated by OpenAI. Aside from changing environments, either uninstalling openai or upgrading openai might also be a solution for those who wish to use those LLMs.
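To check whether a pre-installed openai is indeed older than 1.0, the installed version can be read from package metadata without importing the package. This is a minimal sketch; the helper names is_pre_1_0 and describe_install are hypothetical, and the version parsing assumes a plain "major.minor.patch" string.

```python
from importlib import metadata


def is_pre_1_0(version):
    """True when the major version component is 0, e.g. '0.28.1'."""
    return int(version.split(".")[0]) < 1


def describe_install(package="openai"):
    """Summarise whether a distribution is absent, pre-1.0, or 1.0 and later."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        return f"{package} is not installed"
    if is_pre_1_0(version):
        return f"{package} {version} is older than 1.0"
    return f"{package} {version} is 1.0 or later"


report = describe_install()
print(report)
```

If the report shows a version older than 1.0, the two options mentioned above apply: pip uninstall openai for those who do not need it, or pip install -U openai for those who do.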
Hello. Note: this was running as of April 22nd.
!pip install -U bertopic
import bertopic
bertopic.__version__
Error log:
AttributeError: module 'openai' has no attribute 'OpenAI'
AttributeError Traceback (most recent call last)
File, line 1
----> 1 import bertopic
2 bertopic.__version__
File /local_disk0/.ephemeral_nfs/envs/pythonEnv-390f0499-9705-4ba6-ba4a-010185dffaa9/lib/python3.10/site-packages/bertopic/__init__.py:1
----> 1 from bertopic._bertopic import BERTopic
3 __version__ = "0.16.1"
5 __all__ = [
6 "BERTopic",
7 ]
File /local_disk0/.ephemeral_nfs/envs/pythonEnv-390f0499-9705-4ba6-ba4a-010185dffaa9/lib/python3.10/site-packages/bertopic/_bertopic.py:48
46 from bertopic import plotting
47 from bertopic.cluster import BaseCluster
---> 48 from bertopic.backend import BaseEmbedder
49 from bertopic.representation._mmr import mmr
50 from bertopic.backend._utils import select_backend
File /local_disk0/.ephemeral_nfs/envs/pythonEnv-390f0499-9705-4ba6-ba4a-010185dffaa9/lib/python3.10/site-packages/bertopic/backend/__init__.py:8
6 # OpenAI Embeddings
7 try:
----> 8 from bertopic.backend._openai import OpenAIBackend
9 except ModuleNotFoundError:
10 msg = "`pip install openai` \n\n"
File /local_disk0/.ephemeral_nfs/envs/pythonEnv-390f0499-9705-4ba6-ba4a-010185dffaa9/lib/python3.10/site-packages/bertopic/backend/_openai.py:9
5 from typing import List, Mapping, Any
6 from bertopic.backend import BaseEmbedder
----> 9 class OpenAIBackend(BaseEmbedder):
10 """ OpenAI Embedding Model
11
12 Arguments:
(...)
32 ```
33 """
34 def __init__(self,
35 client: openai.OpenAI,
36 embedding_model: str = "text-embedding-ada-002",
37 delay_in_seconds: float = None,
38 batch_size: int = None,
39 generator_kwargs: Mapping[str, Any] = {}):
File /local_disk0/.ephemeral_nfs/envs/pythonEnv-390f0499-9705-4ba6-ba4a-010185dffaa9/lib/python3.10/site-packages/bertopic/backend/_openai.py:35, in OpenAIBackend()
9 class OpenAIBackend(BaseEmbedder):
10 """ OpenAI Embedding Model
11
12 Arguments:
(...)
32 ```
33 """
34 def __init__(self,
---> 35 client: openai.OpenAI,
36 embedding_model: str = "text-embedding-ada-002",
37 delay_in_seconds: float = None,
38 batch_size: int = None,
39 generator_kwargs: Mapping[str, Any] = {}):
40 super().__init__()
41 self.client = client
AttributeError: module 'openai' has no attribute 'OpenAI'