microsoft / autogen

A programming framework for agentic AI 🤖
https://microsoft.github.io/autogen/
Creative Commons Attribution 4.0 International

[Feature Request]: Lazy import of chromadb in teachability.py #1795

Open rickyloynd-microsoft opened 7 months ago

rickyloynd-microsoft commented 7 months ago

Is your feature request related to a problem? Please describe.

teachability.py uses chromadb as its default vector DB. Users can override this with another DB in their own subclasses of Teachability and MemoStore. Unfortunately, the import of chromadb at the top of teachability.py throws an error unless the chromadb package has been installed. Another solution would be to copy and modify the entire teachability.py file, and remove the import, but this would involve owning a lot more code than just the subclass modifications.

Describe the solution you'd like

AutoGen can make it easier for users to plug in their favorite vector DB by importing chromadb within the MemoStore class instead of at the top of teachability.py.
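A minimal sketch of what the proposed change could look like. The class below is a simplified stand-in, not the actual MemoStore code; the constructor signature, the default path, and the use of chromadb.PersistentClient are assumptions made for illustration.

```python
class MemoStore:
    """Simplified stand-in for the MemoStore class in teachability.py."""

    def __init__(self, path_to_db_dir="./tmp/teachable_agent_db"):
        # Deferred import: chromadb is loaded only when the default store is
        # actually constructed, not when the module itself is imported.
        try:
            import chromadb
        except ImportError as e:
            raise ImportError(
                "The default MemoStore needs chromadb. Install it, or "
                "subclass MemoStore to use a different vector DB."
            ) from e
        # Assumption: a persistent client stands in for the real setup code.
        self.db_client = chromadb.PersistentClient(path=path_to_db_dir)
```

With the import moved inside the class, importing the module succeeds without chromadb, and the error only surfaces if the default store is instantiated. A subclass that overrides `__init__` without calling the base constructor would never trigger the import.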

Additional context

No response

rickyloynd-microsoft commented 7 months ago

@shaileshj2803 Does this issue accurately describe the difficulty that the LinkedIn team is experiencing?

hardchor commented 5 months ago

Not sure about LinkedIn, but it accurately describes a problem I'm facing 😅 We're trying to deploy autogen in a serverless environment, and having an in-memory database won't work, so we had to remove teachability for now.

rickyloynd-microsoft commented 5 months ago

Makes sense, but teachability is not installed by default. So when you say "remove teachability", do you mean "not install teachability"? For instance, the command to install teachability (specifically the chromadb package) would be this:

pip install -e .[teachable]

hardchor commented 5 months ago

Sorry, confusing wording! I meant that we previously had it installed as part of one of our workflows and had to remove it when we discovered that it pushed the size of the bundled package beyond what serverless (the framework) supports on AWS Lambda.

Currently, the only solution in that scenario seems to be to write a custom implementation of the teachability capability that integrates with a hosted vector DB, right?

EDIT: Basically exactly what you said above

> Another solution would be to copy and modify the entire teachability.py file, and remove the import, but this would involve owning a lot more code than just the subclass modifications.

rickyloynd-microsoft commented 5 months ago

Your new subclasses (of Teachability and MemoStore) could be quite small.
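As a rough illustration of how small such a subclass could be, here is a sketch of a MemoStore subclass backed by a different vector DB. Everything in it is an assumption: the base class is a stand-in that presumes the lazy-import fix is in place, the method names are assumed to mirror the ones in teachability.py, and HostedVectorClient is a hypothetical client faked in memory with a toy word-overlap distance.

```python
class MemoStore:
    """Stand-in for the base class; assumes importing it no longer needs chromadb."""

    def add_input_output_pair(self, input_text, output_text):
        raise NotImplementedError

    def get_related_memos(self, query_text, n_results, threshold):
        raise NotImplementedError


class HostedVectorClient:
    """Hypothetical client for a hosted vector DB, faked in memory here."""

    def __init__(self):
        self._rows = []

    def upsert(self, text, payload):
        self._rows.append((text, payload))

    def query(self, text, top_k):
        # Toy distance: 1 minus the Jaccard overlap of lowercase word sets.
        words = set(text.lower().split())
        scored = []
        for stored_text, payload in self._rows:
            stored = set(stored_text.lower().split())
            union = words | stored
            overlap = len(words & stored) / len(union) if union else 0.0
            scored.append((1.0 - overlap, stored_text, payload))
        scored.sort(key=lambda row: row[0])
        return scored[:top_k]


class HostedMemoStore(MemoStore):
    """Only the storage calls are overridden; no chromadb involved."""

    def __init__(self, client):
        self.client = client

    def add_input_output_pair(self, input_text, output_text):
        self.client.upsert(input_text, output_text)

    def get_related_memos(self, query_text, n_results, threshold):
        hits = self.client.query(query_text, top_k=n_results)
        return [(text, out, dist) for dist, text, out in hits if dist < threshold]


store = HostedMemoStore(HostedVectorClient())
store.add_input_output_pair("capital of France", "Paris")
store.add_input_output_pair("largest planet", "Jupiter")
memos = store.get_related_memos(
    "what is the capital of France", n_results=1, threshold=0.9
)
```

The matching Teachability subclass would then only need to construct this store instead of the default one; that wiring is left out here because it depends on how the base constructor is written.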

hardchor commented 5 months ago

Yeah, but the hard requirement on chromadb at the top of https://github.com/microsoft/autogen/blob/main/autogen/agentchat/contrib/capabilities/teachability.py#L5 pretty much means that we can't import and extend any of the existing code, unless I misunderstood you?

> Unfortunately, the import of chromadb at the top of teachability.py throws an error unless the chromadb package has been installed.

rickyloynd-microsoft commented 5 months ago

Yes, that's why this issue was created. The fix is quite simple. Would you like to create the PR?