Open jfaraklit opened 1 year ago
@jfaraklit, you can do something like this:
from langchain.vectorstores import Chroma
import chromadb

# Connect to the Chroma server running on EC2
chroma_client = chromadb.HttpClient(host="http://my-chroma.ec2.aws.com:8000")

def get_chroma(client):
    # `embedding` is your existing embedding function
    chroma = Chroma(
        collection_name='llm',
        embedding_function=embedding,
        client=client
    )
    return chroma

self.db = get_chroma(chroma_client)
and something like this to use it
self.db.add_documents(....)
Yeah, that works. Thanks a lot. Another strange issue I am getting: when I run my app, add documents to the db, and query, I get results. If I stop my app and start it again, the same query returns an empty array ([]). This happens with both the local db and the server. Any thoughts?
@jfaraklit, in the local DB, you need to use PersistentClient(path="/path/to/data"), whereas on the server, you need to make sure you have the IS_PERSISTENT=1 env var set.
How do you run the server, and can you share your local client code?
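For the local case, one way to check that data survives a restart is to reopen the same path and count what is there. This is a minimal sketch, not the app's actual code: `reopen_and_count` and its `client_factory` parameter are hypothetical names, and the collection name 'llm' is taken from the snippet earlier in the thread.

```python
def reopen_and_count(path, name="llm", client_factory=None):
    """Open the on-disk Chroma store at `path` and count documents in `name`.

    `client_factory` is a hypothetical hook for this sketch; by default it is
    chromadb.PersistentClient, imported lazily so a stub can be used in tests.
    """
    if client_factory is None:
        import chromadb

        client_factory = chromadb.PersistentClient
    client = client_factory(path=path)
    # get_or_create_collection returns an empty collection if none exists yet,
    # so a result of 0 means nothing was persisted under this name.
    return client.get_or_create_collection(name).count()
```

Run the app, add documents, stop it, and then call `reopen_and_count("/path/to/data")` with the same path the app uses. A non-zero count suggests the documents were written to disk.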
@tazarov Actually, I was somehow running a delete command, which is why the collection was gone on the subsequent start. So I figured that one out.
Glad to know that IS_PERSISTENT=1 is for the server. So basically it will create and keep the data on the EC2 container? Also, how do you see/query data on the EC2 container? Do I need to install a chroma CLI or anything on the container? If you can point me to some docs or show me one command to count the docs, I will figure out the rest.
@jfaraklit, for AWS, have you tried https://github.com/chroma-core/chroma/tree/main/examples/deployments/aws-terraform
It will create an EC2 instance with an EBS volume mounted at /chroma-data, where your chroma data will be stored.
Just FYI, the default chroma docker deployment keeps the data in the container, so it is not a good candidate if you can't afford to "lose" your data. Of course, you can add local mounts (which the above AWS deployment does)
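On the earlier question of counting documents on the server: nothing needs to be installed on the container, since the same Python client can query it remotely. The following is a rough sketch under that assumption. `summarize_collection` and `connect` are hypothetical helper names, and the hostname is the placeholder used earlier in the thread.

```python
def summarize_collection(client, name="llm", sample=3):
    """Return the document count and a few sample ids for one collection."""
    collection = client.get_collection(name)
    preview = collection.peek(limit=sample)  # first few stored records
    return {"count": collection.count(), "sample_ids": preview["ids"]}


def connect(host="http://my-chroma.ec2.aws.com:8000"):
    """Build a client for a remote Chroma server (placeholder host)."""
    import chromadb  # imported lazily; only needed for a real server

    client = chromadb.HttpClient(host=host)
    client.heartbeat()  # raises if the server cannot be reached
    return client
```

Calling `summarize_collection(connect())` from a laptop or from the app itself should report how many documents the 'llm' collection holds on the server.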
Yeah, this is neat, and I need an EBS volume. I will move to this type of EC2 setup soon. In the meantime, when I added IS_PERSISTENT I got the error below.
def get_chroma(client):
    chroma = Chroma(
        collection_name='llm',
        embedding_function=embedding,
        #persist_directory='./chroma.db',
        IS_PERSISTENT=1,
        client=client
    )
    return chroma
  File "/Users/jawed-mac/ELL/openAI/RandD/crafted-catalyst/realtime_ai_character/database/chroma.py", line 19, in get_chroma
    chroma = Chroma(
TypeError: __init__() got an unexpected keyword argument 'IS_PERSISTENT'
@jfaraklit IS_PERSISTENT is used for the client/server deployment mode, where it is passed as an env var to the server. Maybe I am confusing you. If LC (LangChain) is what you want to use, then keep using persist_directory, as this is the config value you need for a local persistent client.
Got it. thanks
Can you please explain what the term "client" refers to in this code?
@abhishek351 client is the Python class that creates a connection to the DB and ferries requests to the DB. With langchain, it's helpful and more flexible to define the client outside langchain and pass it in.
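To illustrate why defining the client outside langchain is flexible, here is a small sketch that picks a local or hosted client from the environment and leaves `get_chroma` unchanged. The environment variable names CHROMA_HOST and CHROMA_PATH, and the helpers `chroma_client_config` and `make_client`, are assumptions made for this example; they are not settings defined by chromadb or langchain.

```python
import os


def chroma_client_config(env=None):
    """Decide which kind of chromadb client to build.

    CHROMA_HOST and CHROMA_PATH are hypothetical, app-level variable names.
    """
    env = os.environ if env is None else env
    host = env.get("CHROMA_HOST")
    if host:
        # Hosted server, e.g. the EC2 deployment discussed in the thread
        return ("http", {"host": host})
    # Fall back to an on-disk store for local development
    return ("persistent", {"path": env.get("CHROMA_PATH", "./chroma.db")})


def make_client(env=None):
    """Build the chromadb client described by the environment."""
    import chromadb  # imported lazily; only needed when a client is built

    kind, kwargs = chroma_client_config(env)
    if kind == "http":
        return chromadb.HttpClient(**kwargs)
    return chromadb.PersistentClient(**kwargs)
```

With this in place, the app could call `get_chroma(make_client())` and switch between the local db and the hosted server by setting or unsetting CHROMA_HOST, without touching the langchain code.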
What happened?
This is how I use chroma locally
In my app I do self.db = get_chroma(), and then something like self.db.add_documents(....) to use it, etc.
Now I have hosted a version on EC2. How do I use the hosted version in my app instead of the local one?
Versions
latest