Hello
I'm working on a book information retrieval system. I have books as PDFs, and I am using Ollama to generate embeddings and Weaviate as the vector store. In the vector store I created a multi-tenant collection called Books, in which each book is a tenant.
For brevity I will use only one tenant: Shakespeare's Othello, as a PDF.
Below are the two functions I use to create the vector store and to populate it. The vector store is populated with the text as it should be; however, the vector itself comes back empty even though I set it in the batch.add_object call. You can check the example output at the end.
Here are the two functions I use:
def create_vector_store(client, collection_name: str, tenant_name: str):
    '''
    Creates the vector store using Weaviate.
    Uses multi-tenancy with each book as a tenant.

    Args:
        client (WeaviateClient): Weaviate client
        collection_name (str): Name of the collection in which to store vectors.
        tenant_name (str): Independent allocation in the collection for each book.

    Return:
        books_collection
        books_tenant
    '''
    try:
        books_collection = client.collections.create(
            name = collection_name,
            multi_tenancy_config = wvc.config.Configure.multi_tenancy(
                enabled = True,
                auto_tenant_creation = True,
                auto_tenant_activation = True
            )
        )
        books_collection.tenants.create(tenants = [wvc.tenants.Tenant(name=tenant_name)])
        books_tenant = books_collection.with_tenant(tenant_name)
        print('[i] Created collection with tenant.')
    except Exception:
        # Creation failed (for example, the collection already exists), so fetch it instead.
        books_collection = client.collections.get(collection_name)
        books_tenant = books_collection.with_tenant(tenant_name)
        print('[i] Fetched collection with tenant.')
    return books_collection, books_tenant
def populate_vector_store(collection, documents: List[Document], tenant_name: str):
    '''
    Populate vector store with objects.

    Args:
        collection
        documents
        tenant_name
    '''
    with collection.batch.dynamic() as batch:
        for d, document in enumerate(documents):
            # Embed each document with Ollama before adding it to the batch.
            response = ollama.embeddings(
                model = 'mxbai-embed-large',
                prompt = f'''
                Represent this text for information retrieval
                from the book passage:
                {document}
                '''
            )
            a = batch.add_object(
                properties = {'text': str(document)},
                vector = response['embedding']
            )
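To rule out the embedding step itself, a small check can run on each response['embedding'] before it is handed to batch.add_object. This is only a sketch: the helper name validate_embedding and its expected_dim parameter are hypothetical and not part of Ollama or Weaviate.

```python
from typing import List, Optional, Sequence


def validate_embedding(embedding: Sequence[float],
                       expected_dim: Optional[int] = None) -> List[float]:
    """Return the embedding as a list of floats, or raise if it is unusable.

    Hypothetical helper: it only inspects the values it is given and does
    not call Ollama or Weaviate.
    """
    vector = [float(x) for x in embedding]
    if not vector:
        raise ValueError("embedding is empty")
    if expected_dim is not None and len(vector) != expected_dim:
        raise ValueError(
            f"expected {expected_dim} dimensions, got {len(vector)}"
        )
    return vector
```

Inside the loop this would be used as vector = validate_embedding(response['embedding']), so an empty embedding raises immediately instead of being stored silently.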
QueryReturn(objects=[Object(uuid=_WeaviateUUIDInt('0131ef1c-eb3a-4d40-91f8-751de84ffc11'), metadata=MetadataReturn(creation_time=None, last_update_time=None, distance=None, certainty=None, score=None, explain_score=None, is_consistent=None, rerank_score=None), properties={'text': 'OTHELLO How comes it, Michael, you are thus forgot? CASSIO I pray you pardon me; I cannot speak. OTHELLO Worthy Montano, you were wont be civil. The gravity and stillness of your youth The world hath noted. And your name is great In mouths of wisest censure. What’s the matter That you unlace your reputation thus, And spend your rich opinion for the name Of a night-brawler? Give me answer to it. MONTANO Worthy Othello, I am hurt to danger. Your officer Iago can inform you, While I spare speech, which something now offends me, Of all that I do know; nor know I aught By me that’s said or done amiss this night, Unless self-charity be sometimes a vice, And to defend ourselves it be a sin When violence assails us. OTHELLO Now, by heaven, My blood begins my safer guides to rule, And passion, having my best judgment collied, Assays to lead the way. Zounds, if I stir, Or do but lift this arm, the best of you Shall sink in my rebuke. Give me to know How this foul rout began, who set it on; And he that is approved in this offense, Though he had twinned with me, both at a birth, Shall lose me. What, in a town of war Yet wild, the people’s hearts brimful of fear, To manage private and domestic quarrel, In night, and on the court and guard of safety? ’Tis monstrous. Iago, who began ’t?'}, references=None, vector={}, collection='Books')]
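One thing worth checking about the vector={} in this output: as far as I know, the v4 Python client does not return stored vectors unless the query asks for them with include_vector=True, so an empty vector in a QueryReturn does not by itself mean nothing was stored. Below is a minimal sketch of a read-back under that assumption. The function name fetch_text_and_vectors is hypothetical, and it expects the tenant-scoped handle (what with_tenant returns) of a collection without named vectors, where the client exposes the vector under the "default" key.

```python
from typing import Any, List, Optional, Tuple


def fetch_text_and_vectors(
    tenant_collection: Any, limit: int = 1
) -> List[Tuple[str, Optional[List[float]]]]:
    """Read back objects together with their stored vectors.

    Hypothetical helper; `tenant_collection` is assumed to be the handle
    returned by `collection.with_tenant(...)`.
    """
    # include_vector=True asks Weaviate to return the stored vectors;
    # without it, `vector` is left empty in the results.
    result = tenant_collection.query.fetch_objects(
        limit=limit, include_vector=True
    )
    rows = []
    for obj in result.objects:
        # Assumption: no named vectors, so the vector sits under "default".
        rows.append((obj.properties["text"], obj.vector.get("default")))
    return rows
```

If the second element of each returned pair is a non-empty list, the vectors were stored and only the original query was omitting them.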