googleapis / langchain-google-alloydb-pg-python

Apache License 2.0

Unable to access the table from alloydb for further data processing #142

Closed jaybfn closed 4 weeks ago

jaybfn commented 1 month ago

I have an AlloyDB database for vector search. I need to access it from Python, but I have been unable to do so. This is my starting code:

from langchain_google_alloydb_pg import AlloyDBEngine

PROJECT_ID = "******"
REGION = "***"  
CLUSTER = "***"  
INSTANCE = "***"  
DATABASE = "***"  
PASSWORD = "***"
USER = '****'

engine = await AlloyDBEngine.afrom_instance(
    project_id=PROJECT_ID,
    region=REGION,
    cluster=CLUSTER,
    instance=INSTANCE,
    database=DATABASE,
)

Can anyone help me with it here?

jackwotherspoon commented 1 month ago

Hi there @jaybfn 😄

You are going to have to provide more details before we can help... such as what error are you currently seeing?

Please update the description with the stacktrace error you are encountering.

jackwotherspoon commented 4 weeks ago

@jaybfn I deleted your comment as it contained information such as your user and password, project id etc.

If you could please repost your comment with all sensitive information removed or replaced with ***** that would be great, thanks 😄

jaybfn commented 4 weeks ago

Hello, thank you for pointing that out @jackwotherspoon, I am so sorry about that. Here is the complete code and the error:

from langchain_google_alloydb_pg import AlloyDBEngine
from langchain_google_vertexai import VertexAIEmbeddings
from langchain_google_alloydb_pg import AlloyDBEngine, AlloyDBVectorStore

import nest_asyncio
nest_asyncio.apply()

PROJECT_ID = "*****"
REGION = "****"
CLUSTER = "****"
INSTANCE = "******"
DATABASE = "*****"
PASSWORD = "*****"
USER = '*****'
TABLE_NAME = "*******"

engine = AlloyDBEngine.afrom_instance(
    project_id=PROJECT_ID,
    region=REGION,
    cluster=CLUSTER,
    instance=INSTANCE,
    database=DATABASE,
)

embeddings = VertexAIEmbeddings(
    model_name="textembedding-gecko@latest", project="*******")

engine = AlloyDBEngine.from_instance("*****", "*****", "*****", "*****", "*****", "*****","******")

vectorstore = AlloyDBVectorStore.create_sync(
    engine,
    table_name="*******",
    embedding_service=embeddings
)

query = "What is RAG?"
query_vector = embeddings.embed_query(query)
docs = await vectorstore.asimilarity_search_by_vector(query_vector, k=3)
print(docs)
/tmp/ipykernel_1535713/1667853844.py:28: RuntimeWarning: coroutine 'AlloyDBEngine.afrom_instance' was never awaited
engine = AlloyDBEngine.from_instance("*******)
RuntimeWarning: Enable tracemalloc to get the object allocation traceback

RuntimeError Traceback (most recent call last)
Cell In[49], line 38
36 query = "What is RAG?"
37 query_vector = embeddings.embed_query(query)
---> 38 docs = await vectorstore.asimilarity_search_by_vector(query_vector, k=3)
39 print(docs)

File ~/projects/QuestionAnswerBot/llmapp/lib/python3.10/site-packages/langchain_google_alloydb_pg/alloydb_vectorstore.py:516, in AlloyDBVectorStore.asimilarity_search_by_vector(self, embedding, k, filter, **kwargs)
509 async def asimilarity_search_by_vector(
510 self,
511 embedding: List[float],
(...)
514 **kwargs: Any,
515 ) -> List[Document]:
--> 516 docs_and_scores = await self.asimilarity_search_with_score_by_vector(
517 embedding=embedding, k=k, filter=filter, **kwargs
518 )
520 return [doc for doc, _ in docs_and_scores]

File ~/projects/QuestionAnswerBot/llmapp/lib/python3.10/site-packages/langchain_google_alloydb_pg/alloydb_vectorstore.py:529, in AlloyDBVectorStore.asimilarity_search_with_score_by_vector(self, embedding, k, filter, **kwargs)
522 async def asimilarity_search_with_score_by_vector(
523 self,
524 embedding: List[float],
(...)
...
--> 285 yield self # This tells Task to wait for completion.
286 if not self.done():
287 raise RuntimeError("await wasn't used with future")

RuntimeError: Task <Task pending name='Task-49' coro=<InteractiveShell.run_cell_async() running at /home/jbfn/projects/QuestionAnswerBot/llmapp/lib/python3.10/site-packages/IPython/core/interactiveshell.py:3334> cb=[IPythonKernel._cancel_on_sigint..cancel_unless_done(<Future pendi...ernel.py:329]>)() at /home/jbfn/projects/QuestionAnswerBot/llmapp/lib/python3.10/site-packages/ipykernel/ipkernel.py:329, Task.task_wakeup()]> got Future attached to a different loop

jackwotherspoon commented 4 weeks ago

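@jaybfn The issue you are running into is that you are mixing the synchronous and async interfaces. In your code, `afrom_instance` is called without `await` (hence the "never awaited" warning), and the vector store is created with the sync `create_sync` but then queried with the async `asimilarity_search_by_vector`. You need to stick with one interface throughout; they cannot be intermingled.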

So either use all async (recommended) or all synchronous.

Async (recommended):

engine = await AlloyDBEngine.afrom_instance(
    project_id=PROJECT_ID,
    region=REGION,
    cluster=CLUSTER,
    instance=INSTANCE,
    database=DATABASE,
)

embeddings = VertexAIEmbeddings(
    model_name="textembedding-gecko@latest",
    project="*******",
)

vectorstore = await AlloyDBVectorStore.create(
    engine,
    table_name="*******",
    embedding_service=embeddings
)

query = "What is RAG?"
query_vector = embeddings.embed_query(query)
docs = await vectorstore.asimilarity_search_by_vector(query_vector, k=3)
print(docs)

Sync:

engine = AlloyDBEngine.from_instance(
    project_id=PROJECT_ID,
    region=REGION,
    cluster=CLUSTER,
    instance=INSTANCE,
    database=DATABASE,
)

embeddings = VertexAIEmbeddings(
    model_name="textembedding-gecko@latest",
    project="*******",
)

vectorstore = AlloyDBVectorStore.create_sync(
    engine,
    table_name="*******",
    embedding_service=embeddings
)

query = "What is RAG?"
query_vector = embeddings.embed_query(query)
docs = vectorstore.similarity_search_by_vector(query_vector, k=3)
print(docs)
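
As background on the two messages in the traceback earlier in this thread, here is a minimal sketch that uses only the standard `asyncio` module and does not touch AlloyDB. It illustrates the general failure modes only and makes no claim about how `AlloyDBEngine` is implemented internally; `make_engine`, `call_without_await`, and `await_future_from_other_loop` are hypothetical names introduced for this example.

```python
import asyncio


async def make_engine() -> str:
    # Hypothetical stand-in for an async factory such as afrom_instance.
    return "engine"


def call_without_await() -> bool:
    # Calling an async function without `await` returns a coroutine object,
    # not the result. This is what triggers "coroutine ... was never awaited".
    coro = make_engine()
    is_coroutine = asyncio.iscoroutine(coro)
    coro.close()  # close it explicitly so this demo does not emit the warning
    return is_coroutine


def await_future_from_other_loop() -> str:
    # A future created on one event loop cannot be awaited by a task
    # that is running on a different event loop.
    loop_a = asyncio.new_event_loop()
    loop_b = asyncio.new_event_loop()
    try:
        future = loop_a.create_future()  # bound to loop_a

        async def waiter():
            return await future

        try:
            loop_b.run_until_complete(waiter())  # task runs on loop_b
        except RuntimeError as exc:
            return str(exc)
        return "no error"
    finally:
        loop_a.close()
        loop_b.close()


if __name__ == "__main__":
    print(call_without_await())
    print(await_future_from_other_loop())
```

Keeping every call on one interface, as in either variant above, avoids both situations, because all awaitables are then created and awaited on the same loop.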

Let me know if this solves the issue for you 😄

jaybfn commented 4 weeks ago

Hello @jackwotherspoon, thank you so much, it's working now.

jackwotherspoon commented 4 weeks ago

Awesome to hear! Glad you were able to get it working 😄

I will close this issue now. Feel free to open another if you run into further issues; we always appreciate feedback.