GoogleCloudPlatform / generative-ai

Sample code and notebooks for Generative AI on Google Cloud, with Gemini on Vertex AI
https://cloud.google.com/vertex-ai/docs/generative-ai/learn/overview
Apache License 2.0

Getting TypeError when accessing `language/use-cases/document-qa/question_answering_documentai_vector_store_palm.ipynb` #313

Open SampathkumarSubramaniam opened 9 months ago

SampathkumarSubramaniam commented 9 months ago

Hello, I am getting the error below when running https://github.com/GoogleCloudPlatform/generative-ai/blob/main/language/use-cases/document-qa/question_answering_documentai_vector_store_palm.ipynb

Exact code:

@retry(wait=wait_random_exponential(min=10, max=120), stop=stop_after_attempt(5))
def embedding_model_with_backoff(text=[]):
    embeddings = embedding_model.get_embeddings(text)
    return [each.values for each in embeddings][0]

Error log:

  File "/lib/python3.11/site-packages/vertexai/language_models/_language_models.py", line 1724, in _prepare_text_embedding_request
    raise TypeError(f"Unsupported text embedding input type: {text}.")
TypeError: Unsupported text embedding input type: nan.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):

    pdf_data_sample["embedding"] = pdf_data_sample["chunks"].apply(
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Could you help me fix this?

holtskinner commented 8 months ago

@SampathkumarSubramaniam Can you provide the input document that caused the issue (with any PII redacted)? The `nan` in the error message suggests that the text sent to the embeddings model was empty or missing.
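
One way to check for this before calling the model is to validate each value up front. The sketch below is only an illustration and is not part of the notebook: `is_embeddable` is a hypothetical helper that rejects `None`, a float `nan` (which is how pandas usually represents a missing value in a text column), and blank strings.

```python
import math


def is_embeddable(text) -> bool:
    """Return True only for non-blank strings (hypothetical helper)."""
    if text is None:
        return False
    # A missing value in a pandas text column typically shows up as float('nan'),
    # which matches the "nan" printed in the TypeError.
    if isinstance(text, float) and math.isnan(text):
        return False
    return isinstance(text, str) and text.strip() != ""


samples = ["Some chunk of text", "", "   ", None, float("nan")]
flags = [is_embeddable(s) for s in samples]
print(flags)
```

Any chunk for which this returns `False` would be worth inspecting in the source document before it is passed to `get_embeddings`.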

jwchoiKR commented 8 months ago

pdf_data_sample = pdf_data_sample.dropna(subset=['chunks'])
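
For context, here is a minimal sketch of where that line would sit, assuming the notebook's `pdf_data_sample` DataFrame with a `chunks` column. `fake_embed` is a hypothetical stand-in for `embedding_model_with_backoff` so the example runs without Vertex AI credentials, and the extra blank-string filter is an assumption that goes beyond the one-line fix.

```python
import numpy as np
import pandas as pd


def fake_embed(text: str) -> list:
    # Hypothetical stand-in for embedding_model_with_backoff([text]);
    # returns a tiny fake vector instead of calling Vertex AI.
    return [float(len(text))]


# Toy data: one chunk is missing (NaN) and one is whitespace only.
pdf_data_sample = pd.DataFrame(
    {"chunks": ["first chunk", np.nan, "   ", "last chunk"]}
)

# Drop rows whose chunk is missing, as suggested in the comment above.
pdf_data_sample = pdf_data_sample.dropna(subset=["chunks"])

# Assumption: also drop whitespace-only chunks, which contain no text to embed.
# .copy() avoids assigning a new column to a filtered view of the frame.
not_blank = pdf_data_sample["chunks"].str.strip() != ""
pdf_data_sample = pdf_data_sample[not_blank].copy()

# With the bad rows removed, apply() only ever sees real strings.
pdf_data_sample["embedding"] = pdf_data_sample["chunks"].apply(fake_embed)
print(pdf_data_sample)
```

In the real notebook the `apply` call would use `embedding_model_with_backoff` rather than the stand-in; the filtering steps stay the same.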