googleapis / python-aiplatform

A Python SDK for Vertex AI, a fully managed, end-to-end platform for data science and machine learning.
Apache License 2.0

Responses are stopped with finishReason: BLOCKLIST way more often than expected #3622

Open sebastiancarlsson opened 4 months ago

sebastiancarlsson commented 4 months ago

As far as I know the BLOCKLIST finish reason is a fairly new feature and I've been experiencing a lot of stopped responses with this reason lately.

Unfortunately I do not have specific steps to reproduce this (as it seems a bit random) but it started happening when I upgraded to gemini-1.0-pro-002 and version 1.47.0 of the SDK.

The prompt I'm using is quite narrowly defined and has not produced any content that is even remotely offensive thus far (with gemini-1.0-pro-001). I found that it happens less with lower temperature settings (<0.2) which I suppose makes sense, but that's not a solution.

I'm using the GenerativeModel.generate_content_async method to generate content, not sure if that is relevant.

Ark-kun commented 4 months ago

This is not an SDK issue since the SDK just relays the service response to the user.

However, we can forward the feedback to the model quality team. Can you please include the requests that are being blocked? (We understand that the reproduction rate is not 100%.)

drpebcak commented 4 months ago

I am also seeing this same issue with multiple different prompts. It seems to happen often, but if you retry the call over and over again, it does occasionally succeed. Here is one example:

GEMINI MESSAGES:  [role: "user"
parts {
  text: "\nYou are task oriented system.\nYou receive input from a user, process the input from the given instructions, and then output the result.\nYour objective is to provide consistent and correct results.\nYou do not need to explain the steps taken, only provide the result to the given instructions.\nYou are referred to as a tool.\nYou don\'t move to the next step until you have a result.\n\nGenerate an image of a a squirelle playing with acorn on a tree and return only the url of the image."
}
]

GEMINI TOOLS:  [function_declarations {
  name: "image-generation"
  description: "Generates images based on the specified parameters and returns a list of URLs to the generated images."
  parameters {
    type_: OBJECT
    properties {
      key: "size"
      value {
        type_: STRING
        description: "(optional) The size of the image to generate, format WxH (e.g. 1024x1024). Defaults to 1024x1024."
      }
    }
    properties {
      key: "quality"
      value {
        type_: STRING
        description: "(optional) The quality of the generated image. Allowed values are \"standard\" or \"hd\". Default is \"standard\"."
      }
    }
    properties {
      key: "prompt"
      value {
        type_: STRING
        description: "(required) The text prompt based on which the GPT model will generate a response"
      }
    }
    properties {
      key: "number"
      value {
        type_: STRING
        description: "(optional) The number of images to generate. Defaults to 1."
      }
    }
  }
}
]

The response from gemini comes back like this:

 candidates {
  finish_reason: BLOCKLIST
  safety_ratings {
    category: HARM_CATEGORY_HATE_SPEECH
    probability: NEGLIGIBLE
    probability_score: 0.108566426
    severity: HARM_SEVERITY_NEGLIGIBLE
    severity_score: 0.145357817
  }
  safety_ratings {
    category: HARM_CATEGORY_DANGEROUS_CONTENT
    probability: NEGLIGIBLE
    probability_score: 0.303318918
    severity: HARM_SEVERITY_LOW
    severity_score: 0.276983798
  }
  safety_ratings {
    category: HARM_CATEGORY_HARASSMENT
    probability: NEGLIGIBLE
    probability_score: 0.173988834
    severity: HARM_SEVERITY_NEGLIGIBLE
    severity_score: 0.160924256
  }
  safety_ratings {
    category: HARM_CATEGORY_SEXUALLY_EXPLICIT
    probability: NEGLIGIBLE
    probability_score: 0.176529601
    severity: HARM_SEVERITY_LOW
    severity_score: 0.204818651
  }
}
usage_metadata {
  prompt_token_count: 234
  total_token_count: 234
}

sebastiancarlsson commented 4 months ago

@Ark-kun thank you! I understand it's not the SDK but I'm not sure where else to turn. If there's a better place for me to post bugs like this please link me and I'll make a report.

Edit: And to answer your question, I get the same error as posted above. I can't post our prompt here for confidentiality reasons, but I too am using a prompt combined with a single function declaration, so maybe that has something to do with it.

BjornWest commented 4 months ago

I have also experienced this issue recently, specifically from tool use with gemini-1.0-pro. I've tested the sync and async versions extensively and see about the same error rate with both (about 50%). One thing I have noticed is that when I rerun failed attempts, they often fail 6-7 times before succeeding. This might indicate that certain prompts are more prone to the issue, but they do eventually succeed.

I also came across quite a few "error 500: internal server error" during my testing which I can't recall ever seeing in other interactions with vertex. Interestingly, preview gemini-1.5 seems to not have this issue at all, at least from my testing.

Another thing I noticed is that if I set the max input tokens to an amount insufficient to make the function call, it will also return "finish_reason: BLOCKLIST" (and will do so no matter how often you rerun it), which makes it seem like it's the standard message displayed when the model is attempting to make a function call but is unable to.
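The retry behaviour described above (a blocked call often succeeds after several attempts) can be wrapped in a small helper. This is only a sketch under stated assumptions: `is_blocked` and `call_with_retry` are hypothetical names, `generate` stands for any zero-argument callable that performs the request, and the response is assumed to be a plain dict whose candidates carry a `finish_reason` string such as "BLOCKLIST". It does not help in the deterministic case where the token limit is too small.

```python
import time
from typing import Any, Callable, Dict


def is_blocked(response: Dict[str, Any]) -> bool:
    """Return True if any candidate finished with BLOCKLIST."""
    candidates = response.get("candidates") or []
    return any(c.get("finish_reason") == "BLOCKLIST" for c in candidates)


def call_with_retry(
    generate: Callable[[], Dict[str, Any]],
    max_attempts: int = 8,
    delay_seconds: float = 0.0,
) -> Dict[str, Any]:
    """Call `generate` until the response is not blocked or attempts run out.

    The last response is returned either way, so the caller can inspect it.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    response: Dict[str, Any] = {}
    for attempt in range(1, max_attempts + 1):
        response = generate()
        if not is_blocked(response):
            break
        # Optional pause between attempts; skipped after the final one.
        if attempt < max_attempts and delay_seconds > 0:
            time.sleep(delay_seconds)
    return response
```

The caller still needs to check `is_blocked` on the returned response, since every attempt may have been blocked.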

michalspiegel commented 4 months ago

Hi, same issue here. My prompt also has a function declaration, which might be connected. I'll post it here in case it helps (or point me to a better place to post bugs):

code used in generation:

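import vertexai
from vertexai.generative_models import GenerationConfig, GenerativeModel

vertexai.init(project=XXXXX)  # project ID redacted in the original post
config = GenerationConfig(
    temperature=0.1,
    top_p=0.8,
    top_k=32,
    candidate_count=1,
    max_output_tokens=2048,
)
model = GenerativeModel("gemini-pro")
# `prompt` holds the text shown under "The prompt:" below
response = model.generate_content(prompt, generation_config=config)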

The prompt:

<html> <body| <div| <img[A]| Magento Admin Panel > <ul| <span[B]| Dashboard > <li| <span[C]| Sales > <a| Braintree Virtual Terminal > > <span[D]| Catalog > <li| <span[E]| Customers > <a| Login as Customer Log > > <span[F]| Marketing > <span[G]| Content > <li| <span[H]| Reports > <strong| Business Intelligence > > <span[I]| Stores > <li| <span[J]| System > <a| Manage Encryption Key > > <span[K]| Find Partners & Extensions > > > <div| <div| <button| <span[L]| System Messages > <text| : > <text| 1 > > <div| <text| Failed to synchronize data to the Magento Business Intelligence service. > <a[M]| Retry Synchronization > > > <header| <h1| Dashboard > <div| <a| My Account <span[N]| admin > > <a[O]| Notifications > <label[P]| > > > <main| <div| <div| <span| Scope: > <button[Q]| All Store Views > <a[R]| What is this? What is this? > > <button[S]| Reload Data Reload Data > > <div| <section| <div| <header| Advanced Reporting > <div| Gain new insights and take command of your business' performance, using our dynamic product, order, and customer reports tailored to your customer data. 
> > <a| Go to Advanced Reporting <span[T]| Go to Advanced Reporting > > > <div| <div| <div| <ul| <a| Orders <span[U]| Orders > > <a| Amounts <span[V]| Amounts > > > <div| <label| Select Range: > <select[W]| <option| today > <option| 24h > <option| 7d > <option| 1m > <option| 1y > <option| 2y > > > <div| No Data Found > > <ul| <li| <span| Revenue > <span| $0.00 > > <li| <span| Tax > <span| $0.00 > > <li| <span| Shipping > <span| $0.00 > > <li| <span| Quantity > <strong| 0 > > > <ul| <li[X]| <a| Bestsellers Bestsellers > > <a[Y]| Most Viewed Products Most Viewed Products > <a[Z]| New Customers New Customers > <a[AA]| Customers Customers > > > <div| <div| <div| Lifetime Sales > <span| $0.00 > > <div| <div| Average Order > <span| $0.00 > > <div| <div| Last Orders > <table| <th| Customer > <th| Items > <th| Total > <tr| http://ec2-3-131-244-37.us-east-2.compute.amazonaws.com:7780/admin/sales/order/view/order_id/312/ <td| Div garg > <td| 1 > <td| $10.77 > > <tr| http://ec2-3-131-244-37.us-east-2.compute.amazonaws.com:7780/admin/sales/order/view/order_id/311/ <td| Div garg > <td| 1 > <td| $0.00 > > > > > > > > > > </html>

You are a helpful assistant that can assist with web navigation tasks.
You are given a simplified html webpage and a task description.
Your goal is to complete the task. You can use the provided functions below to interact with the current webpage.

#Provided functions:
def click(element_id: str) -> None:
    """
    Click on the element with the specified id.

    Args:
       element_id: The id of the element.
    """

def hover(element_id: str) -> None:
    """
    Hover on the element with the specified id.

    Args:
       element_id: The id of the element.
    """

def select(element_id: str, option: str) -> None:
 """
    Select an option from a dropdown.

    Args:
       element_id: The id of the element.
       option: Value of the option to select.
 """

def type_string(element_id: str, content: str, press_enter: bool) -> None:
 """
    Type a string into the element with the specified id.

    Args:
       element_id: The id of the element.
       content: The string to type.
       press_enter: Whether to press enter after typing the string.
 """

def scroll_page(direction: Literal['up', 'down']) -> None:
 """
    Scroll down/up one page.

    Args:
       direction: The direction to scroll.
 """

def go(direction: Literal['forward', 'backward']) -> None:
 """
    Go forward/backward

    Args:
       direction: The direction to go to.
 """

def jump_to(url: str, new_tab: bool) -> None:
 """
    Jump to the specified url.

    Args:
       url: The url to jump to.
       new_tab: Whether to open the url in a new tab.
 """

def switch_tab(tab_index: int) -> None:
 """
    Switch to the specified tab.

    Args:
       tab_index: The index of the tab to switch to.
 """

def user_input(message: str) -> str:
 """
    Wait for user input.

    Args:
       message: The message to display to the user.

    Returns: The user input.
 """

def finish(answer: Optional[str]) -> None:
 """
    Finish the task (optionally with an answer).

    Args:
       answer: The answer to the task.
 """

#Previous commands: None

#Window tabs: 1. Dashboard / Magento Admin <-- current tab

#Current viewport (pages): 0.0 / 2.2

#Task: What is the total count of Pending reviews amongst all the reviews?

You should output one command to interact to the currrent webpage.
You should add a brief comment to your command to explain your reasoning and thinking process.

The response:

candidates {
  finish_reason: BLOCKLIST
  safety_ratings {
    category: HARM_CATEGORY_HATE_SPEECH
    probability: NEGLIGIBLE
    probability_score: 0.12710987
    severity: HARM_SEVERITY_NEGLIGIBLE
    severity_score: 0.111821815
  }
  safety_ratings {
    category: HARM_CATEGORY_DANGEROUS_CONTENT
    probability: NEGLIGIBLE
    probability_score: 0.0421665385
    severity: HARM_SEVERITY_NEGLIGIBLE
    severity_score: 0.0503306314
  }
  safety_ratings {
    category: HARM_CATEGORY_HARASSMENT
    probability: NEGLIGIBLE
    probability_score: 0.103205055
    severity: HARM_SEVERITY_NEGLIGIBLE
    severity_score: 0.0671785772
  }
  safety_ratings {
    category: HARM_CATEGORY_SEXUALLY_EXPLICIT
    probability: NEGLIGIBLE
    probability_score: 0.0491307862
    severity: HARM_SEVERITY_NEGLIGIBLE
    severity_score: 0.107250325
  }
}
usage_metadata {
  prompt_token_count: 1467
  total_token_count: 1467
}

EDIT: same thing happens for gemini-1.5-pro-preview-0409

RoyLeviLangware commented 4 months ago

I am also experiencing unreasonable amounts of BLOCKLIST triggers. I am using gemini-1.5-pro-preview-0409. My input is mostly code, but here is the response:

candidates {
  finish_reason: BLOCKLIST
  safety_ratings {
    category: HARM_CATEGORY_HATE_SPEECH
    probability: NEGLIGIBLE
    probability_score: 0.0341648199
    severity: HARM_SEVERITY_NEGLIGIBLE
    severity_score: 0.0406167768
  }
  safety_ratings {
    category: HARM_CATEGORY_DANGEROUS_CONTENT
    probability: NEGLIGIBLE
    probability_score: 0.0505176708
    severity: HARM_SEVERITY_NEGLIGIBLE
    severity_score: 0.1538032
  }
  safety_ratings {
    category: HARM_CATEGORY_HARASSMENT
    probability: NEGLIGIBLE
    probability_score: 0.0624452867
    severity: HARM_SEVERITY_NEGLIGIBLE
    severity_score: 0.0730323866
  }
  safety_ratings {
    category: HARM_CATEGORY_SEXUALLY_EXPLICIT
    probability: NEGLIGIBLE
    probability_score: 0.0513675064
    severity: HARM_SEVERITY_NEGLIGIBLE
    severity_score: 0.0391216464
  }
}
usage_metadata {
  prompt_token_count: 71149
  candidates_token_count: 2
  total_token_count: 71151
}

LMK if anyone found a workaround :)

RoyLeviLangware commented 4 months ago

Another pretty harmless example, blocked by gemini-experimental but not by gemini-1.5-pro-preview-0409:

hiiiiii
tell me something longngggg

Seems like there is potential for this problem to escalate in the future

AsaoTouma commented 4 months ago

I would appreciate it if you could tell me what the final cause of the problem was.

RoyLeviLangware commented 4 months ago

Still no solution on my side; I'm hoping a model update will fix the issue.

Stono commented 4 months ago

also experiencing this

RoyLeviLangware commented 4 months ago

My understanding is that a server-side fix is on the way. I'll update if I hear anything.

satellite-xyz commented 3 months ago

There were a few times where my prompt included "OpenAI" and this issue occurred. When replacing "OpenAI" with "GlobeAI," then the exact same prompt worked.
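The substitution described above could be applied automatically before a prompt is sent. The sketch below is an illustration only: `replace_terms` and `restore_terms` are hypothetical helpers, and the "OpenAI" to "GlobeAI" mapping simply mirrors the observation in this comment. Nothing here confirms which terms, if any, actually trigger the block.

```python
import re
from typing import Dict


def replace_terms(text: str, replacements: Dict[str, str]) -> str:
    """Replace whole-word occurrences of each key with its value."""
    for term, substitute in replacements.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        # A function replacement avoids backslash handling in the substitute.
        text = re.sub(pattern, lambda _match, s=substitute: s, text)
    return text


def restore_terms(text: str, replacements: Dict[str, str]) -> str:
    """Map the substitutes in a model answer back to the original terms."""
    reverse = {substitute: term for term, substitute in replacements.items()}
    return replace_terms(text, reverse)


# Hypothetical mapping taken from the observation above.
REPLACEMENTS = {"OpenAI": "GlobeAI"}
safe_prompt = replace_terms("Summarise the OpenAI announcement.", REPLACEMENTS)
```

If the model's answer mentions the placeholder, `restore_terms` can map it back before the text is shown to a user.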

oscarYCL commented 3 months ago

I am using gemini-1.0-pro-001/gemini-1.0-pro-002 to generate the output. The success rate is around 50%. When a call fails, the error is "The candidate is likely blocked by the safety filters". Strangely, I hadn't encountered this issue in the past two weeks; it only appeared today. The SDK version is "google-cloud-aiplatform==1.50.0".

python code:

import base64
import vertexai
from vertexai.generative_models import GenerativeModel, Part, FinishReason
import vertexai.preview.generative_models as generative_models

def generate(question):
    vertexai.init(project="xxxxxxxxxx", location="northamerica-northeast1")
    model = GenerativeModel("gemini-1.0-pro-002")
    generation_config = {
        "max_output_tokens": 2048,
        "temperature": 0.3,
        "top_p": 0.9,
        "top_k": 40,
    }
    safety_settings = {
        generative_models.HarmCategory.HARM_CATEGORY_HATE_SPEECH: generative_models.HarmBlockThreshold.BLOCK_NONE,
        generative_models.HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: generative_models.HarmBlockThreshold.BLOCK_NONE,
        generative_models.HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: generative_models.HarmBlockThreshold.BLOCK_NONE,
        generative_models.HarmCategory.HARM_CATEGORY_HARASSMENT: generative_models.HarmBlockThreshold.BLOCK_NONE,
    }
    responses = model.generate_content(
        [question],
        generation_config=generation_config,
        safety_settings=safety_settings,
        stream=True,
    )
    for response in responses:
        print(response.text, end="")

with open("question.txt", "r") as file:
    question = file.read()

generate(question)

ValueError: Cannot get the response text.
Cannot get the Candidate text.
Response candidate content has no parts (and thus no text). The candidate is likely blocked by the safety filters.
Content:
{}
Candidate:
{
  "finish_reason": "BLOCKLIST",
  "safety_ratings": [
    {
      "category": "HARM_CATEGORY_HATE_SPEECH",
      "probability": "NEGLIGIBLE",
      "probability_score": 0.17539679,
      "severity": "HARM_SEVERITY_NEGLIGIBLE",
      "severity_score": 0.13251455
    },
    {
      "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
      "probability": "NEGLIGIBLE",
      "probability_score": 0.4181622,
      "severity": "HARM_SEVERITY_LOW",
      "severity_score": 0.289673
    },
    {
      "category": "HARM_CATEGORY_HARASSMENT",
      "probability": "NEGLIGIBLE",
      "probability_score": 0.18952107,
      "severity": "HARM_SEVERITY_NEGLIGIBLE",
      "severity_score": 0.10302442
    },
    {
      "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
      "probability": "NEGLIGIBLE",
      "probability_score": 0.11338311,
      "severity": "HARM_SEVERITY_NEGLIGIBLE",
      "severity_score": 0.100878626
    }
  ]
}

When the job fails, it can still generate responses, but it gets blocked after the responses have been generated.
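The ValueError above is raised because the text of a candidate without parts is accessed. One way to avoid the exception is to check for parts first. The sketch below assumes the candidate has already been converted to a plain dict shaped like the "Candidate:" dump above, with text, when present, under `content.parts[].text`. `candidate_text` and `describe_candidate` are hypothetical helper names, not SDK functions.

```python
from typing import Any, Dict, Optional


def candidate_text(candidate: Dict[str, Any]) -> Optional[str]:
    """Return the concatenated text parts, or None if there are none."""
    content = candidate.get("content") or {}
    parts = content.get("parts") or []
    texts = [part["text"] for part in parts if "text" in part]
    return "".join(texts) if texts else None


def describe_candidate(candidate: Dict[str, Any]) -> str:
    """Return the text if present, otherwise a note with the finish reason."""
    text = candidate_text(candidate)
    if text is not None:
        return text
    reason = candidate.get("finish_reason", "UNKNOWN")
    return f"[no text; finish_reason={reason}]"
```

With streaming, this lets a loop print the chunks that did arrive and report the finish reason of the blocked chunk instead of raising.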

YashMishra1234 commented 3 months ago

Facing similar issue. Is there any update on this?

RoyLeviLangware commented 3 months ago

Just received an update that a fix has been successfully rolled out. I recommend checking again.

YashMishra1234 commented 3 months ago

I tried again. The issue still persists.

benjaminye commented 3 months ago

I'm getting the same issue this morning with gemini-1.5-pro-preview-0514. Switched endpoint to gemini-1.5-flash-preview-0514 and it ran fine. Production system has been running for a month and this is the first time it has happened.

Output:

candidates {
  finish_reason: BLOCKLIST
  safety_ratings {
    category: HARM_CATEGORY_HATE_SPEECH
    probability: NEGLIGIBLE
    probability_score: 0.106318869
    severity: HARM_SEVERITY_LOW
    severity_score: 0.237045661
  }
  safety_ratings {
    category: HARM_CATEGORY_DANGEROUS_CONTENT
    probability: NEGLIGIBLE
    probability_score: 0.0880331174
    severity: HARM_SEVERITY_NEGLIGIBLE
    severity_score: 0.155847773
  }
  safety_ratings {
    category: HARM_CATEGORY_HARASSMENT
    probability: NEGLIGIBLE
    probability_score: 0.164650738
    severity: HARM_SEVERITY_NEGLIGIBLE
    severity_score: 0.174129218
  }
  safety_ratings {
    category: HARM_CATEGORY_SEXUALLY_EXPLICIT
    probability: NEGLIGIBLE
    probability_score: 0.174129218
    severity: HARM_SEVERITY_LOW
    severity_score: 0.247623369
  }
}
usage_metadata {
  prompt_token_count: 28281
  total_token_count: 28281
}
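The manual switch to another model described above could be expressed as a fallback chain. This is a sketch, not a recommendation: `generate_with_fallback` is a hypothetical helper, `generate_for_model` stands for any callable that sends the same request to the named model and returns a dict-like response, and there is no guarantee that another model avoids the block.

```python
from typing import Any, Callable, Dict, Sequence, Tuple


def generate_with_fallback(
    generate_for_model: Callable[[str], Dict[str, Any]],
    model_names: Sequence[str],
) -> Tuple[str, Dict[str, Any]]:
    """Try each model in order and return the first unblocked response.

    If every model is blocked, the last model name and response are returned.
    """
    if not model_names:
        raise ValueError("model_names must not be empty")
    name, response = "", {}
    for name in model_names:
        response = generate_for_model(name)
        candidates = response.get("candidates") or []
        blocked = any(c.get("finish_reason") == "BLOCKLIST" for c in candidates)
        # Accept only a response that has candidates and none of them blocked.
        if candidates and not blocked:
            break
    return name, response
```

For the case in this comment, the list would be `["gemini-1.5-pro-preview-0514", "gemini-1.5-flash-preview-0514"]`.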

stijntratsaertit commented 3 months ago

I've had the same issue. I can't disclose the prompting/data but it was about an organization and didn't contain any harmful or related words/sentences. Using SDK 1.51.0 & gemini-1.0-pro-001