GoogleCloudPlatform / Open_Data_QnA

The Open Data QnA python library enables you to chat with your databases by leveraging LLM Agents on Google Cloud. Open Data QnA enables a conversational approach to interacting with your data by implementing state-of-the-art NL2SQL / Text2SQL methods.
Apache License 2.0

HarmBlockThreshold.BLOCK_NONE breaks embeddings creation #32

Open rafi-rr opened 2 months ago

rafi-rr commented 2 months ago

When running python env_setup.py (either v1 or v2), the flow stops due to the following error:

google.api_core.exceptions.InvalidArgument: 400 User has requested a restricted HarmBlockThreshold setting BLOCK_NONE. You can get access either (a) through an allowlist via your Google account team, or (b) by switching your account type to monthly invoiced billing via this instruction: https://cloud.google.com/billing/docs/how-to/invoiced-billing.

This can be worked around by setting HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE, but the library should either avoid BLOCK_NONE by default or allow the threshold to be configured.
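To illustrate what "allow it to be configured" could look like, here is a minimal sketch. It is not the library's actual code: the environment variable name HARM_BLOCK_THRESHOLD and both helper functions are hypothetical, and the default of BLOCK_MEDIUM_AND_ABOVE is simply the value that worked for me. Only the enum member names come from the Vertex AI SDK.

```python
import os

# Member names of vertexai.generative_models.HarmBlockThreshold.
ALLOWED_THRESHOLDS = (
    "BLOCK_LOW_AND_ABOVE",
    "BLOCK_MEDIUM_AND_ABOVE",
    "BLOCK_ONLY_HIGH",
    "BLOCK_NONE",  # restricted: needs allowlisting or invoiced billing
)

# Member names of vertexai.generative_models.HarmCategory.
HARM_CATEGORIES = (
    "HARM_CATEGORY_HATE_SPEECH",
    "HARM_CATEGORY_DANGEROUS_CONTENT",
    "HARM_CATEGORY_HARASSMENT",
    "HARM_CATEGORY_SEXUALLY_EXPLICIT",
)


def resolve_threshold_name(env=None, default="BLOCK_MEDIUM_AND_ABOVE"):
    """Read the threshold name from a (hypothetical) env var and validate it."""
    env = os.environ if env is None else env
    name = env.get("HARM_BLOCK_THRESHOLD", default).strip().upper()
    if name not in ALLOWED_THRESHOLDS:
        raise ValueError(
            f"Unknown threshold {name!r}; expected one of {ALLOWED_THRESHOLDS}"
        )
    return name


def build_safety_settings(threshold_name):
    """Map the validated name onto SDK enums (vertexai is imported lazily)."""
    from vertexai.generative_models import HarmBlockThreshold, HarmCategory

    threshold = getattr(HarmBlockThreshold, threshold_name)
    return {getattr(HarmCategory, c): threshold for c in HARM_CATEGORIES}
```

The dictionary returned by build_safety_settings(resolve_threshold_name()) could then be passed as safety_settings when constructing the GenerativeModel, so projects without BLOCK_NONE access would not fail with the 400 error unless they opt in explicitly.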

Thanks.

rafi-rr commented 2 months ago

Adding an example for such an exception during the embeddings creation flow:

"File \"/.../lib/python3.11/site-packages/vertexai/generative_models/_generative_models.py\", line 1977, in text",
"    1974  @property",
"    1975  def text(self) -> str:",
"    1976      try:",
"--> 1977          return self.content.text",
"    1978      except (ValueError, AttributeError) as e:",
"    ..................................................",
"     self = finish_reason: SAFETY",
"            safety_ratings {",
"              category: HARM_CATEGORY_HATE_SPEECH",
"              probability: NEGLIGIBLE",
"              probability_score: 0.124023438",
"              severity: HARM_SEVERITY_NEGLIGIBLE",
"              severity_score: 0.15234375",
"            }",
"            safety_ratings {",
"              category: HARM_CATEGORY_DANGEROUS_CONTENT",
"              probability: HIGH",
"              blocked: true",
"              probability_score: 0.87890625",
"              severity: HARM_SEVERITY_LOW",
"              severity_score: 0.3046875",
"            }",
"            safety_ratings {",
"              category: HARM_CATEGORY_HARASSMENT",
"              probability: NEGLIGIBLE",
"              probability_score: 0.1621...",
"     self.content.text = # ValueError",
"          self.content = ",

The content of the tables is far from being harmful or dangerous.
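For anyone hitting the same trace, a defensive accessor along these lines might at least report which category was blocked instead of crashing the whole embeddings run. This is only a sketch under assumptions: the function name is hypothetical, and it relies solely on the attributes visible in the trace above (text, safety_ratings, category, blocked), accessed via duck typing rather than any specific SDK class.

```python
def text_or_blocked_categories(candidate):
    """Return (text, []) on success, or (None, blocked_categories) if .text raises.

    `candidate` is any object exposing `.text` and, optionally,
    `.safety_ratings` whose items have `.category` and `.blocked`.
    """
    try:
        return candidate.text, []
    except (ValueError, AttributeError):
        ratings = getattr(candidate, "safety_ratings", None) or []
        blocked = [
            str(getattr(rating, "category", "UNKNOWN"))
            for rating in ratings
            if getattr(rating, "blocked", False)
        ]
        return None, blocked
```

With the response shown in the trace, this would return no text and a list containing only the dangerous-content category, which the caller could log before skipping or retrying that table.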

5Y5TEM commented 2 weeks ago

This should be resolved, as safety filters should be unblocked by default as of now. @rafi-rr can you verify?

rafi-rr commented 3 days ago

@5Y5TEM Still not resolved in v2.0.0. It happened again in a new project that doesn't have the security filters disabled. It fails during initialization with "InvalidArgument: 400 User has requested a restricted HarmBlockThreshold setting BLOCK_NONE."