google-gemini / generative-ai-python

The official Python library for the Google Gemini API
https://pypi.org/project/google-generativeai/
Apache License 2.0

Gemini API returns 500 with special characters #575

Open johnnyheineken opened 7 hours ago

johnnyheineken commented 7 hours ago

Description of the bug:

This exact code:

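import google.generativeai as genai

# Assumes the API key has already been set, e.g. via genai.configure(api_key=...)
text = "čku“ s hodnotami ANO a NE   + „"
model = genai.GenerativeModel("gemini-1.5-flash")
result = model.generate_content(
    f"Can you summarize this text?\n{text}"
)
print(f"{result.text=}")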

returns

InternalServerError: 500 An internal error has occurred. Please retry or report in https://developers.generativeai.google/guide/troubleshooting

Other malfunctioning strings:

'muláře \t- chybně vyplněný formulář \t- ji'

The problem seems to be with

U+0009 : <control> CHARACTER TABULATION [TAB] {horizontal tabulation (HT); tab}
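
To check whether a given input contains a tab (or any other control character) before sending it, something like the following can be used. This is only a sketch: the helper names `find_control_chars` and `replace_tabs` are made up for illustration and are not part of google-generativeai, and replacing tabs with spaces is a client-side workaround that assumes the tab really is the trigger.

```python
import unicodedata


def find_control_chars(text):
    """Return (index, code point) pairs for every control character (category Cc)."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Cc"
    ]


def replace_tabs(text, replacement=" "):
    """Replace every U+0009 with `replacement` (hypothetical workaround)."""
    return text.replace("\t", replacement)
```

Calling `find_control_chars` on one of the failing strings should list the position of each U+0009, and `replace_tabs(text)` can be applied before building the prompt.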

Actual vs expected behavior:

Actual: Returns 500

Expected: Text containing Czech letters, Czech punctuation, or tab characters should not cause a 500; Gemini should summarise the text as requested.

Any other information you'd like to share?

Windows 10

sys.version_info(major=3, minor=9, micro=13, releaselevel='final', serial=0)
google-ai-generativelanguage             0.6.10
google-api-core                          2.17.1
google-api-python-client                 2.119.0
google-auth                              2.25.2
google-auth-httplib2                     0.2.0
google-cloud-aiplatform                  1.49.0
google-cloud-bigquery                    3.17.2
google-cloud-core                        2.4.1
google-cloud-resource-manager            1.12.2
google-cloud-storage                     2.14.0
google-crc32c                            1.5.0
google-generativeai                      0.8.2
google-resumable-media                   2.7.0
google-search-results                    2.4.2
googleapis-common-protos                 1.62.0
grpc-google-iam-v1                       0.13.0
langchain-google-genai                   0.0.6

Hamza-nabil commented 6 hours ago

It does work with gemini-1.5-flash-002, the latest stable version of Gemini 1.5 Flash (see Model versions and lifecycle).

Code:

import google.generativeai as genai

text = "čku“ s hodnotami ANO a NE   + „"
model = genai.GenerativeModel("gemini-1.5-flash-002")
result = model.generate_content(
    f"Can you summarize this text?\n{text}"
)
print(result.text)

Output:

The text "čku“ s hodnotami ANO a NE" translates from Czech to English as  "check" with values YES and NO.  It describes a binary check or a boolean variable.

johnnyheineken commented 5 hours ago

But according to the docs you linked, I should already be using 002:

[image]

However, I can confirm that using "gemini-1.5-flash-002" works.

I also pinpointed the issue to U+0009 : <control> CHARACTER TABULATION [TAB] {horizontal tabulation (HT); tab}
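
For anyone who wants to narrow down a failing input in a similar way, here is a rough sketch of one approach (not necessarily how it was found here): drop each distinct character in turn and see whether the request stops failing. The function name `find_offending_chars` and the `fails` callback are hypothetical; `fails` is stubbed out below so the snippet runs without calling the API.

```python
def find_offending_chars(text, fails):
    """Return the distinct characters whose removal makes `fails(text)` return False.

    `fails` is any callable that takes a string and returns True when the
    request for that string errors out.
    """
    culprits = []
    for ch in sorted(set(text)):
        # Remove every occurrence of this character and re-test.
        if not fails(text.replace(ch, "")):
            culprits.append(ch)
    return culprits


# Stub standing in for a real API call: pretend any tab triggers the error.
def fake_fails(text):
    return "\t" in text


print(find_offending_chars("a \t- b \t- c", fake_fails))
```

In real use, `fails` would wrap `model.generate_content(...)` and return True when it raises the InternalServerError shown above. This costs one request per distinct character, so it is only practical for short inputs.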