Generating RAG responses with content containing emojis was failing with an error such as:
```text
UnicodeEncodeError: 'utf-8' codec can't encode characters in position XX-XX: surrogates not allowed
```
The bug is in the LiteLLM package.
Using Vertex AI directly via the litellm package within core_backend works fine with emojis.
The problem only appears when using the openai/ API interface through the LiteLLM proxy server.
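For context, the failure can be reproduced in plain Python. This is a minimal sketch of the mechanism only, not the proxy's actual code path: a string that still contains the two UTF-16 surrogate escapes (rather than the single code point U+1F9B6) cannot be encoded to UTF-8, which is what the protobuf layer eventually tries to do.

```python
# Minimal sketch of the failure mode (not the proxy's actual code path):
# an emoji left as an unpaired surrogate escape cannot be UTF-8 encoded.
ok = "It is normal to experience swollen feet\U0001F9B6"     # real code point U+1F9B6
bad = "It is normal to experience swollen feet\ud83e\uddb6"  # unpaired surrogates

ok.encode("utf-8")  # encodes fine
try:
    bad.encode("utf-8")
except UnicodeEncodeError as err:
    print(err)  # 'utf-8' codec can't encode characters ...: surrogates not allowed
```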
Initial debugging info
The request body sent to the Vertex AI SDK looked like:
```text
{'messages': [{'content': 'REFERENCE TEXT:\n0. How to handle swollen FEET\nIt is normal to experience swollen feet\ud83e\uddb6 and legs while pregnant.', 'role': 'system'}, {'content': 'My feet are swollen', 'role': 'user'}], 'model': 'generate-gemini-response', 'max_tokens': 1024, 'response_format': {'type': 'json_object'}, 'temperature': 0}
```
Then we turned on debugging for the LiteLLM proxy to see where the exception was thrown. It was being raised in the Vertex AI Python SDK:
```
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/litellm/llms/vertex_ai.py", line 964, in async_completion
    response = await llm_model._generate_content_async(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/vertexai/generative_models/_generative_models.py", line 524, in _generate_content_async
    request = self._prepare_request(
              ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/vertexai/generative_models/_generative_models.py", line 274, in _prepare_request
    contents = [
               ^
  File "/usr/local/lib/python3.11/site-packages/vertexai/generative_models/_generative_models.py", line 275, in <listcomp>
    gapic_content_types.Content(content_dict) for content_dict in contents
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/proto/message.py", line 609, in __init__
    pb_value = marshal.to_proto(pb_type, value)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/proto/marshal/marshal.py", line 211, in to_proto
    return type(value)(self.to_proto(proto_type, i) for i in value)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/proto/marshal/marshal.py", line 211, in <genexpr>
    return type(value)(self.to_proto(proto_type, i) for i in value)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/proto/marshal/marshal.py", line 228, in to_proto
    pb_value = self.get_rule(proto_type=proto_type).to_proto(value)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/proto/marshal/rules/message.py", line 36, in to_proto
    return self._descriptor(**value)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'utf-8' codec can't encode characters in position 1121-1122: surrogates not allowed
```
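The positions in the error point at content that still contains unpaired UTF-16 surrogates by the time it reaches the protobuf marshaller. As a hedged aside (not something this PR does), such strings can be recombined into real code points before encoding, which is roughly the kind of fix one might propose upstream:

```python
def recombine_surrogates(text: str) -> str:
    # Round-trip through UTF-16 with the surrogatepass error handler so that
    # adjacent high/low surrogates collapse back into single code points.
    return text.encode("utf-16", "surrogatepass").decode("utf-16")

print(recombine_surrogates("swollen feet\ud83e\uddb6"))  # swollen feet🦶
```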
Goal
Allow emojis in content.
Changes
Since the problem lies in how the LiteLLM proxy handles request data, we are implementing a temporary patch where core_backend calls the Vertex AI endpoint directly via the litellm package.
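Roughly, the direct path looks like the sketch below. The model id, project, and location are illustrative placeholders, not the values used in core_backend:

```python
import asyncio

import litellm


async def main() -> None:
    # Call Vertex AI directly through the litellm package instead of going
    # through the LiteLLM proxy's openai/ interface.
    response = await litellm.acompletion(
        model="vertex_ai/gemini-1.5-pro",  # illustrative model id
        messages=[
            {"role": "system", "content": "REFERENCE TEXT: ... swollen feet 🦶 ..."},
            {"role": "user", "content": "My feet are swollen"},
        ],
        vertex_project="my-gcp-project",   # placeholder; set via env in practice
        vertex_location="us-central1",     # placeholder
        max_tokens=1024,
        temperature=0,
    )
    print(response.choices[0].message.content)


asyncio.run(main())
```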
This PR changes:
- docker-compose.yml, so that core_backend can use Vertex AI via LiteLLM directly.
- deploy_gcp_core_backend.yml, so that the same changes are applied to the deployed app.
Future Tasks (optional)
[ ] Raise an issue on LiteLLM
How has this been tested?
Replaced the testing branch with this branch to test the deployment.
Checklist
Fill with x for completed.
[x] My code follows the style guidelines of this project
[x] I have reviewed my own code to ensure good quality
[x] I have tested the functionality of my code to ensure it works as intended
[x] I have resolved merge conflicts
(Delete any items below that are not relevant)
[x] I have updated the requirements
[x] I have updated the CI/CD scripts in .github/workflows/
Reviewer: @lickem22 Estimate: 30 mins
Ticket
Fixes: https://idinsight.atlassian.net/browse/AAQ-751