[x] I have checked the documentation and related resources and couldn't resolve my bug.
Describe the bug
I followed the documentation for adapting prompts to Polish. Adaptation seems to work fine, but saving the adapted prompts to a file fails with the following error:
UnicodeEncodeError: 'charmap' codec can't encode character '\u017c' in position 6: character maps to <undefined>
This is related to the Polish special character 'ż' in the translated prompt. The prompt begins: 'Co możesz mi powiedzieć o Albercie'.
Ragas version: 0.2.3
Python version: 3.13.0
Code to Reproduce
import os
from python.ragas.common.config import OpenAIConfig
from ragas.llms import LangchainLLMWrapper
from ragas.metrics import LLMContextPrecisionWithoutReference
from ragas.utils import RAGAS_SUPPORTED_LANGUAGE_CODES
from langchain_openai.chat_models import AzureChatOpenAI
os.environ['OPENAI_API_KEY'] = OpenAIConfig.api_key
scorer = LLMContextPrecisionWithoutReference()
scorer.get_prompts()
azure_llm = AzureChatOpenAI(
    openai_api_version=OpenAIConfig.api_version,
    azure_endpoint=OpenAIConfig.api_base,
    azure_deployment=OpenAIConfig.ragas_model_deployment_name,
    model=OpenAIConfig.ragas_model_deployment_name,
    validate_base_url=False,
)
azure_llm = LangchainLLMWrapper(azure_llm)
adapted_prompts = await scorer.adapt_prompts(language="polish", llm=azure_llm)
scorer.set_prompts(**adapted_prompts)
scorer.save_prompts('../common/__data/')
Error trace
---------------------------------------------------------------------------
UnicodeEncodeError Traceback (most recent call last)
Cell In[7], line 1
----> 1 scorer.save_prompts('../common/__data/')
File ~\PycharmProjects\cbs-cx-chatbot\venv\Lib\site-packages\ragas\prompt\mixin.py:89, in PromptMixin.save_prompts(self, path)
84 for prompt_name, prompt in prompts.items():
85 # hash_hex = f"0x{hash(prompt) & 0xFFFFFFFFFFFFFFFF:016x}"
86 prompt_file_name = os.path.join(
87 path, f"{prompt_name}_{prompt.language}.json"
88 )
---> 89 prompt.save(prompt_file_name)
File ~\PycharmProjects\cbs-cx-chatbot\venv\Lib\site-packages\ragas\prompt\pydantic_prompt.py:338, in PydanticPrompt.save(self, file_path)
336 raise FileExistsError(f"The file '{file_path}' already exists.")
337 with open(file_path, "w") as f:
--> 338 json.dump(data, f, indent=2, ensure_ascii=False)
339 print(f"Prompt saved to {file_path}")
File ~\AppData\Local\Programs\Python\Python313\Lib\json\__init__.py:180, in dump(obj, fp, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)
177 # could accelerate with writelines in some versions of Python, at
178 # a debuggability cost
179 for chunk in iterable:
--> 180 fp.write(chunk)
File ~\AppData\Local\Programs\Python\Python313\Lib\encodings\cp1252.py:19, in IncrementalEncoder.encode(self, input, final)
18 def encode(self, input, final=False):
---> 19 return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u017c' in position 6: character maps to <undefined>
Expected behavior
Prompt translations should be saved to the file as described in the documentation.
Additional context
I also tried Python 3.12 with some older packages; the result was the same.
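A possible workaround, as a minimal sketch. The traceback shows that `PydanticPrompt.save` calls `open(file_path, "w")` without an `encoding` argument, so on Windows the file presumably falls back to the locale encoding (cp1252 here), which has no mapping for 'ż'. The helper names `save_json_utf8`, `load_json_utf8`, and `can_encode` below are hypothetical and not part of Ragas; they only illustrate that passing `encoding="utf-8"` lets the same `json.dump(..., ensure_ascii=False)` call succeed.

```python
import json
import os
import tempfile


def can_encode(text, encoding):
    """Return True if `text` can be encoded with `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def save_json_utf8(data, file_path):
    """Hypothetical helper: write JSON as UTF-8 regardless of the OS locale."""
    # Explicit encoding avoids the locale default (cp1252 in the traceback).
    with open(file_path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2, ensure_ascii=False)


def load_json_utf8(file_path):
    """Hypothetical helper: read the file back with the same encoding."""
    with open(file_path, "r", encoding="utf-8") as f:
        return json.load(f)


if __name__ == "__main__":
    # Stand-in for the adapted prompt data; the real structure is Ragas-internal.
    prompt_data = {"instruction": "Co możesz mi powiedzieć o Albercie"}

    # 'ż' (U+017C) is not representable in cp1252, which matches the error.
    print("cp1252 can encode:", can_encode(prompt_data["instruction"], "cp1252"))

    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "prompt_polish.json")
        save_json_utf8(prompt_data, path)
        print("round trip ok:", load_json_utf8(path) == prompt_data)
```

If this diagnosis is right, the fix in Ragas would be adding `encoding="utf-8"` to the `open` call in `PydanticPrompt.save` (and to the matching load). Until then, starting Python with the environment variable `PYTHONUTF8=1` set should also make `open` default to UTF-8, although I have not verified that against this exact setup.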