stanfordnlp / dspy

DSPy: The framework for programming—not prompting—foundation models
https://dspy-docs.vercel.app/
MIT License

UnicodeEncodeError: 'ascii' codec can't encode character '\u2019' in position 58: ordinal not in range(128) #1336

Closed. spidercatfly closed this issue 2 months ago

spidercatfly commented 3 months ago

Awesome job on the project!

However, I encountered an issue while following the "Minimal Working Example". When I run the following code:

optimized_cot = teleprompter.compile(CoT(), trainset=gsm8k_trainset)

I receive the following error:

UnicodeEncodeError: 'ascii' codec can't encode character '\u2019' in position 58: ordinal not in range(128)

The detailed error message is:

(input_keys={'question'}) with <function gsm8k_metric at 0x7f0873e0cd30> due to 'ascii' codec can't encode character '\u2019' in position 58: ordinal not in range(128). [dspy.teleprompt.bootstrap] filename=bootstrap.py lineno=211

Could you please help me resolve this issue?

Thanks for your assistance!

arnavsinghvi11 commented 2 months ago

Hi @spidercatfly, this is not DSPy-related. The '\u2019' character is in the trainset data, so you can try changing the encoding to one that supports Unicode (such as UTF-8) to fix this.
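
For anyone landing here with the same traceback, here is a minimal Python sketch of what "changing the encoding" can look like. It is not taken from DSPy and does not call any DSPy API. The sample question string and the helper names `to_ascii_punctuation` and `force_utf8_stdio` are hypothetical, chosen only for illustration. The sketch assumes the failure comes from a typographic apostrophe (U+2019) in the text meeting an ASCII-only encoder, which is what the error message suggests.

```python
import sys

# Hypothetical sample resembling a trainset question; U+2019 is a right single quotation mark.
text = "Janet\u2019s ducks lay 16 eggs per day."

# Reproduce the error: the ASCII codec cannot represent U+2019.
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    print("reproduced:", exc)

# Option 1: encode as UTF-8, which can represent every Unicode character.
encoded = text.encode("utf-8")

# Option 2: replace typographic quotes with ASCII equivalents in the data itself.
ASCII_PUNCT = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
})

def to_ascii_punctuation(s: str) -> str:
    """Return s with curly quotes swapped for their ASCII counterparts."""
    return s.translate(ASCII_PUNCT)

cleaned = to_ascii_punctuation(text)
cleaned.encode("ascii")  # no longer raises

# Option 3: switch the standard streams to UTF-8 (Python 3.7+),
# in case the ASCII encoder is the one used for printing or logging.
def force_utf8_stdio() -> None:
    for stream in (sys.stdout, sys.stderr):
        if hasattr(stream, "reconfigure"):
            stream.reconfigure(encoding="utf-8")
```

Which option applies depends on where the ASCII encoder sits. If the process itself defaults to ASCII (for example under a `C`/`POSIX` locale), setting the environment variable `PYTHONUTF8=1` or `PYTHONIOENCODING=utf-8` before launching Python is a common alternative to calling `force_utf8_stdio()` in code.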