mangiucugna / json_repair

A python module to repair invalid JSON, commonly used to parse the output of LLMs
https://pypi.org/project/json-repair/
MIT License
826 stars, 48 forks

[FEATURE] Adding [ensure_ascii] option to [repair_json] #60

Closed RatexMak closed 2 months ago

RatexMak commented 2 months ago

…on-ascii characters

Issue #


I was trying to repair a JSON string returned by an LLM. The string contains Chinese text, but the output of [repair_json] comes back with those characters as Unicode escape sequences.
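For context, this matches the default behavior of Python's standard json.dumps, which escapes non-ASCII characters unless ensure_ascii=False is passed. A minimal sketch using only the standard library (it does not call json_repair itself, and the sample data is made up for illustration):

```python
import json

# Sample data for illustration only; not taken from the issue.
data = {"greeting": "你好"}

# Default is ensure_ascii=True, so non-ASCII characters become \uXXXX escapes.
escaped = json.dumps(data)

# ensure_ascii=False keeps the original characters in the output string.
readable = json.dumps(data, ensure_ascii=False)

print(escaped)
print(readable)

# Both strings decode back to the same Python object.
print(json.loads(escaped) == json.loads(readable))
```

Both forms are valid JSON and carry the same data; the difference only matters when a person or a downstream tool reads the raw string.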

mangiucugna commented 2 months ago

Hi, thanks for providing a pull request. Can you provide a test case? I generally agree with aligning with all the options of the standard json library, but I need something to add to the tests. I also need to add that option everywhere, so I will probably reject this PR and provide a more comprehensive change. Thank you in advance.
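One way to picture "aligning with all the options of the standard json library" is a function that forwards arbitrary keyword arguments to json.dumps. The sketch below is only an illustration under that assumption: dumps_with_options is a hypothetical helper, not part of json_repair, and it says nothing about how the library actually implements the option.

```python
import json


def dumps_with_options(obj, **json_dumps_args):
    """Hypothetical helper: forward any json.dumps keyword unchanged."""
    return json.dumps(obj, **json_dumps_args)


# An already-parsed object standing in for repaired JSON.
parsed = {"test_chinese_ascii": "统一码"}

# No extra options: the json.dumps default (ensure_ascii=True) applies.
print(dumps_with_options(parsed))

# Any json.dumps option can be passed through, not just ensure_ascii.
print(dumps_with_options(parsed, ensure_ascii=False, indent=2))
```

Forwarding keywords this way means new json.dumps options work without further changes, at the cost of a less explicit function signature.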

mangiucugna commented 2 months ago

I have this test, is that what you would like to see?

print(repair_json("{'test_chinese_ascii':'统一码'}", ensure_ascii=True))
{"test_chinese_ascii": "\u7edf\u4e00\u7801"}
print(repair_json("{'test_chinese_ascii':'统一码'}", ensure_ascii=False))
{"test_chinese_ascii": "统一码"}

RatexMak commented 2 months ago

Sure! It seems good enough for my problem. Thanks a lot.

RatexMak commented 2 months ago

I will just close all my pull requests regarding this matter. You are doing an IMPRESSIVE job on this; it really helps a lot.

mangiucugna commented 2 months ago

The new release has been published. Cheers