microsoft / TypeChat

TypeChat is a library that makes it easy to build natural language interfaces using types.
https://microsoft.github.io/TypeChat/
MIT License

Fix/trailing commas from chatgpt #239

Open faceleg opened 7 months ago

faceleg commented 7 months ago

ChatGPT sometimes returns JSON with trailing commas, which breaks the parser. The existing repair attempts do not take this into account.

I've copied in the strip-trailing-commas function from https://github.com/nokazn/strip-json-trailing-commas/blob/main/src/index.ts (MIT) and added a test to prove it works.

Trailing commas are stripped here: https://github.com/microsoft/TypeChat/blob/7c6837444ded6c3ebf88020050d2d0434c049c51/typescript/src/typechat.ts#L140

Example broken response:

{
  "items": [
    {
      "id": 1,
      "text": "驳回",
      "exampleSentences": [
        "法官驳回了他的上诉请求。",
        "公司决定驳回他的辞职申请。",
        "政府部门驳回了他的建议。",
      ],
      "partsOfSpeech": "verb"
    },
    {
      "id": 2,
      "text": "驳回",
      "exampleSentences": [
        "他对这个提案的驳回感到失望。",
        "这个决定的驳回引起了公众的不满。",
        "他的建议被驳回了,让他感到沮丧。",
      ],
      "partsOfSpeech": "noun"
    }
  ]
}
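
For readers who want to see the idea without opening the links, here is a minimal sketch of stripping trailing commas before calling `JSON.parse`. It is not the function from nokazn/strip-json-trailing-commas and not the code in this PR. `stripTrailingCommas` is a hypothetical helper written for illustration. It assumes the input is otherwise valid JSON, and it skips over string contents so that commas inside sentences are left alone.

```typescript
// Hypothetical helper for illustration only; not the code from this PR
// or from the strip-json-trailing-commas package.
function stripTrailingCommas(text: string): string {
  let result = "";
  let inString = false;
  let i = 0;
  while (i < text.length) {
    const ch = text.charAt(i);
    if (inString) {
      result += ch;
      if (ch === "\\" && i + 1 < text.length) {
        // Copy the escaped character verbatim so an escaped quote
        // does not end the string early.
        result += text.charAt(i + 1);
        i += 2;
        continue;
      }
      if (ch === '"') inString = false;
      i++;
      continue;
    }
    if (ch === '"') {
      inString = true;
    } else if (ch === ",") {
      // Look past whitespace; drop the comma if the next token
      // closes an array or an object.
      let j = i + 1;
      while (j < text.length && /\s/.test(text.charAt(j))) j++;
      const next = text.charAt(j);
      if (next === "]" || next === "}") {
        i++;
        continue;
      }
    }
    result += ch;
    i++;
  }
  return result;
}

// A reduced response with the same problem as the one above.
const broken = `{
  "exampleSentences": [
    "a, ]",
    "b",
  ],
  "partsOfSpeech": "verb",
}`;
const parsed = JSON.parse(stripTrailingCommas(broken));
console.log(parsed.exampleSentences.length);
```

Scanning character by character, rather than using a single regular expression, keeps commas that appear inside string values intact, which matters for responses made of full sentences.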

Example prompt that generated this response:

You are a helpful vocabulary learning assistant who helps users generate example sentences in Mandarin for language learning. You understand that in Mandarin, words can serve different parts of speech depending on context.

Please find the possible usages this word: 休想, and generate 3 example sentences for each usage.

The sentences should be medium or longer length and complexity of HSK5 or higher. Each sentence must contain the the word. All sentences provided for the word must be unique.

JSON must be returned as an array of objects, with one object per part of speech for the word. You must return valid JSON. The array of sentences must not have a trailing comma.

This is the project I'm using TypeChat on: https://github.com/faceleg/ankiai, forked from https://github.com/mhujer/ankiai.

faceleg commented 7 months ago

@microsoft-github-policy-service agree