fkirc / attranslate

A command line tool for translating JSON, YAML, CSV, ARB, XML (via a CLI)
https://www.npmjs.com/package/attranslate

SyntaxError: Unexpected end of JSON input #251

Open psarno opened 1 year ago

psarno commented 1 year ago

I have a properly formatted en.json translation file in the following format:

{
  "LOGIN": {
    "FORGOT_USERNAME": "Forgot your username?",
    "FORGOT_PASSWORD": "Forgot your password?",
  },
  "CUSTOMERSEARCH": {
    "CONTACT_EMAIL": "Contact Email",
    "BILLING_EMAIL": "Billing Email",
   }
}

I ran the command as:

attranslate --srcFile=en.json --srcLng=English --srcFormat=nested-json --targetFile=fr.json --targetLng=French --service=openai --serviceConfig=[my OpenAI key here] --targetFormat=nested-json

This fails with:

SyntaxError: Unexpected end of JSON input
    at JSON.parse (<anonymous>)
    at readRawJson (C:\Users\user\AppData\Roaming\npm\node_modules\attranslate\dist\file-formats\common\managed-json.js:30:31)
    at readManagedJson (C:\Users\user\AppData\Roaming\npm\node_modules\attranslate\dist\file-formats\common\managed-json.js:22:36)
    at NestedJson.readTFile (C:\Users\user\AppData\Roaming\npm\node_modules\attranslate\dist\file-formats\nested-json\nested-json.js:11:57)
    at readTFileCore (C:\Users\user\AppData\Roaming\npm\node_modules\attranslate\dist\core\core-util.js:43:34)
    at async resolveOldTarget (C:\Users\user\AppData\Roaming\npm\node_modules\attranslate\dist\core\translate-cli.js:20:16)
    at async translateCli (C:\Users\user\AppData\Roaming\npm\node_modules\attranslate\dist\core\translate-cli.js:72:23)
error: Failed to parse 'C:\Users\user\source\repos\portal\portal\src\assets\i18n\fr.json'

I assume nested-json is the correct format here. I also tried flat-json, but that only gives:

error: Failed to parse 'C:\Users\user\source\repos\portal\portal\src\assets\i18n\en.json' with expected format 'flat-json': Property 'LOGIN' is not a string or null
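The flat-json message above suggests that a flat layout expects every top-level value to be a string or null, which a nested file like this one cannot satisfy. A minimal sketch of that kind of check follows; `findNonFlatKeys` is a hypothetical helper written for illustration and is not attranslate's actual implementation.

```javascript
// Hypothetical helper, not part of attranslate: lists the top-level
// properties whose values are neither a string nor null, i.e. the ones
// that would not fit a "flat" key/value layout.
function findNonFlatKeys(obj) {
  return Object.keys(obj).filter((key) => {
    const value = obj[key];
    return value !== null && typeof value !== "string";
  });
}

const sample = {
  LOGIN: { FORGOT_USERNAME: "Forgot your username?" },
  TITLE: "Portal",
};

console.log(findNonFlatKeys(sample)); // [ 'LOGIN' ]
```

An empty result would mean the file is flat; any reported key points to a nested object, so nested-json is the format to use.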

fkirc commented 1 year ago

Hi, it seems to me that the problem is too many "," symbols. When tested with https://jsonlint.com/, the provided JSON does not comply with the JSON specification. The following should work (with fewer commas):

{
  "LOGIN": {
    "FORGOT_USERNAME": "Forgot your username?",
    "FORGOT_PASSWORD": "Forgot your password?"
  },
  "CUSTOMERSEARCH": {
    "CONTACT_EMAIL": "Contact Email",
    "BILLING_EMAIL": "Billing Email"
  }
}
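The same check can be run locally without an online linter, because Node's built-in `JSON.parse` follows the strict JSON grammar and rejects trailing commas. The snippet below is only a sketch; `jsonError` is a hypothetical helper name and the two strings are shortened stand-ins for the real file.

```javascript
// Returns null when the text is valid JSON, otherwise the parser's message.
function jsonError(text) {
  try {
    JSON.parse(text);
    return null;
  } catch (err) {
    return err.message;
  }
}

// Shortened stand-ins for the real file contents.
const withTrailingComma =
  '{ "LOGIN": { "FORGOT_PASSWORD": "Forgot your password?", } }';
const withoutTrailingComma =
  '{ "LOGIN": { "FORGOT_PASSWORD": "Forgot your password?" } }';

console.log(jsonError(withTrailingComma) !== null); // true (rejected)
console.log(jsonError(withoutTrailingComma)); // null (accepted)
```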

psarno commented 1 year ago

@fkirc I apologize, we don't actually have those trailing commas in the real en.json file we are dealing with. That was just a mistake on my part when copying and cutting out parts of it.

Here's an example of what the actual file looks like. It passes linting.

image

image

fkirc commented 1 year ago

Unfortunately, I am not able to reproduce the error with the shortened sample above, but I have a suspicion about what the error might be. I think the error is in the file C:\Users\user\AppData\Roaming\npm\node_modules\attranslate\dist\util\util.js. If you open this file on your machine, you will find the following function:

function readUtf8File(path) {
    checkNotDir(path);
    return (0, fs_1.readFileSync)(path, { encoding: "utf8", flag: "r" });
}

I suspect that this function does not work if the JSON is not UTF-8. According to ChatGPT, this could lead to garbled characters. Therefore, it would be great if you could open the file C:\Users\user\AppData\Roaming\npm\node_modules\attranslate\dist\util\util.js on your machine and change this function to something like this:

function readUtf8File(path) {
    checkNotDir(path);
    return (0, fs_1.readFileSync)(path, { flag: "r" });
}
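For reference, here is a small sketch of what the `encoding` option changes in Node's `fs.readFileSync`: with it, the call returns a string; without it, a `Buffer`. The temporary file and its contents are made up for the demonstration, and whether this difference matters for the error in this issue is not established here.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Write a small UTF-8 file to a temporary location for the demonstration.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "encoding-demo-"));
const demoFile = path.join(dir, "fr.json");
fs.writeFileSync(demoFile, '{"GREETING":"Détails"}', { encoding: "utf8" });

// With an encoding, readFileSync returns a decoded string.
const asString = fs.readFileSync(demoFile, { encoding: "utf8", flag: "r" });
// Without an encoding, it returns the raw bytes as a Buffer.
const asBuffer = fs.readFileSync(demoFile, { flag: "r" });

console.log(typeof asString); // string
console.log(Buffer.isBuffer(asBuffer)); // true

// JSON.parse converts its argument to a string first, and
// Buffer#toString() decodes as UTF-8 by default.
console.log(JSON.parse(asBuffer).GREETING === JSON.parse(asString).GREETING); // true

// Clean up the temporary directory.
fs.rmSync(dir, { recursive: true, force: true });
```

Since a `Buffer` passed to `JSON.parse` is decoded as UTF-8 anyway, a file in another encoding would probably need explicit decoding rather than just dropping the option.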

Alternatively, you could send me the complete JSON so that I can reproduce the error (if you are allowed to share the JSON from your company).

image

fkirc commented 1 year ago

The error looks like it would happen with a French JSON as the target file, but not with an English JSON as the target file (if you hypothetically translated from English to English). In French, the same accented character can be encoded in several different ways, so I suspect that the French JSON might not be UTF-8. It would be great if you could give me some snippets of the French JSON so that I can reproduce and fix the problem.
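To illustrate the point about accents having more than one encoding: in Unicode, "é" can be stored as one precomposed code point or as "e" plus a combining accent. The sketch below uses only standard string and `Buffer` methods. Note that both forms are still valid UTF-8, so this shows the ambiguity itself, not a confirmed cause of the error.

```javascript
const composed = "\u00e9"; // "é" as a single code point (NFC form)
const decomposed = "e\u0301"; // "e" followed by a combining acute accent (NFD form)

// The two strings render the same but are not equal code point by code point.
console.log(composed === decomposed); // false

// Normalizing both to the same form makes them comparable.
console.log(composed.normalize("NFC") === decomposed.normalize("NFC")); // true

// They also differ in their UTF-8 byte length.
console.log(Buffer.from(composed, "utf8").length); // 2
console.log(Buffer.from(decomposed, "utf8").length); // 3
```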

psarno commented 1 year ago

> The error looks like it would happen with a French JSON as target-file, but not with an English JSON as target-file.

I only have an English JSON file; I do not have French or Spanish files at all.

It did attempt to get through the French version, but gave up.

It started off rather promising ...

image

Then it threw in the towel, apparently.

image

Am I perhaps running up against an API limit here?

There are over 8,000 entries.

fkirc commented 1 year ago

An API limit could be a problem at a later stage, but based on your error messages, the problem seems to happen at an earlier stage, during parsing.
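One way to narrow down a parsing failure like this is to inspect the file named in the error message (fr.json in the report above). In Node.js, `JSON.parse` typically reports "Unexpected end of JSON input" for empty or cut-off text, so an empty or partially written target file is worth ruling out. The sketch below assumes a hypothetical helper `describeJsonFile` and a placeholder path; it is not part of attranslate.

```javascript
const fs = require("fs");

// Hypothetical diagnostic helper, not part of attranslate: classifies a
// file as missing, empty, valid JSON, or invalid JSON.
function describeJsonFile(filePath) {
  if (!fs.existsSync(filePath)) {
    return "missing";
  }
  const text = fs.readFileSync(filePath, { encoding: "utf8" });
  if (text.trim() === "") {
    return "empty";
  }
  try {
    JSON.parse(text);
    return "valid";
  } catch (err) {
    return "invalid: " + err.message;
  }
}

// "fr.json" is a placeholder; substitute the path from the error message.
console.log(describeJsonFile("fr.json"));
```

If the result is "empty" or "invalid", deleting or repairing that target file before the next run would show whether the leftover file was the trigger.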

As for the "—" entries: those could actually be some odd errors from the OpenAI API. OpenAI has a few rough edges, and I am not yet sure how to work around such errors.

At the moment, I believe that Google Translate is more stable than OpenAI.