psarno opened this issue 1 year ago
Hi, it seems to me that there is a problem with too many "," symbols. When testing with https://jsonlint.com/, the provided JSON appears to be non-compliant with the JSON specification. The following should work (with fewer commas):
{
  "LOGIN": {
    "FORGOT_USERNAME": "Forgot your username?",
    "FORGOT_PASSWORD": "Forgot your password?"
  },
  "CUSTOMERSEARCH": {
    "CONTACT_EMAIL": "Contact Email",
    "BILLING_EMAIL": "Billing Email"
  }
}
@fkirc I apologize, we don't actually have those trailing commas in the real en.json file we are dealing with. That was just a mistake on my part when copying and cutting out parts of it.
Here's an example of what the actual file looks like. It passes linting.
Unfortunately I am not able to reproduce the error with the shortened sample above, but I have a suspicion what the error might be.
I think the error is in the file C:\Users\user\AppData\Roaming\npm\node_modules\attranslate\dist\util\util.js.
If you open this file on your machine, then you will find the following function:
function readUtf8File(path) {
checkNotDir(path);
return (0, fs_1.readFileSync)(path, { encoding: "utf8", flag: "r" });
}
Now I suspect that this function does not work if the JSON file is not encoded in UTF-8.
According to ChatGPT, this could lead to garbled characters.
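To illustrate the suspicion (a minimal sketch of my own, not attranslate code): in Node.js, decoding Latin-1 bytes as UTF-8 mangles French accents:

```javascript
// Sketch: a French accent encoded as Latin-1 becomes a replacement
// character when the bytes are decoded as UTF-8.
const latin1Bytes = Buffer.from("caf\u00e9", "latin1"); // bytes 63 61 66 e9
const decoded = latin1Bytes.toString("utf8");
console.log(decoded); // "caf�" – the lone 0xE9 byte is not valid UTF-8
```
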
Therefore, it would be great if you could open the file C:\Users\user\AppData\Roaming\npm\node_modules\attranslate\dist\util\util.js
on your machine and change this function to something like this:
function readUtf8File(path) {
checkNotDir(path);
return (0, fs_1.readFileSync)(path, { flag: "r" });
}
Alternatively, you could send me the complete JSON such that I am able to reproduce it (if you are allowed to share the JSON from your company).
The error looks like it would happen with a French JSON as target-file, but not with an English JSON as target-file (if you would hypothetically translate from English to English). In French, there is the added problem that the same accented characters can be encoded in multiple different ways, so I suspect that the French JSON might not be UTF-8. It would be great if you could give me some snippets of the French JSON to reproduce and fix it.
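For context on "the same accents can be encoded in multiple different ways", this refers to Unicode normalization: a short sketch showing the precomposed (NFC) and decomposed (NFD) forms of the same accented letter:

```javascript
// Sketch: "é" can be one precomposed code point (NFC) or a base letter
// plus a combining accent (NFD); the two strings compare unequal until
// they are normalized to the same form.
const nfc = "\u00e9";   // é as a single code point
const nfd = "e\u0301";  // e + combining acute accent
console.log(nfc === nfd);                  // false
console.log(nfc === nfd.normalize("NFC")); // true
```
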
I only have an English JSON file; I do not have French or Spanish at all.
It did attempt to get through the French version, but gave up.
It started off rather promising ...
Then it threw in the towel, apparently.
Am I perhaps running up against an API limit here?
There are over 8,000 entries.
An API-limit could be a problem at a later stage, although based on your error messages it seems that the problem happens at an earlier stage during parsing.
But for the "—" entries: Those could actually be some weird errors from the OpenAI API. OpenAI has a few rough edges and I am not yet sure how to circumvent such OpenAI errors.
At the moment, I believe that Google Translate works more reliably than OpenAI.
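One common way to soften such transient API errors (a generic sketch, not something attranslate is confirmed to do; withRetry is a hypothetical helper) is to retry each translation call with exponential backoff:

```javascript
// Sketch (hypothetical helper): retry a flaky async call, doubling the
// delay after each failure, and rethrow only after the last attempt.
async function withRetry(fn, attempts = 3, delayMs = 500) {
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (e) {
      if (i === attempts - 1) throw e;
      await new Promise((r) => setTimeout(r, delayMs * 2 ** i));
    }
  }
}
```
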
I have a properly formatted en.json translation file in the following format:
I ran the command as:
attranslate --srcFile=en.json --srcLng=English --srcFormat=nested-json --targetFile=fr.json --targetLng=French --service=openai --serviceConfig=[my OpenAI key here] --targetFormat=nested-json
This fails with:
I assume nested-json is proper here. I tried flat-json but only get: