squattingmonk / nasher

A build tool for Neverwinter Nights projects
MIT License

i18n: When converting to and from the windows-1250 and windows-1252 code pages, one will be wrong #119

Open julien-lecomte opened 6 months ago

julien-lecomte commented 6 months ago

If you have both French (windows-1252) and Polish (windows-1250) translations, only one can be correctly re-encoded when packing or unpacking a module.

For example, take an object named "Armure d'écailles" (French) and "Zbroja Łuskowa" (Polish). If you pack it, the Polish will be wrong in the toolset ("Zbroja Łuskowa"). If you then unpack it, the Polish is still wrong in the json after the round trip ("Zbroja Å<81>uskowa", as shown in vim).
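The mismatch can be reproduced outside the toolset. The snippet below is only an illustrative sketch: it uses Python's `cp1250`, `cp1252` and `latin-1` codecs as stand-ins for the code pages involved, and it does not call nasher or nwn_gff. It shows that Ł has no slot in windows-1252, that windows-1250 bytes read back as windows-1252 silently turn into a different letter, and that the UTF-8 bytes of Ł read one byte at a time give the "Å" plus 0x81 pair seen in vim.

```python
# Illustration only: Python codecs stand in for the Windows code pages.
polish = "Zbroja Łuskowa"

# Ł exists in windows-1250 (byte 0xA3) but has no slot in windows-1252.
polish_bytes = polish.encode("cp1250")
try:
    polish.encode("cp1252")
    fits_cp1252 = True
except UnicodeEncodeError:
    fits_cp1252 = False

# Reading the windows-1250 bytes back as windows-1252 swaps the letter silently.
misread = polish_bytes.decode("cp1252")

# The UTF-8 bytes of Ł (0xC5 0x81), read one byte per character,
# give "Å" followed by the unprintable byte 0x81.
utf8_misread = "Ł".encode("utf-8").decode("latin-1")

print(fits_cp1252)         # False
print(misread)             # Zbroja £uskowa
print(repr(utf8_misread))  # 'Å\x81'
```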

To get correct Polish in the toolset, and back in the json, the Polish strings need different gffFlags from the other languages. This could be done by default.

Since most languages share windows-1252, a flag such as --gffFlagsPL would be useful. It would only be used when converting Polish strings.

squattingmonk commented 6 months ago

Does --gffFlags="--nwn-encoding windows-1250" --erfFlags="--nwn-encoding windows-1250" not work? Or are you asking for a shorthand for that?

julien-lecomte commented 6 months ago

They work, but the Polish encoding is different from the others, and I can only specify one encoding that applies to all languages. So either only the Polish ends up as garbage, or every language apart from Polish does.

An easy test: create a new module and add one object with a French name containing an accent (é) and a Polish name containing a Ł. I currently can't pack the module, edit it and then unpack it without at least one language becoming garbage.

squattingmonk commented 6 months ago

Ah, I see. Is this something that can actually be fixed by nasher, or does it require an upstream feature request to neverwinter.nim? By which I mean, is there already a way to do this with the neverwinter.nim tools that nasher just can't take advantage of right now?

julien-lecomte commented 6 months ago

For packing tlk files, nwnGff does the job and we can specify the correct encoding.

For nasher, I believe it would mean picking out the Polish entries of 4 different fields (LocName, Description, ...) and applying a different encoding to them when packing/unpacking.
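A rough sketch of what per-language encoding could look like, in Python rather than anything nasher or neverwinter.nim actually ships. Everything here is an assumption made for illustration: the helper names are hypothetical, Polish is assumed to be language ID 5, and each localized entry is assumed to be keyed by language * 2 + gender.

```python
# Hypothetical sketch: none of these names exist in nasher or neverwinter.nim.
POLISH = 5  # assumed NWN language ID for Polish

# Assumed mapping from language ID to code page; anything else falls back.
ENCODING_BY_LANGUAGE = {POLISH: "cp1250"}
DEFAULT_ENCODING = "cp1252"


def encoding_for(string_id: int) -> str:
    """Pick a code page for one localized entry.

    Assumes the entry ID is language * 2 + gender, so integer division
    by 2 recovers the language ID.
    """
    return ENCODING_BY_LANGUAGE.get(string_id // 2, DEFAULT_ENCODING)


def encode_entries(entries: dict[int, str]) -> dict[int, bytes]:
    """Encode each entry of a localized string with its own code page."""
    return {sid: text.encode(encoding_for(sid)) for sid, text in entries.items()}


def decode_entries(raw: dict[int, bytes]) -> dict[int, str]:
    """Decode each entry with the same per-language code page."""
    return {sid: data.decode(encoding_for(sid)) for sid, data in raw.items()}


# French (ID 1, entry 2) and Polish (ID 5, entry 10) names for one object.
name = {2: "Armure d'écailles", 10: "Zbroja Łuskowa"}
packed = encode_entries(name)
assert decode_entries(packed) == name  # both languages survive the round trip
```

With a single global encoding, one of the two entries in `name` could not make this round trip; choosing the code page per entry is what lets both survive.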

squattingmonk commented 6 months ago

Nasher just calls nwn_gff to do the conversion, so what command do you need to pass to nwn_gff to make it work?

julien-lecomte commented 6 months ago

How about just "--gffFlagsPL=", which passes these flags to nwn_gff only for Polish? By default, if it is absent, nasher falls back to "--gffFlags".

tinygiant98 commented 6 months ago

Are you asking about applying encoding per-entry in a language/translation list for cexolocstrings?