opensourcepos / opensourcepos

Open Source Point of Sale is a web based point of sale application written in PHP using CodeIgniter framework. It uses MySQL as the data back end and has a Bootstrap 3 based user interface.
http://www.opensourcepos.org
Other
3.46k stars 2.19k forks source link

CI4: Modify Weblate config to work with CI4 style language strings. #3468

Closed objecttothis closed 1 year ago

objecttothis commented 2 years ago

Issue / Bug / Question / New Feature

Unlike CI3, CI4 ranges translation phrases into an associative array. I'm not sure of the ingest/export mechanism of Weblate but we may need to tweak the configuration so that translation files get generated properly.

image

objecttothis commented 2 years ago

Also, currently only en-US language files in #3271 has been ported to the CI4 format. We need to figure out if we have to convert the rest of them by hand or if weblate can do this for us.

jekkos commented 2 years ago

@objecttothis aren't the language files already associative arrays at this point? We have a lang variable (array) with key and assign a value to it. The syntax you show here will probably yield the same result?

jekkos commented 2 years ago

I could look into this but preferably I'll try to match with what is in weblate as the current format nicely works.

objecttothis commented 2 years ago

@objecttothis aren't the language files already associative arrays at this point? We have a lang variable (array) with key and assign a value to it. The syntax you show here will probably yield the same result?

Sorry, you are correct. There are two things to consider though:

  1. In order to keep the CI3 styling we need to add return $lang;to the bottom of each file. Can we customize Weblate's generator to do this?
  2. By not adopting CI4 styling we could face hiccups later. In the end it's not a big deal because both are valid associative arrays, but it won't match CI documentation from here on out.

1 Is the only "must do" I think. If it's a lot of work to switch the array notation style in weblate then I'm ok with leaving it and adding the return statement. Let me know what you want to do.

jekkos commented 2 years ago

@objecttothis I need to take some time to see how the file load mechanism works exactly. I suppose in the past CI was just loading one file and thus overwriting the keys with new values every time a language was selected.

Do we keep the MY_Lang extension that we had in the old install or do we need to modify it? If we keep it then we can probably bootstrap the existing files and then we're also certain it will work with weblate. I wouldn't really worry too much about the CI convention here, as long as the strings go where they should.

I'll be off coming week on a hiking trip, so I might not be able to look into this until after then.

objecttothis commented 2 years ago

From https://codeigniter.com/user_guide/installation/upgrade_4xx.html image image

https://codeigniter.com/user_guide/installation/upgrade_localization.html gives the details of upgrading localization. The extensions of localizations that we did in 3.x is one of the things that I am unfamiliar with. I'm certainly not opposed to extending and overriding CI4 functionality, but I don't think it's a good idea to do it just to avoid code rewrites because it opens the software up to difficulty upgrading in the future and potentially vulnerability patches that CI4 makes in x area that we aren't taking advantage of because we've overridden functionality in that area.

That said, I haven't yet looked at porting the various framework extensions that we've made, so I guess we need to create an issue for that one too.

jekkos commented 2 years ago

@objecttothis Good point indeed, let's not extend it unless there is no other way. I'll have to look in to the customization to see what it does exactly and will then check how to accomplish the same behavior in CI4.

An easy fix for the language files would be to add an array declaration at the top of all the files, $lang = [] and a return $lang at the end.

objecttothis commented 2 years ago

@objecttothis Good point indeed, let's not extend it unless there is no other way. I'll have to look in to the customization to see what it does exactly and will then check how to accomplish the same behavior in CI4.

An easy fix for the language files would be to add an array declaration at the top of all the files, $lang = [] and a return $lang at the end.

OK, let's see what ends up being needed. I converted all the files for en-US by hand already to the CI4 format, but that can be reverted if we find that Weblate cannot easily output that format for all the other languages and it's possible to use the CI3 formatting in CI4. It sounds like CI3 formatting could be made to work with your suggestion here.

jekkos commented 2 years ago

I'lll try to have a look later this week.

jekkos commented 2 years ago

@objecttothis it seems the CI4 format is also supported by weblate

jekkos commented 2 years ago

Screenshot from 2022-10-19 21-09-41

objecttothis commented 2 years ago

That's great news.

jekkos commented 2 years ago

I can try to convert the files and then import them in weblate.. but first I'd like to get the tests running so we can deploy the branch.

objecttothis commented 2 years ago

That is reasonable. By convert the files, they should already be in CI4 format... Or do you mean convert as in import them into weblate? Getting tests working first is fine. I haven't done much this week on it because I'm having to put a fire out with the website.

jekkos commented 2 years ago

3205 apparantly we can download the translation files in gnu gettext format as well, directly from weblate. Alas it can't be done in batch. One can only download a zip with all translations in one go. Found this [cli tool](http://docs.translatehouse.org/projects/translate-toolkit/en/latest/commands/php2po.html) and it also exists the other way around.. but perhaps it might be faster to parse the files in a small php script and dump them back in the new format.

jekkos commented 1 year ago

I'll use this bash oneliner to do the format conversion with the files from master. Below the convert.php snippet that it uses. It parses the array inside current language files and echoes it back in the new format.

for file in app/Language/**/*.php; do f=$(basename $file); name=${f^}; output=$(dirname $file)/${name%_lang.php}.php; docker run -ti -v "$(pwd):/root" php:cli-alpine -c php /root/convert.php /root/$file > $output; done
<?php

include($argv[1]);

echo "<?php\n";
echo "return [\n";
foreach ($lang as $key=>$value) {
    echo "\t'$key' => '$value',\n";
}
echo "];";
?>
jekkos commented 1 year ago

I have pushed the format conversions for this now. Next step is to try and import a component with the new format in weblate. Perhaps some upgrade will be needed, let's see how this goes.

jekkos commented 1 year ago

I tried to import the converted files in weblate but bumped into an issue, I used single quotes in the converted files, which caused some imports to fail. Updated now to double quotes. Will check if this works now.

jekkos commented 1 year ago

@objecttothis I have imported the CI4 converted strings for the Common component here now: https://translate.opensourcepos.org/projects/opensourcepos/ci4-common/

This was done to test if weblate can parse the new format. What remains for this story is updating the language filenames in the exisiting components after we merge the branch to master. I can do this directly in the weblate postgres. Then we can let weblate reindex the new files.

objecttothis commented 1 year ago

@jekkos In my troubleshooting I was just noticing good news and bad news. The good news is that the CI4 language loader is working as expected. The bad news is that in the manipulations that you were working on to get weblate working with the new format that weblate overwrote the language line names with the original names but the new array format. See the screenshot in the first post of this issue for example. The CI format was

This is what it currently looks like: image

This is what it should look like: image

Notice that the prefixes got added back in? The code is calling lang('Attributes.confirm_delete'); so it's expecting the line in the language file to be "confirm_delete" but it's actually "attributes_confirm_delete". In order for the code to work, we need to get the have the keys match. There are two ways of doing that:

  1. Do a mass find/replace in the code for each call to each language line so that it reads what is currently in the language files.
  2. Do a mass find/replace to strip the prefix from each key.

If we do the first, that means two things: Project-wide find/replace for lang('Attributes. and then replace that with lang('Attributes.attributes_ and repeat for each of the language files. the second is that our calls to lang become, IMO, unnecessarily redundant (e.g., lang('Attributes.attributes_confirm_delete') instead of lang('Attributes.confirm_delete'))

If we do the second that means two things: find/replace in the app/Language/ folder for "attributes_ and replace it with just " then repeat for each of the other prefixes found in language files of various languages. The second is that we need to figure out how to get that message to weblate so that it doesn't generate or look for the language strings with those prefixes.

I prefer to do option 2 because of the simplicity of the lang() calls and how easy the code is to change, but I also don't know how much work that is to do on the weblate side.

objecttothis commented 1 year ago

btw, I think the reason that weblate was having issues with the single quotes is because of the string interpolation. on the PHP side. You may notice the 2nd image has escape characters for single quotes. Weblate may not have been handling those correctly. In the end if we have to use double-quotes, it will call the parser for each key and value, but I don't imagine it will destroy the performance of the application that much since the arrays don't usually show up in long-running loops.

objecttothis commented 1 year ago

Also, string interpolation of localization calls is slightly different in CI4. For example:

[/application/language/en-US/login_lang.php]
$lang["login_welcome"] = "Welcome to %1!";

[/application/views/login.php]
<?php echo $this->lang->line('login_welcome', $this->lang->line('common_software_short')); ?>

Becomes

[/app/language/en-US/Login.php]
"welcome" => "Welcome to {install_name}",

[/app/views/login.php]
<?php echo lang('Login.welcome', ['install_name' => lang('Common.software_short')]) ?>

We can alternately do

[/app/language/en-US/Login.php]
"welcome" => "Welcome to {0}",

[/app/views/login.php]
<?php echo lang('Login.welcome', [lang('Common.software_short')]) ?>

I kind of like the associated array version, but don't feel incredibly strongly about it. Doing associated arrays could make it clearer to the translator what the placeholder is standing for, but it could also cause people to translate what's in the placeholders unless they are reminded not to translate anything inside of { }.

objecttothis commented 1 year ago

@jekkos after we make a decision on which of the two options to use for string interpolation, we will need to make the changes in these language files. I believe I had made those changes before but they were overwritten. I don't know how to make the changes in the code so that weblate picks them up so can you either do it, or let me know what changes I can make so that weblate picks it up. I'm guessing all the languages will need to have changes because I'm pretty the %1, etc is scattered through all of them.

BTW, CI4 lang() offers some cool functionality in the interpolation.

https://codeigniter.com/user_guide/outgoing/localization.html?highlight=localization#replacing-parameters

in the language files you can specify in the placeholder the type of value you are sending it and formatting.

In the lang() call you can send the locale (dynamically) which causes things like prices to be displayed with the correct currency symbol, side, etc. Perhaps something for a future enhancement. I think for this change we only need the interpolation as is in the correct format.

jekkos commented 1 year ago

Option 2 sounds like the right one. My hunch is that before you needed this prefix because the language array was only loaded for one language, and switching between languages would then just overwrite the values for the keys you loaded in memory. On the weblate side wel need to reindex the new keys, I don't think that should be an issue, the filenames are already different and that also needs to be changed once we merge.

Good that the interpolation options are also better now. Let's stick with what we have first and we can change this eventually later on if needed.

jekkos commented 1 year ago

btw great work on fixing the language loading, I can look into the key changes later this week.

objecttothis commented 1 year ago

btw great work on fixing the language loading, I can look into the key changes later this week.

Thank you. Once the branch has the updated key names, it should work as is. Out of curiosity, for future changes to language strings, does Weblate automatically pick up additions and deletions if they are placed in the en-US (master) language, or do you have to manually pull in changes?

jekkos commented 1 year ago

They are indeed picked up from the master language component. So if the file is changed it will be indexed by weblate.

objecttothis commented 1 year ago

They are indeed picked up from the master language component. So if the file is changed it will be indexed by weblate.

OK, I pushed the corrections to the string names in en-US. I also pushed the change from %1, %2 to {0}, {1}, but someone will likely need to not only push those changes to the other translations since those translations are done, but we will need to revisit these to add in the ICU references for numbers, prices, etc.

objecttothis commented 1 year ago

@jekkos can you confirm that weblate properly picked up the changes in en-US and applied the new string names?

jekkos commented 1 year ago

I have updated the repo but now there seems to be an issue

Screenshot from 2022-12-10 10-32-40

objecttothis commented 1 year ago

I have updated the repo but now there seems to be an issue

Screenshot from 2022-12-10 10-32-40

It looks to me like it picked up the new key names but created them as blank keys and kept the old keys so you end up with duplicates. It makes sense that weblate wouldn't be able to automatically figure out that these are the same strings with different names. I'm not sure the best way to approach this, but potentially one option would be to mass change the key names in all languages (not that difficult with PHPStorm) then see if we can clear weblate's database and have it hoover up all the files from all the languages from the ci4-upgrade branch. Can weblate do that? Then we could go back to what we were doing before with it automatically picking up changes to en-US. Thoughts?

jekkos commented 1 year ago

I'll try to rerun the import again this weekend, see how that works out

jekkos commented 1 year ago

I have synced the language files again with master, but should get rid of the conflicts. After that I'll try and import again in weblate.

odiea commented 1 year ago

The sync files are no longer CI4 lang files.

jekkos commented 1 year ago

Ah how do you mean? The structure is the same no? I took latest version from master and converted them to CI4 format. I found out what the issue is with the weblate keys @objecttothis . There is a bug in the laravel string parser that is causing the wrong keys to be imported: https://github.com/translate/translate/pull/4728 . A solution was in the works there, I could try to patch the weblate install and see if that resolves the issue. Or another option might be to switch to gettext format. It should be better supported by weblate, but then I need to check if CI4 also supports it natively.

odiea commented 1 year ago

Here is some of the previous CI4 file structure. return [ "address" => "Company Address", "address_required" => "Company address is a required field.", "all_set" => "All file permissions are set correctly!", "allow_duplicate_barcodes" => "Allow Duplicate Barcodes", "apostrophe" => "apostrophe", "backup_button" => "Backup", "backup_database" => "Backup Database", "barcode" => "Barcode", "barcode_company" => "Company Name", "barcode_configuration" => "Barcode Configuration", "barcode_content" => "Barcode Content",

The following is what showed up in my merge this morning CI3 structure.

return [ "config_address" => "Company Address", "config_address_required" => "Company address is a required field.", "config_all_set" => "All file permissions are set correctly!", "config_allow_duplicate_barcodes" => "Allow Duplicate Barcodes", "config_apostrophe" => "apostrophe", "config_backup_button" => "Backup", "config_backup_database" => "Backup Database", "config_barcode" => "Barcode", "config_barcode_company" => "Company Name", "config_barcode_configuration" => "Barcode Configuration", "config_barcode_content" => "Barcode Content", "config_barcode_first_row" => "Row 1", "config_barcode_font" => "Font", "config_barcode_formats" => "Input Formats",

jekkos commented 1 year ago

Ok I see the keys are incorrect, I still need to remove the first part, didn't notice that. Thanks for pointing that out.

jekkos commented 1 year ago

I have adjusted the prefixes now. Also noticed that in expense_categories the prefix removal was not consistent before. I should check if this works as it should in that screen.

jekkos commented 1 year ago

Ok happy to announce that the import has worked now, after patching weblate with the code from the PR. I have imported now only the common translations from the ci4-branch, this can be viewed here

objecttothis commented 1 year ago

Ok happy to announce that the import has worked now, after patching weblate with the code from the PR. I have imported now only the common translations from the ci4-branch, this can be viewed here

Thanks for the hard work on this. The formatting looks correct. It looks like some changes I made to the base en-US translation got overwritten somewhere along the way.

CI4 has built in interpolation (https://codeigniter.com/user_guide/outgoing/localization.html?highlight=localization#replacing-parameters) but it's in the form of sending an array. For example, the login view has image

and the language file has image

The former way was adding %1 and it looks like they got reverted to that.

@jekkos thoughts?

objecttothis commented 1 year ago

As with the old method, anyone translating will need to not modify any placeholders.

objecttothis commented 1 year ago

We would also need to think through how to find/replace existing translations with the proper placeholders. I would think that the easiest would be to do it in the code and have weblate scoop it up for all languages... Alternately we could walk through weblate and do one language at a time. This method would prevent issues with merging changes made on weblate vs what got pushed from the code.

jekkos commented 1 year ago

Actually what I did is copy all changes from master to the branch. I forgot that you modified the english translations. I will reinstantiate them and then try to spot any differences.

objecttothis commented 1 year ago

Actually what I did is copy all changes from master to the branch. I forgot that you modified the english translations. I will reinstantiate them and then try to spot any differences.

Oh, I see. I believe I only made changes to en-US and it should just be the ones with interpolation.

jekkos commented 1 year ago

I have reverted the en-US files now, but saw that eg expenses_categories content was the same, but that the keys in your version still contained the categories prefix. I have removed this one now.

jekkos commented 1 year ago

I have now also added the scripts to the repo, next to the ones that @owlbrudder made. This will allow us to convert changes submitted in master to the new format, as long as they keep coming in through weblate.

jekkos commented 1 year ago

All language submissions will remain on master for now. Before we merge the ci4 branch we will do a final sync. After that we'll need to change the file format in weblate to Laravel. This has proven to work with the import of the ci4-common component.

jekkos commented 2 months ago

Weblate should be setup to receive translation contributions on master.. seems the translation type was not set correctly yet