Closed objecttothis closed 1 year ago
Also, currently only en-US language files in #3271 has been ported to the CI4 format. We need to figure out if we have to convert the rest of them by hand or if weblate can do this for us.
@objecttothis aren't the language files already associative arrays at this point? We have a lang variable (array) with key and assign a value to it. The syntax you show here will probably yield the same result?
I could look into this but preferably I'll try to match with what is in weblate as the current format nicely works.
@objecttothis aren't the language files already associative arrays at this point? We have a lang variable (array) with key and assign a value to it. The syntax you show here will probably yield the same result?
Sorry, you are correct. There are two things to consider though:
return $lang;
to the bottom of each file. Can we customize Weblate's generator to do this?@objecttothis I need to take some time to see how the file load mechanism works exactly. I suppose in the past CI was just loading one file and thus overwriting the keys with new values every time a language was selected.
Do we keep the MY_Lang extension that we had in the old install or do we need to modify it? If we keep it then we can probably bootstrap the existing files and then we're also certain it will work with weblate. I wouldn't really worry too much about the CI convention here, as long as the strings go where they should.
I'll be off coming week on a hiking trip, so I might not be able to look into this until after then.
From https://codeigniter.com/user_guide/installation/upgrade_4xx.html
https://codeigniter.com/user_guide/installation/upgrade_localization.html gives the details of upgrading localization. The extensions of localizations that we did in 3.x is one of the things that I am unfamiliar with. I'm certainly not opposed to extending and overriding CI4 functionality, but I don't think it's a good idea to do it just to avoid code rewrites because it opens the software up to difficulty upgrading in the future and potentially vulnerability patches that CI4 makes in x area that we aren't taking advantage of because we've overridden functionality in that area.
That said, I haven't yet looked at porting the various framework extensions that we've made, so I guess we need to create an issue for that one too.
@objecttothis Good point indeed, let's not extend it unless there is no other way. I'll have to look in to the customization to see what it does exactly and will then check how to accomplish the same behavior in CI4.
An easy fix for the language files would be to add an array declaration at the top of all the files, $lang = []
and a return $lang
at the end.
@objecttothis Good point indeed, let's not extend it unless there is no other way. I'll have to look in to the customization to see what it does exactly and will then check how to accomplish the same behavior in CI4.
An easy fix for the language files would be to add an array declaration at the top of all the files,
$lang = []
and areturn $lang
at the end.
OK, let's see what ends up being needed. I converted all the files for en-US by hand already to the CI4 format, but that can be reverted if we find that Weblate cannot easily output that format for all the other languages and it's possible to use the CI3 formatting in CI4. It sounds like CI3 formatting could be made to work with your suggestion here.
I'lll try to have a look later this week.
@objecttothis it seems the CI4 format is also supported by weblate
That's great news.
I can try to convert the files and then import them in weblate.. but first I'd like to get the tests running so we can deploy the branch.
That is reasonable. By convert the files, they should already be in CI4 format... Or do you mean convert as in import them into weblate? Getting tests working first is fine. I haven't done much this week on it because I'm having to put a fire out with the website.
[cli tool](http://docs.translatehouse.org/projects/translate-toolkit/en/latest/commands/php2po.html)
and it also exists the other way around.. but perhaps it might be faster to parse the files in a small php script and dump them back in the new format.I'll use this bash oneliner to do the format conversion with the files from master. Below the convert.php snippet that it uses. It parses the array inside current language files and echoes it back in the new format.
for file in app/Language/**/*.php; do f=$(basename $file); name=${f^}; output=$(dirname $file)/${name%_lang.php}.php; docker run -ti -v "$(pwd):/root" php:cli-alpine -c php /root/convert.php /root/$file > $output; done
<?php
include($argv[1]);
echo "<?php\n";
echo "return [\n";
foreach ($lang as $key=>$value) {
echo "\t'$key' => '$value',\n";
}
echo "];";
?>
I have pushed the format conversions for this now. Next step is to try and import a component with the new format in weblate. Perhaps some upgrade will be needed, let's see how this goes.
I tried to import the converted files in weblate but bumped into an issue, I used single quotes in the converted files, which caused some imports to fail. Updated now to double quotes. Will check if this works now.
@objecttothis I have imported the CI4 converted strings for the Common component here now: https://translate.opensourcepos.org/projects/opensourcepos/ci4-common/
This was done to test if weblate can parse the new format. What remains for this story is updating the language filenames in the exisiting components after we merge the branch to master. I can do this directly in the weblate postgres. Then we can let weblate reindex the new files.
@jekkos In my troubleshooting I was just noticing good news and bad news. The good news is that the CI4 language loader is working as expected. The bad news is that in the manipulations that you were working on to get weblate working with the new format that weblate overwrote the language line names with the original names but the new array format. See the screenshot in the first post of this issue for example. The CI format was
This is what it currently looks like:
This is what it should look like:
Notice that the prefixes got added back in? The code is calling lang('Attributes.confirm_delete');
so it's expecting the line in the language file to be "confirm_delete" but it's actually "attributes_confirm_delete". In order for the code to work, we need to get the have the keys match. There are two ways of doing that:
If we do the first, that means two things: Project-wide find/replace for lang('Attributes.
and then replace that with lang('Attributes.attributes_
and repeat for each of the language files. the second is that our calls to lang become, IMO, unnecessarily redundant (e.g., lang('Attributes.attributes_confirm_delete')
instead of lang('Attributes.confirm_delete')
)
If we do the second that means two things: find/replace in the app/Language/ folder for "attributes_
and replace it with just "
then repeat for each of the other prefixes found in language files of various languages. The second is that we need to figure out how to get that message to weblate so that it doesn't generate or look for the language strings with those prefixes.
I prefer to do option 2 because of the simplicity of the lang() calls and how easy the code is to change, but I also don't know how much work that is to do on the weblate side.
btw, I think the reason that weblate was having issues with the single quotes is because of the string interpolation. on the PHP side. You may notice the 2nd image has escape characters for single quotes. Weblate may not have been handling those correctly. In the end if we have to use double-quotes, it will call the parser for each key and value, but I don't imagine it will destroy the performance of the application that much since the arrays don't usually show up in long-running loops.
Also, string interpolation of localization calls is slightly different in CI4. For example:
[/application/language/en-US/login_lang.php]
$lang["login_welcome"] = "Welcome to %1!";
[/application/views/login.php]
<?php echo $this->lang->line('login_welcome', $this->lang->line('common_software_short')); ?>
Becomes
[/app/language/en-US/Login.php]
"welcome" => "Welcome to {install_name}",
[/app/views/login.php]
<?php echo lang('Login.welcome', ['install_name' => lang('Common.software_short')]) ?>
We can alternately do
[/app/language/en-US/Login.php]
"welcome" => "Welcome to {0}",
[/app/views/login.php]
<?php echo lang('Login.welcome', [lang('Common.software_short')]) ?>
I kind of like the associated array version, but don't feel incredibly strongly about it. Doing associated arrays could make it clearer to the translator what the placeholder is standing for, but it could also cause people to translate what's in the placeholders unless they are reminded not to translate anything inside of { }.
@jekkos after we make a decision on which of the two options to use for string interpolation, we will need to make the changes in these language files. I believe I had made those changes before but they were overwritten. I don't know how to make the changes in the code so that weblate picks them up so can you either do it, or let me know what changes I can make so that weblate picks it up. I'm guessing all the languages will need to have changes because I'm pretty the %1, etc is scattered through all of them.
BTW, CI4 lang() offers some cool functionality in the interpolation.
in the language files you can specify in the placeholder the type of value you are sending it and formatting.
In the lang() call you can send the locale (dynamically) which causes things like prices to be displayed with the correct currency symbol, side, etc. Perhaps something for a future enhancement. I think for this change we only need the interpolation as is in the correct format.
Option 2 sounds like the right one. My hunch is that before you needed this prefix because the language array was only loaded for one language, and switching between languages would then just overwrite the values for the keys you loaded in memory. On the weblate side wel need to reindex the new keys, I don't think that should be an issue, the filenames are already different and that also needs to be changed once we merge.
Good that the interpolation options are also better now. Let's stick with what we have first and we can change this eventually later on if needed.
btw great work on fixing the language loading, I can look into the key changes later this week.
btw great work on fixing the language loading, I can look into the key changes later this week.
Thank you. Once the branch has the updated key names, it should work as is. Out of curiosity, for future changes to language strings, does Weblate automatically pick up additions and deletions if they are placed in the en-US (master) language, or do you have to manually pull in changes?
They are indeed picked up from the master language component. So if the file is changed it will be indexed by weblate.
They are indeed picked up from the master language component. So if the file is changed it will be indexed by weblate.
OK, I pushed the corrections to the string names in en-US. I also pushed the change from %1, %2 to {0}, {1}, but someone will likely need to not only push those changes to the other translations since those translations are done, but we will need to revisit these to add in the ICU references for numbers, prices, etc.
@jekkos can you confirm that weblate properly picked up the changes in en-US and applied the new string names?
I have updated the repo but now there seems to be an issue
I have updated the repo but now there seems to be an issue
It looks to me like it picked up the new key names but created them as blank keys and kept the old keys so you end up with duplicates. It makes sense that weblate wouldn't be able to automatically figure out that these are the same strings with different names. I'm not sure the best way to approach this, but potentially one option would be to mass change the key names in all languages (not that difficult with PHPStorm) then see if we can clear weblate's database and have it hoover up all the files from all the languages from the ci4-upgrade branch. Can weblate do that? Then we could go back to what we were doing before with it automatically picking up changes to en-US. Thoughts?
I'll try to rerun the import again this weekend, see how that works out
I have synced the language files again with master, but should get rid of the conflicts. After that I'll try and import again in weblate.
The sync files are no longer CI4 lang files.
Ah how do you mean? The structure is the same no? I took latest version from master and converted them to CI4 format. I found out what the issue is with the weblate keys @objecttothis . There is a bug in the laravel string parser that is causing the wrong keys to be imported: https://github.com/translate/translate/pull/4728 . A solution was in the works there, I could try to patch the weblate install and see if that resolves the issue. Or another option might be to switch to gettext format. It should be better supported by weblate, but then I need to check if CI4 also supports it natively.
Here is some of the previous CI4 file structure. return [ "address" => "Company Address", "address_required" => "Company address is a required field.", "all_set" => "All file permissions are set correctly!", "allow_duplicate_barcodes" => "Allow Duplicate Barcodes", "apostrophe" => "apostrophe", "backup_button" => "Backup", "backup_database" => "Backup Database", "barcode" => "Barcode", "barcode_company" => "Company Name", "barcode_configuration" => "Barcode Configuration", "barcode_content" => "Barcode Content",
The following is what showed up in my merge this morning CI3 structure.
return [ "config_address" => "Company Address", "config_address_required" => "Company address is a required field.", "config_all_set" => "All file permissions are set correctly!", "config_allow_duplicate_barcodes" => "Allow Duplicate Barcodes", "config_apostrophe" => "apostrophe", "config_backup_button" => "Backup", "config_backup_database" => "Backup Database", "config_barcode" => "Barcode", "config_barcode_company" => "Company Name", "config_barcode_configuration" => "Barcode Configuration", "config_barcode_content" => "Barcode Content", "config_barcode_first_row" => "Row 1", "config_barcode_font" => "Font", "config_barcode_formats" => "Input Formats",
Ok I see the keys are incorrect, I still need to remove the first part, didn't notice that. Thanks for pointing that out.
I have adjusted the prefixes now. Also noticed that in expense_categories the prefix removal was not consistent before. I should check if this works as it should in that screen.
Ok happy to announce that the import has worked now, after patching weblate with the code from the PR. I have imported now only the common translations from the ci4-branch, this can be viewed here
Ok happy to announce that the import has worked now, after patching weblate with the code from the PR. I have imported now only the common translations from the ci4-branch, this can be viewed here
Thanks for the hard work on this. The formatting looks correct. It looks like some changes I made to the base en-US translation got overwritten somewhere along the way.
CI4 has built in interpolation (https://codeigniter.com/user_guide/outgoing/localization.html?highlight=localization#replacing-parameters) but it's in the form of sending an array. For example, the login view has
and the language file has
The former way was adding %1 and it looks like they got reverted to that.
@jekkos thoughts?
As with the old method, anyone translating will need to not modify any placeholders.
We would also need to think through how to find/replace existing translations with the proper placeholders. I would think that the easiest would be to do it in the code and have weblate scoop it up for all languages... Alternately we could walk through weblate and do one language at a time. This method would prevent issues with merging changes made on weblate vs what got pushed from the code.
Actually what I did is copy all changes from master to the branch. I forgot that you modified the english translations. I will reinstantiate them and then try to spot any differences.
Actually what I did is copy all changes from master to the branch. I forgot that you modified the english translations. I will reinstantiate them and then try to spot any differences.
Oh, I see. I believe I only made changes to en-US and it should just be the ones with interpolation.
I have reverted the en-US files now, but saw that eg expenses_categories content was the same, but that the keys in your version still contained the categories
prefix. I have removed this one now.
I have now also added the scripts to the repo, next to the ones that @owlbrudder made. This will allow us to convert changes submitted in master to the new format, as long as they keep coming in through weblate.
All language submissions will remain on master for now. Before we merge the ci4 branch we will do a final sync. After that we'll need to change the file format in weblate to Laravel. This has proven to work with the import of the ci4-common component.
Weblate should be setup to receive translation contributions on master.. seems the translation type was not set correctly yet
Issue / Bug / Question / New Feature
Unlike CI3, CI4 ranges translation phrases into an associative array. I'm not sure of the ingest/export mechanism of Weblate but we may need to tweak the configuration so that translation files get generated properly.