Closed bubeck closed 1 month ago
I've tried it out and noticed 2 things:
msg_index
s are off by one. The first index should be 0.#ifdef LV_I18N_OPTIMIZE
const char * singulars;
#else
should be
#ifdef LV_I18N_OPTIMIZE
const char ** singulars;
#else
Thanks for the feedback. Awkward, as I tried it out in my project and then cleaned up for PR and obviously made these mistakes. I will come up with an updated PR.
No problem :)
Problems fixed, and rebased to current master. Removed lodash. Ready for inclusion or feedback.
I don't understand how should it work. Existing C template is 100% working example, regenerated with npm run template_update
. In your commit index seems to be missed (at least, could not find it for current demo data).
lv_i18n.template.c
is included in the PR (and it is changed to deal with LV_I18N_OPTIMIZE.
Instead of lv_i18n.h
I am using lv_i18n.template.h
, as this also has to be generated (because it depends on --optimize or not).
I also added tests into test/c/Makefile
, where you can see, how it is used. There is a test for --optimize and without --optimize.
What exactly does not work or is unclear?
1.
This is header template. It seems to miss index for demo data. Idea of this script is to fill template with demo info from demo data. That helps to understand data structs, verify generator and so on.
Suffix "_optimized" (lv_i18n_get_text_optimized
) is not descriptive (it does not explain what happens). I'd suggest something like lv_i18n_get_text_by_idx
While I think even single extra option is "too much", there are 2 new options introduced - --raw-header
& --optimize
. Could you explain how those are expected to be used in real world?
I'd suggest to avoid dynamically generated sources if possible. Because that complicates data structs understanding and debug. That means:
LV_I18N_OPTIMIZE
(0/1), LV_I18N_OPTIMIZE_LEVEL
(0/1/2) and so on [names are for example only]).lv_i18n compile
run in Makefile as part of test build (if possible).In short - let's move from "coding" to "data design". As first step, i'd like to understand new data structs (including indexes). So:
.h
& .c
contains full demo info, and can be used "as is" after simple copy & rename.As soon as data structs clear for optimized mode, we will see how to use the same data for non-optimized mode, if possible. After data design complete, it would be easy to update code without extra reworks (i will help with js if needed).
Let me know if you see things different or have any questions.
First answers
will do
2. Suffix "_optimized" (`lv_i18n_get_text_optimized`) is not descriptive (it does not explain what happens). I'd suggest something like `lv_i18n_get_text_by_idx`
will do
3. While I think even single extra option is "too much", there are 2 new options introduced - `--raw-header` & `--optimize`. Could you explain how those are expected to be used in real world?
The real world example is currently: use --optimize
to generate optimized version with integer index. This must be generated into the header file (e.g. lv_i18n.h
) as it must be present at compile time of the _()
.
If you want to also use --raw
then I need a second filename of the header file to be generated. I wanted to use --raw
but this is only for the c file. I could remove --raw-header
and derive the name of the header file from the name given for --raw
by replacing the extension from .c
to .h
. Would that be better?
If you like, we can remove option --optimize
and always generate normal and optimized version. The developer can decide which version he want be defining #define LV_I18N_OPTIMIZE 1
outside the generated code before including lv_i18n.h
. Would you prefer this?
4. I'd suggest to avoid dynamically generated sources if possible. Because that complicates data structs understanding and debug. That means: * Make sure all templates are filled with demo data * Try to generate the same files for all modes and select desired mode with guards (`LV_I18N_OPTIMIZE` (0/1), `LV_I18N_OPTIMIZE_LEVEL` (0/1/2) and so on [names are for example only]). * Avoid `lv_i18n compile` run in Makefile as part of test build (if possible).
will do, if I get the above proposals from you.
In short - let's move from "coding" to "data design". As first step, i'd like to understand new data structs (including indexes). So:
* Please make sure `.h` & `.c` contains full demo info, and can be used "as is" after simple copy & rename. * Please make sure, all data edge cases are covered (missed base phrase, missed translation, missed plural form translation, different number of plurals in languages, and so on...). If needed - update demo data source.
The real world example is currently: use
--optimize
to generate optimized version with integer index. This must be generated into the header file (e.g.lv_i18n.h
) as it must be present at compile time of the_()
.If you want to also use
--raw
then I need a second filename of the header file to be generated. I wanted to use--raw
but this is only for the c file. I could remove--raw-header
and derive the name of the header file from the name given for--raw
by replacing the extension from.c
to.h
. Would that be better?
I mean, as user of this package, I don't care about internals. I just need to "rebuild translations" in one step. From this point of view, introducing a lot of options just confuse the user with unnecessary details.
Output of this package is "set of i18n files", not separate .h
and separate .c
file. Those should be generated simultaneously.
Also note, in far future, this package can be extended with writer for another language (Rust
and so on). So, option should not be languange-specific when possible.
If you like, we can remove option
--optimize
and always generate normal and optimized version. The developer can decide which version he want be defining#define LV_I18N_OPTIMIZE 1
outside the generated code before includinglv_i18n.h
. Would you prefer this?
Yes, if that's technically possible. Because different modes can require different data layouts. That's why I suggested to start with update of templates, to verify everything about data and minimise rework.
Next round. Please give me feedback and directions.
The following is done:
lv_i18n_get_text_optimized()
to lv_i18n_get_text_by_idx()
--raw-header
. Instead of specifying a header file name for raw, it is derived from the filename given to --raw
by changing extension to .h
.template_update
so that it updates templates for header file as well,The template is now generated for mode "not optimized" as this was the previous mode. If you want, you can regenerate with --optimize
by changing template_update
.
Question: What is missing, what should be changed?
Sorry, was busy today. If you have time, please add "optimized" templates near current ones, temporary. All I need to verify/rework data structs - template files. If running of script needed - that requires to remember how that works :)
No Problem. Here we are. This is the code also generated for test/c.
Sorry for delay, see summary below
IMO it would be more useable, if the i18n compiler always generates 2 files (.c/.h), containing both optimized and slow versions (Selected by LV_I18N_OPTIMIZE guard). The content of each file could look like this:
// common part
#if LV_I18N_OPTIMIZE = 1
// optimize-specific part
#else
// slow-specific part
#endif
Plurals still use the slow version. IMO, they should have a separate index, with the same principle as singular.
Data struct for each language plural can be:
{
plural_forms_amount: N // depends on language
list_of strings: {
plural1_form1,
plural1_form3,
plural1_form4,
plural2_form1,
plural2_form3,
plural2_form4,
plural3_form1,
plural3_form3,
plural3_form4,
}
}
Or, even more straightforward - always reserve room for all 6 forms. Overhead will not be significant. I do not insist on any concrete implementation, only explain how to reorganize files in a more structured form.
Please note that data design and usability are the most important things for writing code without rework. For this stage, it's enough to write .c/.h templates manually, without coding at all (but I do not insist if you are ok with core rewrites).
\
at the end).LV_I18N_OPTIMIZE
is not set, well-known compilers can be auto-detected (not sure if that's a good idea).next round, please feedback
#define LV_I18N_OPTIMIZE
before including lv_i18n.h
. LGTM in general.
The only note about internals - please avoid data/code mixture if possible https://github.com/lvgl/lv_i18n/blob/a64d2baea7a0d8e7a5d9351b457f4682fa028fb8/test/c/combined-sample/lv_i18n.c#L61-L65. It's more clear to define structs separate for each mode.
Ideally would be to have 1 section for optimized and 1 section for slow variant per file. That will simplify understanding and next modifications.
Please add indexed plurals to example.
. It's more clear to define structs separate for each mode.
You mean something like
static const lv_i18n_lang_t en_gb_lang = {
.locale_name = "en-GB",
.singulars = en_gb_singulars,
.plurals[LV_I18N_PLURAL_TYPE_ONE] = en_gb_plurals_one,
.plurals[LV_I18N_PLURAL_TYPE_OTHER] = en_gb_plurals_other,
.locale_plural_fn = en_gb_plural_fn
};
static const lv_i18n_lang_indexed_t en_gb_lang_indexed = {
.locale_name = "en-GB",
.singulars_indexed = en_gb_singulars_indexed,
.plurals[LV_I18N_PLURAL_TYPE_ONE] = en_gb_plurals_one,
.plurals[LV_I18N_PLURAL_TYPE_OTHER] = en_gb_plurals_other,
.locale_plural_fn = en_gb_plural_fn
};
This would mean, that we have different lv_i18n_lang_t
and lv_i18n_lang_indexed_t
and this would mean, that we have to duplicate all functions taking lv_i18n_lang_t
as a parameter so that there is another similar function dealing with lv_i18n_lang_indexed_t
.
I mean, this is your code, and I will do as you select, but IMHO this adds much complexity to the software without much benefit in clarity.
Please let me know how you want to continue for data design. After this is clear, I will continue with plurals.
This would mean, that we have different
lv_i18n_lang_t
andlv_i18n_lang_indexed_t
and this would mean, that we have to duplicate all functions takinglv_i18n_lang_t
as a parameter so that there is another similar function dealing withlv_i18n_lang_indexed_t
.
IMO, current names difference is caused by old plurals code (because you use both approaches instead of single one). When indexed and non indexed parts will be in separate two sections, probable suffixes will not be required at all.
I mean, this is your code, and I will do as you select, but IMHO this adds much complexity to the software without much benefit in clarity.
Please do data design in any format you prefer. Until data design is complete, all next steps are very fragile. Right now it's not critical how the code looks like. The only principal point is the complete data layout.
I will make a proposal within the next day(s).
Is this the way you want it?
May be. Could you add indexed plurals part. It's very difficult to dicsuss details with incomplete data.
OK.
I hope that I understood you right. Please find a complete example of the code at https://github.com/lvgl/lv_i18n/pull/55/commits/6634aa03b1b3d6711f687e1cad0eaa6f1b3ad0cd. It is done for singular, as well as plural, but it has not been tested. It is meant to help discussion the solution which we try to find. It is hand written and not yet generated.
Please let me know your feedback.
https://github.com/lvgl/lv_i18n/commit/bdbdfe47dd96c4a367db82d546c5977d8b709768 see my proposal.
All code regrouped to leave only 2 sections. IMHO that significantly improves readability/debug/maintenance at a very small cost (duplicated some structs definitions).
_by_idx
, _simple
, _singular
and other suffixes of such kind (not critical now).Tell me, if you agree with this proposal. If you have no objections, then I will refactor fn/struct names for the final variant, and the templates will be ready for coding.
Sorry for the delay, ... Family time...
I have no objections against your proposal. It is fine from my side. I am now waiting for your refactorization for the final variant. Please let me know, if I should do anything.
https://github.com/lvgl/lv_i18n/commit/af62c7e2c95cbb332feb2121e41780cae05fb718 - polished some names.
After first refactoring pass, i found we can unify things even better - switch to concept of compiler/runtime lookup index.
Current "slow search" data:
static lv_i18n_phrase_t en_gb_singulars[] = {
{"s_en_only", "english only"},
{"s_translated", "s translated"},
{NULL, NULL} // End mark
};
#define _(text) lv_i18n_get_singular_by_map(text)
will become:
// For slow search only
static const char * en_gb_singulars_idx[] = {
"s_en_only",
"s_translated"
NULL // End mark
}
// The same as optimized
static const char * en_gb_singulars[] = {
"english only", // 1=s_en_only
"s translated", // 2=s_translated
NULL, // 3="s_untranslated"
};
// The same thing, but with runtime index search
#define _(text) lv_i18n_get_singular_by_idx(text, get_singular_idx(text))
That will unify almost everything:
And no more difference! That will also simplify code generator.
What do you think?
PS happy new year :)
Very good idea!! Great!
Thanks! Happy new year to you as well. We still have 5 hours to go
https://github.com/lvgl/lv_i18n/commits/tmp_template
IMO that's ready to go. I'm not an expert in C, and some minor polish is probably needed. But the data layout should be ok.
I think, you are "off by 1": https://github.com/lvgl/lv_i18n/blob/39a97d43a38170eb341b7909de4b2ff2f906369a/src/lv_i18n.template.new.c#L116
Use txt = lang->singulars[msg_index];
(many times the same error).
Looks nice and simple! How should we proceed? Who is doing what?
I think, you are "off by 1"
You are right. https://github.com/lvgl/lv_i18n/commit/39a97d43a38170eb341b7909de4b2ff2f906369a - I tried to fix indexing enumeration, but missed -1
correction removal.
How should we proceed? Who is doing what?
I can't spend time for C code, but can do JS if that's difficult for you.
IMHO next step - update C tests and make sure everything is right in new template.
IMHO next step - update C tests and make sure everything is right in new template.
Will do.
Done. Please find new version in https://github.com/lvgl/lv_i18n/pull/55/commits/a8958eeca6bf48c18f276e449055bd58730e9ddf
$ make c
gcc -std=c99 -pedantic -Wall -Wextra -Wconversion -Werror -Wno-switch-enum -Wno-double-promotion -Wbad-function-cast -Wcast-qual -Wold-style-definition -Wshadow -Wstrict-overflow -Wstrict-prototypes -Wswitch-default -Wundef -I combined-sample-2 -I unity/src -I build unity/src/unity.c combined-sample-2/lv_i18n.c test.c -o build/test.exe
./build/test.exe
test.c:249:test_init_should_ignore_NULL_input:PASS
test.c:250:test_init_should_ignore_empty_language_pack:PASS
test.c:251:test_init_should_work:PASS
test.c:254:test_set_locale_should_fail_before_init:PASS
test.c:255:test_set_locale_should_work:PASS
test.c:256:test_set_locale_should_fail_not_existing:PASS
test.c:259:test_get_text_should_work:PASS
test.c:260:test_get_text_should_fallback_to_base:PASS
test.c:261:test_get_text_should_fallback_to_orig:PASS
test.c:264:test_get_text_plural_should_work:PASS
test.c:265:test_get_text_plural_should_fallback_to_base:PASS
test.c:266:test_get_text_plural_should_fallback_to_orig:PASS
test.c:269:test_should_fallback_without_langpack:PASS
test.c:270:test_empty_base_tables_fallback:PASS
test.c:271:test_empty_plurals_fallback:PASS
test.c:272:test_empty_content_check:PASS
-----------------------
16 Tests 0 Failures 0 Ignored
OK
gcc -std=c99 -pedantic -Wall -Wextra -Wconversion -Werror -Wno-switch-enum -Wno-double-promotion -Wbad-function-cast -Wcast-qual -Wold-style-definition -Wshadow -Wstrict-overflow -Wstrict-prototypes -Wswitch-default -Wundef -DLV_I18N_OPTIMIZE -I combined-sample-2 -I unity/src -I build unity/src/unity.c combined-sample-2/lv_i18n.c test.c -o build/test.exe
./build/test.exe
test.c:249:test_init_should_ignore_NULL_input:PASS
test.c:250:test_init_should_ignore_empty_language_pack:PASS
test.c:251:test_init_should_work:PASS
test.c:254:test_set_locale_should_fail_before_init:PASS
test.c:255:test_set_locale_should_work:PASS
test.c:256:test_set_locale_should_fail_not_existing:PASS
test.c:259:test_get_text_should_work:PASS
test.c:260:test_get_text_should_fallback_to_base:PASS
test.c:261:test_get_text_should_fallback_to_orig:PASS
test.c:264:test_get_text_plural_should_work:PASS
test.c:265:test_get_text_plural_should_fallback_to_base:PASS
test.c:266:test_get_text_plural_should_fallback_to_orig:PASS
test.c:269:test_should_fallback_without_langpack:PASS
test.c:270:test_empty_base_tables_fallback:PASS
test.c:271:test_empty_plurals_fallback:PASS
test.c:272:test_empty_content_check:PASS
-----------------------
16 Tests 0 Failures 0 Ignored
OK
If everything is OK, I would be glad, if you can take over JS.
Ping? Are you still interested in this? How can we proceed? I am willing to help.
Thanks for pinging it! There are no obstacles from my side. As it's hidden behind the flag it shouldn't be a big risk to add it.
Let's wait a few days if @puzrin comments. If not, we can merge it. (Please fix the merge conflicts)
PR is ready for inclusion.
However, I am not a javascript expert, so my javascript implementation is probably a bit "old-style" and procedural. Sorry for that. Feel free to improve.
@kisvegabor, it seems I can't provide the required amount of feedback within a reasonable time frame. It's better to assign more collaborators. To have easy rollbacks - just release new things in new major version.
I would propose to merge and release as new major version as soon as possible, as this change has a massive performance improvement. As long as noone uses --optimize, the impact should be small. In any case, if there are problems, I am surely willing to help.
All current test cases suceed.
it seems I can't provide the required amount of feedback within a reasonable time frame. It's better to assign more collaborators. To have easy rollbacks - just release new things in new major version.
Thank you for mentioning it.
Merging now. Let's wait for a few more features and make a major release then.
Thanks! I will watch issues for any problem reports
Thank you!
Fixes https://github.com/lvgl/lv_i18n/issues/53