Closed amirgon closed 1 year ago
We can add the points to check in the PR template:
Here are the things that this list should cover, ordered by importance.
Importance is based on how frequent these are missed, and how severe the problem is when they are missed.
Define gc roots where needed
In my experience people usually don't know what a GC root is. We should explain the gist briefly.
E.g. Do notmalloc
into a static
or global variables. Instead declare the variable in LV_ITERATE_ROOTS
list in lv_gc.h
and mark the variable with GC_ROOT(variable)
when it's used.
Follow callback conventions
How to summarize it in 1-2 sentences?
Follow function naming
How to summarize it in 1-2 sentences?
Use enums instead of macros
Prefer enum
s instead of macros. If inevitable to use define
s export them with LV_EXPORT_CONST_INT(defined_value)
right after the define.
How to define strings and arrays
In function arguments prefer some_type name[]
declaration for array parameters instead of some_type * name
Use typed pointers instead of void pointers
Prefer typed pointers instead of void *
.
Hmm, is it feasible to write a detector for missing GC roots (only for test purposes, not for regular applications)? It seems that the general idea would be to keep a list of all malloc
allocations and verify that they are reachable from at least one of the GC roots. We have been bitten by that specific convention enough times in my opinion. :laughing:
Any idea how to test it? :smile:
The above was just a list. See my suggestion below for "Define gc roots where needed" bullet, comments are welcome!
is it feasible to write a detector for missing GC roots
It would be best to detect that with Static Analysis, where we don't run any code so we don't need test coverage.
What you suggest is Dynamic Analysis which requires running the code.
To run the code we would need an example. If we already have an example that runs the code (and also a Micropython example!) then we'll hit the problem anyway, either with Dynamic Analysis or without it.
Indeed it could be nicer to get a verbose error message about a missing gc-root instead of a segmentation fault, but we'll need to see if it's worth the effort.
btw, allocations don't have to be reachable only from gc roots - they can also be reachable from the stack, so stack would need to be scanned too. And these allocation chains are not limited to lv_malloc
, and would usually go through Micropython allocated objects as well.
A different idea (possibly simpler?) is to find out what parts of the data segment is used for LVGL pointers (can probably be done by querying ELF debug symbols), then scan these pointers to see if any of them is pointing to Micropython heap (the area that is garbage-collected). Ideally this should be tested right before Micropython garbage collection, but we can also test this sooner.
When LVGL runs in Micropython, all dynamic memory allocations (lv_malloc
) are handled by Micropython's memory manager which is garbage-collected (GC).
To prevent GC from collecting memory prematurely, all dynamic allocated RAM must be reachable by GC.
GC is aware of most allocations, except from pointers on the Data Segment:
Such pointers need to be defined in a special way to make them reachable by GC
Problem happens when either global, static global or static local pointer variable is allocated dynamically (with lv_malloc
)
LV_GC_ROOT(_var)
lv_gc.h
on files that use LV_GC_ROOT
_var
to LV_ITERATE_ROOTS
on lv_gc.h
https://github.com/lvgl/lvgl/commit/adced46eccfa0437f84aa51aedca4895cc3c679c
Can we use the MP binding generator to somehow detect functions/patterns not following the convention? E.g. searching for specific text in lv_mpy.c
like "error" if the the binding generation failed.
Can we use the MP binding generator to somehow detect functions/patterns not following the convention? E.g. searching for specific text in
lv_mpy.c
like "error" if the the binding generation failed.
In a limited way - we can probably detect some cases.
Today when the binding script doesn't find user_data
in a callback - it doesn't fail. It simply doesn't generate the binding for that callback. I could easily change the script to fail with an error in such cases.
For example, you can find this in the generated bindings ports/unix/build-dev/lvgl/lv_mpy.c
:
/*
* Function NOT generated:
* Missing 'user_data' as a field of the first parameter of the callback function 'lv_anim_t_exec_cb_callback'
* lv_anim_exec_xcb_t exec_cb
*/
However, if user_data
appears as an argument but is not passed correctly from the callback registration function to the callbacks themselves - we will not detect that problem.
It's hard to detect other things.
The script can't know when it makes sense to use void *
instead of Type *
for example. The script can't guess if a function should be a member of a class or a struct when it's not named correctly, perhaps it is just a global function under lvgl module. etc.
I think showing at least some potential user_data
issues would be an important step. But I wonder how can we filter out the the known "Function NOT generated" reports.
I think showing at least some potential
user_data
issues would be an important step. But I wonder how can we filter out the the known "Function NOT generated" reports.
I agree, but let's first formulate the Coding Convention list, and deal with this later.
I've added above an example for "Define gc Roots Where Needed". Could you comment on this?
@amirgon I think the example is great!
If we formulate a list of conventions in the docs then I think we just need to add a link in the current checklist.
For GC roots I can imagine it like:
Do not
malloc
into astatic
orglobal
variables. Instead declare the variable inLV_ITERATE_ROOTS
list inlv_gc.h
and mark the variable withGC_ROOT(variable)
when it's used. Lear more [here](link to docs with the longer explanation).
I think GC will be good this way.
The 2 other important points are callback and naming conventions. I've grouped the related notes from https://github.com/lvgl/lvgl/issues/1763 . Is there anything else? If the list is ready we can simplify and shorten them.
Follow callback conventions
Follow function naming
For GC roots I can imagine it like:
Do not
malloc
into astatic
orglobal
variables. Instead declare the variable inLV_ITERATE_ROOTS
list inlv_gc.h
and mark the variable withGC_ROOT(variable)
when it's used. Lear more [here](link to docs with the longer explanation).I think GC will be good this way.
Sounds good.
The 2 other important points are callback and naming conventions. I've grouped the related notes from lvgl/lvgl#1763 . Is there anything else? If the list is ready we can simplify and shorten them.
The list covers it I think.
But something when wrong with the copy-paste from lvgl/lvgl#1763, for example lv__create
should have been lv_<widgetName>_create
, etc.
I've tried to summarize the function name and callback conventions. What do you think?
lv_<widget_name>_create(lv_obj_t * parent
) pattern.lv_<widget_name>
and should receive lv_obj_t *
as first argument which is a pointer to widget object itself. struct
s should be used via an API and not modified directly via their elements.struct
APIs should follow the widgets' conventions. That is to receive a pointer to the struct
as the first argument, and the prefix of the struct
name should be used as the prefix of the function name too (e.g. lv_disp_set_default(lv_disp_t * disp)
)struct
s which are not part of the public API must begin with underscore in order to mark them as "private".struct
, that struct
should contain a void * user_data
field. A pointer to the struct
should be the first argument of the callback registration function. The callback should receive either a pointer to the struct
as its first argument OR the same user_data
which was used at registration as its last argument. Callbacks not following these conventions should contain the string xcb
.
- When registering callbacks in a
struct
, thatstruct
should contain avoid * user_data
field. A pointer to thestruct
should be the first argument of the callback registration function. The callback should receive either a pointer to thestruct
as its first argument OR the sameuser_data
which was used at registration as its last argument. Callbacks not following these conventions should contain the stringxcb
.
I think that callbacks, like gc, deserve some background and examples.
Most of the issues we had with Micropython bindings compatibility were related to callbacks. People just don't get the point of user_data
and even when they add it they don't understand why it's needed and often use it the wrong way.
I also think it's better to clearly show the two options for callbacks, one with user_data
as struct member and the other with user_data
as function argument, and provide an example for each.
It's also important to point out that the same user_data
that is provided in the callback registration must be passed over to the callback when it's called.
struct
s should be used via an API and not modified directly via their elements.
This is not accurate. The bindings knows how to handle struct members and allows the user to read and write them.
The goal here was to minimize the API and reduce the binding code size, by making structs private and exposing only their public members through getter/setter API functions.
But if a struct is fully public and the user is supposed to access all its members anyway, there is no point in adding an API to access it. In fact, this would be wasteful and make the API bigger for no reason.
I think that callbacks, like gc, deserve some background and examples.
We can add link to more details but it's important have clear rules that we can easily check without any boilerplate.
I also think it's better to clearly show the two options for callbacks, one with user_data as struct member and the other with user_data as function argument, and provide an example for each. It's also important to point out that the same user_data that is provided in the callback registration must be passed over to the callback when it's called.
It's mentioned, but do you think it's not clear enough? _"The callback should receive either a pointer to the struct as its first argument OR the same userdata which was used at registration as its last argument."
structs should be used via an API and not modified directly via their elements.
This is not accurate. The bindings knows how to handle struct members and allows the user to read and write them.
It's a general LVGL rule too. The only place where we have used direct struct
access was the drivers but I've already changed it in arch/drivers. I've also hidden the huge lv_disp_t
and lv_indev_t
types in lv_disp_priv.h and lv_indev_priv.h.
It's mentioned, but do you think it's not clear enough? _"The callback should receive either a pointer to the struct as its first argument OR the same userdata which was used at registration as its last argument."
It's a long and a bit convoluted sentence... I think that two separate "bullets" that present each option might be clearer.
It's a general LVGL rule too. The only place where we have used direct
struct
access was the drivers but I've already changed it in arch/drivers. I've also hidden the hugelv_disp_t
andlv_indev_t
types in lv_disp_priv.h and lv_indev_priv.h.
Private structs and headers are excellent! They will reduce the binding code, which is already huge.
To make it work with Micropython bindings, just need to make sure that:
void *
but as Struct *
, where Struct is declared on a public header as forward declaration typedef struct Struct Struct;
It's a long and a bit convoluted sentence... I think that two separate "bullets" that present each option might be clearer.
What about this?
struct
that struct
must have a void * user_data
field. struct
for which the callback is registered void * user_data
. struct
as its first argument user_data
which was used at registration as the last argument. xcb
.But it seems it's not accurate as lv_obj
event callback are registered in lv_obj_t
but the callbacks receive lv_event_t * e
as their only argument. So it's neither a "a pointer to the struct
as its first argument " nor _"the same user_data
which was used at registration as the last argument"_ . Although the same user_data
is passed to lv_event_t
which was used during registration. So the callback convention part maybe should be:
user_data
which was used at registration either
struct
as the callbacks first argument If it's correct _"When callbacks can be registered for a struct
that struct
must have a void * user_data
field."_ is not mandatory as the user_data
can be stored anywhere anyhow. The key is that the user_data
used during registration should be passed to the callback in struct or directly as an argument.
To make it work with Micropython bindings, just need to make sure that: [...]
Yep, it works like this.
- When callbacks can be registered for a
struct
thatstruct
must have avoid * user_data
field.
I'm not sure it's clear what "callbacks can be registered for a struct
" means.
Because callbacks don't necessarily relate to structs. Even when they do, the relation can differ from case to case.
In callback registration functions
- the first argument should be a pointer to the
struct
for which the callback is registered- and the last parameter should be
void * user_data
.
This is not true. You don't need both first argument to be a pointer to the struct
and the last parameter to be void * user_data
.
You either need to pass the struct (which contains user_data), or pass user_data as last argument, regardless of struct.
But it seems it's not accurate as
lv_obj
event callback are registered inlv_obj_t
but the callbacks receivelv_event_t * e
as their only argument.
When events are registered in lv_obj_add_event_cb
, the last argument is user_data
.
When they are called, the same user_data
is passed in lv_event_t
.
This is possible only because user_data
is copied correctly. This has special handling in LVGL on event_send_core
, without that - it wouldn't work.
Callbacks should receive the same
user_data
which was used at registration either
- in a
struct
as the callbacks first argument- or as the callback's last argument.
Correct but not complete, it doesn't say how user_data is passed during registration.
I think that to make it accessible and simple for the users, we can simply present the two options:
void * user_data
field.Or
void * user_data
is provided as the last argument of both registration and callback.It's true that we can mix these two options, as long as user_data
is passed from registration to callback, but maybe we can omit that from the short-version of the rules for the sake of simplicity. On the longer explanation we can give the lv_event_t
example for mixing the options.
Some short and simple examples could be very useful to clarify everything, I think.
The key is that the
user_data
used during registration should be passed to the callback in struct or directly as an argument.
Yes, I agree that this is the key. It's good to mention this, but it's not enough.
We must also explain how a "registration" should look like and how a "callback" should look like.
Because the binding script expects a specific "template" of registration and callback prototypes, where arguments must be at a certain position (struct must come first, user_data last), and user_data must be named exactly so, etc.
To make it work with Micropython bindings, just need to make sure that: [...]
Yep, it works like this.
Great!! :+1:
Okay, so it seems it can be quite simple in the end. Together with the other points and conventions:
code-format.py
from the scripts folder. astyle needs to be installed.lv_conf_template.h
run lv_conf_internal_gen.py and update Kconfig.Be sure the following conventions are followed:
enum
s instead of macros. If inevitable to use define
s export them with LV_EXPORT_CONST_INT(defined_value)
right after the define
.type name[]
declaration for array parameters instead of type * name
void *
pointersmalloc
into a static or global variables. Instead declare the variable in LV_ITERATE_ROOTS
list in lv_gc.h
and mark the variable with GC_ROOT(variable)
when it's used.lv_<widget_name>_create(lv_obj_t * parent)
pattern.lv_<modul_name>
and should receive lv_obj_t *
as first argument which is a pointer to widget object itself. struct
s should be used via an API and not modified directly via their elements.struct
APIs should follow the widgets' conventions. That is to receive a pointer to the struct
as the first argument, and the prefix of the struct
name should be used as the prefix of the function name too (e.g. lv_disp_set_default(lv_disp_t * disp)
)struct
s which are not part of the public API must begin with underscore in order to mark them as "private".struct
as the first argument of both the registration function and the callback. That struct
must contain void * user_data
field.void * user_data
and the same user_data
needs to be passed as the last argument of the callback.xcb
. Question: Where? In their type or in their name?Regarding the examples: We have discussed that it would be great if contributors added C and MP examples too. However imagine that if LVGL had more bindings, e.g. C++, MP, Lua, JS, Rust, etc. We can't expect to add examples for all these. Instead we periodically (monthly?) open sponsored issues to updated the examples. We can have script which creates a list of missing examples. If the conventions are followed there can't be a big issue. What do you think?
- Question: Where? In their type or in their name?
In the callback typedef. For example:
typedef void (*lv_anim_exec_xcb_t)(void *, int32_t);
Worth mentioning that xcb callbacks would not be available in other languages. (at least not in Micropython, but probably also not on other auto-generated bindings that need to keep track of callback context without libffi)
However imagine that if LVGL had more bindings, e.g. C++, MP, Lua, JS, Rust, etc. We can't expect to add examples for all these. Instead we periodically (monthly?) open sponsored issues to updated the examples.
In reality, the only active binding we have today is Micropython, right?
Until we have bindings to multiple languages, how about asking/suggesting the contributor to provide Micropython example along with the C example? An example code is a very good way to verify that the conventions are really followed correctly.
In the callback typedef. For example:
What if there is no dedicated type? E.g. void (*my_cb)(void)
In reality, the only active binding we have today is Micropython, right?
The whole picture looks like this:
So strictly speaking really MP is the only binding which has its own examples based on the C examples. However I hope we will have some more active binding soon. A Lua and a Rust binding would be great as well.
Until we have bindings to multiple languages, how about asking/suggesting the contributor to provide Micropython example along with the C example?
Okay, I'll promote adding MP examples until we won't have other bindings.
What if there is no dedicated type? E.g.
void (*my_cb)(void)
I think it could be a good idea to always require a typedef, for clarity.
But we could probably support void (*my_xcb)(void)
as well without typedef, if needed.
Okay, I'll promote adding MP examples until we won't have other bindings.
Great!! 👍
I think it could be a good idea to always require a typedef, for clarity.
Okay :slightly_smiling_face:
Please confirm if the list is good now. If so I'll update the PR template. (I added some missing points from the beginning of the conversation)
After that if you have interest and time the MicroPython docs could be extended with some longer descriptions about the conventions.
Please confirm if the list is good now. If so I'll update the PR template. (I added some missing points from the beginning of the conversation)
Yes, the list is good :+1:
Please add it to the PR template.
After that if you have interest and time the MicroPython docs could be extended with some longer descriptions about the conventions.
Where should they go in the docs? Could you add a placeholder for it?
I would like to explain the most complicated topics in more details:
I can try to summarize these topics and add to the docs.
I think it would be useful to add links from the PR template to these topics on the docs.
Please add it to the PR template.
Done: https://github.com/lvgl/lvgl/blob/master/.github/pull_request_template.md Looks a little bit long, but at least the content is clear and simple now. Let's see how it work in the practice and if needed we can restructure it to look friendlier.
Where should they go in the docs? Could you add a placeholder for it?
Just add a new section after "How can I use it?" like "Code conventions for the MicroPython binding"
I would like to explain the most complicated topics in more details:
:+1:
I think it would be useful to add links from the PR template to these topics on the docs.
Sure, once when it's ready I'll add the links.
Sure, once when it's ready I'll add the links.
@kisvegabor did you add them?
I've looked at pull_request_template.md and can't find them. Maybe I should look someplace else?
Thanks for the ping. I've just added it: https://github.com/lvgl/lvgl/commit/8dc7fc918a000826ad8a1be7d93da4a7bd60c4ba
Thanks for the ping. I've just added it: lvgl/lvgl@8dc7fc9
Great! thanks.
Closing this as completed.
I think it's a great step to avoid a lot of issues in the future.
Related: https://github.com/lvgl/lvgl/issues/3535#issuecomment-1232489218
A short and easy to follow list about MP conventions.
This will be communicated as LVGL coding standard. Less is better so we should have as few rules as possible