lvgl / lv_binding_micropython

LVGL binding for MicroPython
MIT License

stub file and documentation generation #261

Open kdschlosser opened 1 year ago

kdschlosser commented 1 year ago

I wanted to let the maintainers know that I am working on a script that parses the XML output from doxygen and generates a Python stub file for the binding. All of the type hinting will be in place, along with the docstrings. This will make development easier for users: they can place the stub file in the root folder of their project and most IDEs will read it. The stub file's extension can also be renamed to ".py" so that Sphinx can be run on it and the documentation will get generated.
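For illustration, a generated stub might contain entries along these lines (the classes, signatures and docstrings here are made up to show the shape, not the script's actual output):

# hypothetical excerpt of a generated lvgl.pyi
class obj:
    def set_size(self, w: int, h: int) -> None:
        """Set the size of an object. (docstring collected from the doxygen XML)"""
        ...

class arc(obj):
    def set_value(self, value: int) -> None:
        """Set the value of the arc."""
        ...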

I have the reading of the XML files done and it generates a stub file. I still have to do some work surrounding callbacks, and I also have to figure out how gen_mpy decides which functions go under which structures (classes), and how it handles the enumerations.

kdschlosser commented 1 year ago

@kisvegabor @amirgon

Have a look at the attached file. I am not 100% finished with it yet, but it generates the stub file. I still have to mess about with the enums.

To use the script, first run the build.py script in the docs folder. Then provide 2 paths to the script: the first is docs/xml and the second is the folder where you want the stub file placed.

python3 "lvgl/docs/build.py"
python3 gen_lvgl_stub.py "lvgl/docs/xml" "lvgl/doc"

Double quotes are optional.

gen_lvgl_stub.zip

kdschlosser commented 1 year ago

would be a right nice thing if the simulator could use the stub file for autocomplete....

amirgon commented 1 year ago

> I have to do some work surrounding callbacks and I also still have to figure out how gen_mpy decides which functions go under which structures (classes), and how it handles the enumerations.

When we discussed this on some other GitHub issue, I think I mentioned that gen_mpy.py is capable of generating a JSON file with metadata that describes all functions, structs, etc. I think it would make sense to use that as an input and to deduce everything from there. We can add more information there very easily if needed.

The only thing you really need from the doxygen XMLs is the comments, since gen_mpy.py works on the preprocessed sources, where all comments are already stripped out, so the script is not aware of them.

If you take all the type information from the JSON file, it will also be easier to maintain, because gen_mpy.py might change the way it parses things (for example, not long ago we changed the way enums are arranged). If you take the data from the JSON, you are less vulnerable to such changes.
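A minimal sketch of what consuming that file could look like, assuming only that it is a JSON object at the top level (the actual schema is not shown here):

import json

# lv_mpy_example.json is generated alongside lv_mpy.c
with open('lv_mpy_example.json') as f:
    metadata = json.load(f)

# walk the top-level entries to see what the generator recorded
for key, value in metadata.items():
    print(key, type(value).__name__)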

However, it looks like you didn't choose this path.

> would be a right nice thing if the simulator could use the stub file for autocomplete....

Autocomplete is a MicroPython feature, unrelated to LVGL or the MicroPython bindings.
It is already supported without any stub files, and works on all platforms including the online simulator. For example, when you type lv. and press TAB, it displays all functions and types in the lv module. The same goes for any other class or struct.

kdschlosser commented 1 year ago

Even if you use the gen_mpy script, you still end up having to parse the XML files from doxygen anyhow in order to collect the docstrings, which is something gen_mpy doesn't do.

Autocomplete has nothing to do with MicroPython or Python or C, C++ or any other programming language. It is actually a function of the IDE. But because LVGL is a compiled extension module, an IDE is not able to view its contents and is therefore unable to provide things like autocomplete and type assessment. That is what the stub file does. It is a skeleton of the extension module, existing only to hold the documentation and type hinting for the classes, functions, methods and variables contained within the extension module.

If you use VSCode, PyCharm or Visual Studio (with the Python extensions installed), take the generated PYI file, drop it into the root of one of your projects, then import LVGL in the IDE's code editor and start coding; you will see what happens. It makes writing software a lot more pleasant, that's for sure. The spinoff of the stub file is being able to build documentation that is specific to the binding. It will use the names as they are compiled for MicroPython, so there is an actual API reference when needed. This was one of the largest hurdles when I started messing about with the binding. The naming didn't align with LVGL, and I found myself using the simulator and doing a recursive dir on all of the objects in the binding in order to locate the thing I was looking for.

Having the stub file means there would be no need to do that, because you are able to see everything right there in the IDE.

The simulator has an IDE built into it. You are able to key in code and then run it. The editor has autocomplete, but only for names that have already been used in the editor, and there is no type hinting available. I would imagine at least the autocomplete could be extended so it would pull from a stub file, and maybe type hinting later on down the road.

What is also nice about the stub file and an IDE is the ability for the IDE to follow a data path. So when you call a function and the returned value is, say, an lv_color_t structure, the IDE knows what attributes that structure has available, and if you try to access an attribute the structure doesn't have, the IDE will tell you before you load the code onto an MCU and end up getting a traceback.

kdschlosser commented 1 year ago

The biggest issue with gen_mpy is that it doesn't have any structure to it. It's basically a flat file with functions. If you look at the code I wrote, it's basically the same kind of concept except there is structure to it: function parsing is handled in the Function class, structure parsing is handled in the Struct class, etc.

Because gen_mpy uses an external library to do the parsing, and that library is broken apart so there are classes for the different data types, it would have made sense to subclass those data types and monkey patch the subclasses over the top of the original classes in the library, so that when the parser is called it uses the classes that were made. By doing this, any kind of special handling can be taken care of while the code is being parsed. That is the same concept I have used: most things get organized as the data is being read, and then printed once the parsing is done.
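A minimal sketch of that monkey-patching idea, assuming pycparser is the external library (the hook body here is hypothetical):

import pycparser.c_ast as c_ast

_OriginalStruct = c_ast.Struct

class Struct(_OriginalStruct):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # hypothetical special handling, done the moment the node is built
        print('parsed struct:', self.name)

# the parser creates nodes through the c_ast module attributes,
# so patching the attribute makes it instantiate the subclass
c_ast.Struct = Struct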

I have tried to add the functionality to gen_mpy and was only faced with having to go to the doxygen XML files anyhow. It took me about 4 hours to write the script I posted. I spent probably 20 hours messing about with gen_mpy trying to get it to output everything needed, and I was not successful at doing it either.

amirgon commented 1 year ago

I understand that you are having difficulties with gen_mpy.py. However, I wanted to suggest an alternative solution to your problem. Instead of duplicating the parsing logic in your script, you could use the generated JSON file, which contains all the metadata you need and is more robust.

You can find the JSON file that is automatically generated along with lv_mpy.c at this link: https://raw.githubusercontent.com/lvgl/lv_binding_micropython/master/gen/lv_mpy_example.json.
Please take a look and let me know if this helps you in any way.

kdschlosser commented 1 year ago

It doesn't contain all the metadata. That's the problem. There needs to be a mapping of LVGL name to binding name so I can read the doxygen XML files to collect the docstrings. The structures in the JSON output are missing the functions, I believe.

What I would need in the JSON output is everything, including the parameter names and types for functions, and the field names and types for structures. A complete subclassing arrangement. I know that lv_arc subclasses from lv_obj, but is it like that for everything?
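Something along these lines, sketched as a Python dict (the field names are illustrative, not a proposed schema):

needed_metadata = {
    'classes': {
        'arc': {
            'c_name': 'lv_arc_t',
            'parents': ['obj'],
            'functions': {
                'set_value': {
                    'c_name': 'lv_arc_set_value',
                    'ret_type': 'None',
                    'args': [
                        {'name': 'value', 'c_type': 'int16_t', 'py_type': 'int'},
                    ],
                },
            },
        },
    },
}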

It basically boils down to this: you cannot run the following code in the simulator, it will have a heart attack.

It needs to be run on a board that is running the binding.


import lvgl as lv

def iter_obj(obj, indent=''):
    # recursively walk every public attribute and report what kind of object it is
    for item in sorted(list(dir(obj))):
        if not item.startswith('_'):
            child = getattr(obj, item)

            child_type = type(child)

            if child_type == type:
                print(indent + item + ': class, need parent class if any')
                iter_obj(child, indent + '    ')
                print()

            elif 'function' in str(child_type):
                print(indent + item + ': function, dont know return type or parameters or parameter_types')

            elif child_type == lv.obj_class_t:
                print(indent + item + ': class, parent class obj_class_t')
                iter_obj(child, indent + '    ')
                print()

            elif child_type == lv.font_t:
                print(indent + item + ': class, parent class font_t')
                iter_obj(child, indent + '    ')
                print()

            elif child_type == int:
                print(indent + item + ': int')

            elif child_type == lv._lv_mp_int_wrapper:
                print(indent + item + ': class, parent class _lv_mp_int_wrapper')
                iter_obj(child, indent + '    ')
                print()

iter_obj(lv)

This will do a recursive query of all of the objects in lvgl. It gives you the basic idea of how the structure needs to be laid out in the JSON file. It needs to be an identical replica of how things are accessed in the binding. There needs to be a mapping of each element name to its C name. The Python types need to be listed for everything, and the parent classes as well.

kdschlosser commented 1 year ago

So right now, the way the documentation is done for LVGL, any function whose first parameter has a data type that begins with an underscore is not being added. The information does get output into the doxygen XML files, but I think breathe is having an issue with it. Generating the documentation from a Python stub file for the binding would bypass that problem, because there would be a name mapping for _lv_obj_t to lv_obj_t in the output JSON, so anything that has _lv_obj_t as a type would be substituted with the correct type name.
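A minimal sketch of that substitution with a hypothetical alias table (in practice the mapping would come from the JSON output):

TYPE_ALIASES = {'_lv_obj_t': 'lv_obj_t'}

def resolve_type(c_type_name):
    # swap a private, underscore-prefixed C name for its public alias
    return TYPE_ALIASES.get(c_type_name, c_type_name)

print(resolve_type('_lv_obj_t'))  # -> lv_obj_t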

amirgon commented 1 year ago

> It doesn't contain all the metadata. That's the problem. There needs to be a mapping of LVGL name to binding name so I can read the doxygen XML files to collect the docstrings. The structures in the JSON output are missing the functions, I believe.

We can add the missing metadata. Adding the lvgl names should be easy.
What is missing other than that?

> What I would need in the JSON output is everything, including the parameter names and types for functions, and the field names and types for structures.

Maybe some examples of what you want vs. what we have could be useful.

> A complete subclassing arrangement. I know that lv_arc subclasses from lv_obj, but is it like that for everything?

Instead of a "subclassing arrangement", each type in the JSON now includes the complete information for both its own members and its parent's members. The result is larger, but it is not supposed to be flashed to a device, so it doesn't really matter.

In LVGL there is not really a class hierarchy. In principle, all objects inherit from lv.obj.

> def iter_obj(obj, indent=''):

Instead of using dir in MicroPython, I'd rather build the metadata as part of the process of building lv_mpy.c. See the way obj_metadata, func_metadata and callback_metadata are updated. The advantage is that we have much more information when processing the C API, and not all of this information is eventually available in MicroPython.
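In sketch form, the idea looks like this (func_metadata is the real name mentioned above; the helper and its signature are assumptions, not gen_mpy.py's actual code):

func_metadata = {}

def emit_function_wrapper(py_name, c_name, args, ret_type):
    # ... the C wrapper destined for lv_mpy.c would be emitted here ...
    # record what is known at generation time, while the full C API
    # (argument names, C types, return type) is still in hand
    func_metadata[py_name] = {
        'c_name': c_name,
        'args': args,
        'ret_type': ret_type,
    }

emit_function_wrapper(
    'set_value', 'lv_arc_set_value',
    [{'name': 'value', 'c_type': 'int16_t'}], 'void',
)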

kdschlosser commented 1 year ago

> In principle, all objects inherit from lv.obj.

lv_color_t does not inherit from lv_obj_t. I would have to say that the majority of LVGL doesn't inherit from lv_obj_t; only the widgets do.

I am in agreement with you on having the docs and stubs made at compile time and not using dir. dir is only able to provide the general layout, not specifics like argument names and types.

Here is an example of how I would handle the JSON data. I have not tested this specific code, so there might be a couple of errors in it that have to get sorted out. It handles how everything is laid out in the JSON output for the purpose of reading the doxygen XML files and generating a Python stub file and Python documentation. There will need to be another script that reads the JSON output and uses the information to collect the needed bits from the doxygen XML to generate the stub and docs. That is a pretty easy thing to write, so long as all the information I need is in the JSON data.

Read the comments and the example script at the bottom.

# This class is what I call an instance singleton. It ensures that only a
# single instance exists based on the values passed to the constructor.
#
# This is handy to have because the user doesn't have to explicitly keep a
# reference to any of the instances. Simply pass the same values to the
# constructor and either a new instance will be made or, if one has been
# made already, that will be returned instead.

# this metaclass is used on all of the classes that involve JSON output

class JSONMetaClass(type):

    def __init__(cls, name, bases, dct):
        super(JSONMetaClass, cls).__init__(name, bases, dct)
        cls._instances = {}

    def __call__(cls, *args, **kwargs):
        key = [repr(arg) for arg in args]

        for k in sorted(list(kwargs.keys())):
            key.append(repr(kwargs[k]))

        # the key must be hashable to serve as a dict key in the cache
        key = tuple(key)

        if key not in cls._instances:
            cls._instances[key] = super(JSONMetaClass, cls).__call__(
                *args,
                **kwargs
            )

        return cls._instances[key]

class JSONBase(dict, metaclass=JSONMetaClass):        

    def __iter__(self):
        return self.keys()

    def keys(self):
        for key in sorted(list(self.__dict__.keys())):
            if not key.startswith('_'):
                yield key

    def items(self):
        for key in sorted(list(self.__dict__.keys())):
            if not key.startswith('_'):
                yield key, self.__dict__[key]

    def values(self):
        for key in sorted(list(self.__dict__.keys())):
            if not key.startswith('_'):
                yield self.__dict__[key]

    def __getitem__(self, item):
        if item in self.__dict__:
            return self.__dict__[item]

        raise KeyError(item)

    def __setitem__(self, key, value):
        self.__dict__[key] = value

class SimpleJSONBase(JSONBase):

    def __init__(self, c_name, py_name):
        self.py = py_name  
        self.c = c_name

        super(SimpleJSONBase, self).__init__()

class JSONType(SimpleJSONBase):
    pass

class JSONName(SimpleJSONBase):
    pass

class JSONClass(JSONBase):

    def __init__(self, name: JSONName):
        self.name = name
        self.functions = []
        self.variables = []
        self.parents = []
        self.classes = []

        super(JSONClass, self).__init__()

    def __iadd__(self, other):
        if isinstance(other, JSONClass):
            if other not in self.classes:
                self.classes.append(other)

        elif isinstance(other, JSONFunction):
            if other not in self.functions:
                self.functions.append(other)

        elif isinstance(other, JSONVariable):
            if other not in self.variables:
                self.variables.append(other)

        elif isinstance(other, JSONType):
            if other not in self.parents:
                self.parents.append(other)
        else:
            raise RuntimeError('json type not supported by this class')

        return self

class JSONFunctionArg(JSONBase):

    def __init__(self, name: JSONName, type_: JSONType):
        self.name = name
        self.type = type_

        super(JSONFunctionArg, self).__init__()

class JSONFunction(JSONBase):

    def __init__(self, name: JSONName):
        self.name = name
        self.ret_type = None
        self.args = []

        super(JSONFunction, self).__init__()

    def __iadd__(self, other):
        if isinstance(other, JSONFunctionArg):
            if other not in self.args:
                self.args.append(other)
        elif isinstance(other, JSONType):
            self.ret_type = other
        else:
            raise RuntimeError('json type not supported by this class')

        return self

class JSONVariable(JSONBase):

    def __init__(self, name: JSONName, type_: JSONType, default_value=None):
        self.name = name
        self.type = type_
        self.default_value = default_value
        super(JSONVariable, self).__init__()

class JSONModule(list, metaclass=JSONMetaClass):

    def __init__(self, module_name):
        self.module_name = module_name
        list.__init__(self)

    def __iadd__(self, other):
        if isinstance(other, (JSONFunction, JSONClass, JSONVariable)):
            if other not in self:
                self.append(other)
        else:
            raise RuntimeError('json type not supported by this class')

        return self

# example code
# this is really easy to use. It makes sure there are no duplicates and there
# is no need to check for existence explicitly. Just call the constructor
# for a class and go. If it exists already, that is what will be returned.

'''
mod = JSONModule(module_name)

# data gotten from pycparser
# when reading for structures
for struct in pycparser.structs:
    # struct parsed and name information collected
    # parent c type and python type collected
    json_class = JSONClass(JSONName(struct_c_name, struct_py_name))
    json_class += JSONType(parent_c_type, parent_py_type)
    mod += json_class

# parsing function information and doing its thing to 
# find out what class the function belongs to

for func in pycparser.functions:
    # data gotten for a function and the class the function belongs to
    json_func = JSONFunction(JSONName(func_c_name, func_py_name))

    # get the return value
    json_func += JSONType(ret_c_type, ret_py_type)

    # parsing the arguments
    for arg in func.args:
        json_func += JSONFunctionArg(
            JSONName(arg_c_name, arg_py_name),
            JSONType(arg_c_type, arg_py_type)
        )

    # the instance singleton hands back the class created earlier,
    # so the function can be attached to it without any lookup code
    json_class = JSONClass(JSONName(class_c_name, class_py_name))
    json_class += json_func

# once everything is done

import json

with open(path_to_json_file, 'w') as f:
    f.write(json.dumps(JSONModule(module_name), indent=4))
'''
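As a quick sanity check of the instance-singleton behavior (with the cache key built as a tuple so it is hashable), identical constructor arguments should hand back the same object:

a = JSONName('lv_obj_t', 'obj')
b = JSONName('lv_obj_t', 'obj')
assert a is b  # same arguments, same cached instance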