lvgl / lvgl

Embedded graphics library to create beautiful UIs for any MCU, MPU and display type.
https://lvgl.io
MIT License
16.38k stars 3.22k forks source link

Generated API-Page Link Problems #7068

Open vwheeler63 opened 2 days ago

vwheeler63 commented 2 days ago

LVGL version

v9.3

Executive Summary:

While its current algorithm was a vast improvement over having NO API links, I found an important design flaw in doc_builder.py. It is thoroughly described below, but to summarize: it uses a one-to-many relationship between the name of the .RST file and code-element names found by Doxygen, but the correct relationship is a many-to-many relationship. (The one-to-many relationship tied to the .RST file names also restricts .RST file names in a way that they should not be restricted.) Using a one-to-many relationship causes these problems:

An .RST file should be able to link to API header files on a many-to-many relationship. One way to do this (that I like) links to specific keywords in the .RST file's content instead of its filename. This is a reStructuredText COMMENT:

    ..
        API NameKeywords:  init, display, indev

or

    ..
        API ExactKeywords:  lv_event_t

or both (in which case the set of hyperlinks would be combined).

This is simple to implement, and dovetails in nicely with the existing code in doc_builder.py. Perhaps 30-40 lines of code + comments? And it solves all of the above problems including removing the restriction on .RST file names.

Background Information:

Hyperlink references for the API C header file are added to the bottom of some source .rst files where the .rst file content corresponds to code elements found and isolated by Doxygen. This is a good thing, but is only partially implemented. Read on to see why.

To determine what API hyperlinks to add, doc_builder.py needs a string to compare the C-code-element names with.

Currently it is getting this string by isolating the stem of the .rst file name (i.e. libjpeg_turbo in libjpeg_turbo.rst). It then scans through every single C-code element found by Doxygen, removing “_lv_” and “lv_” prefixes, and removing “_t” suffixes, and then reducing the code-element name to the same number of “sections” that the filename stem has (libjpeg_turbo has 2 "sections": libjpeg and turbo). So the lv_libjpeg_turbo_init() function name is reduced to a string libjpeg_turbo. Then a (case sensitive) string comparison is done and if they match, then a hyperlink is added to the API page for the .H file containing that function prototype. You can see the result of this logic here: https://docs.lvgl.io/master/libs/libjpeg_turbo.html#api

A big thank you goes to @kdschlosser who took the time to build that brilliant logic. This took a lot of work and required learning and/or understanding at least part of the Doxygen XML file format (the huge index.xml Doxygen generates).

The Problem:

There are a few subtle design flaws with this approach (that are easily solved):

Additional Information:

I learned the above (and discovered I had broken this logic by several .rst file renames I had done [and have since reverted]) by studying doc_builder.py to really find out what logic was determining what API hyperlinks to add to what .rst files. I needed to know this for several reasons:

Learning how it worked took many hours, and I learned it by adding comments to the high-level parts of the doc_builder.py code so as to map out what was really going on. Adding those comments allowed me to understand what it was doing, simultaneously exposing the situation described above.

Others (including the original writer) will suffer the same fate as I did (not being able to easily understand what doc_builder.py is doing) as long as it remains undocumented.

A resource that could be taken advantage of: doc_builder.py currently already uses a reStructuredText COMMENT .. Autogenerated to know where these links have been added so it can remove old ones before re-computing a new set of links. (Just in case any of these links sneak into source .rst files in the future.) A similar reStructuredText COMMENT could be used to list which API pages should be linked to. This could thus “communicate” to the doc_builder.py logic without affecting the content of the eventual .html page except to signal doc_builder.py what to include.

Isolating the Situation:

The above work done with doc_builder.py to add hyperlinks to API pages was brilliant and is a good start and greatly improved the quality of the resulting documentation.

However, now that all of the above is visible, it is clear that the relationship of an .rst document with the API-page hyperlinks it should have is really a many-to-many relationship that cannot fully be detected by the name of a single .rst file.

Yet valuable API-page links now exist, and we do not want to lose the valid ones.

This is easily solved.

Proposed Solution:

  1. Work out a reStructuredText COMMENT string syntax that could be easily parsed by doc_builder.py to determine what API-pages to link to. This can be a key-word followed by a simple list of names, e.g. “init, display, indev, theme” that will match code-element names, without the writer having to know the exact name of the header files involved, or taking the risk that renaming those header files will break links.
  2. Using that syntax, capture the valid already-existing API-page links and convert them to this syntax while simultaneously removing the “API” section headings at the bottom of all .rst files, except where there are manually-added hyperlinks to API pages.
  3. Modify doc_builder.py so as to:
    • add its own API section headings at the top of where it generates API-page hyperlinks (when they don't already exist), and
    • parse the syntax defined above and use that to generate hyperlinks to the appropriate API pages, simultaneously removing the reliance on a single .rst file name.
  4. Include good documentation in doc_builder.py so that others may understand what it does, and how it does it, and why.
  5. Finally, rename .rst files that could benefit by supplying improved URL paths that more closely align with the document’s title and/or purpose.
  6. Add the new details (how to get API-page links into your .rst files) into ./docs/README.md.

Having studied this, I have all that I need to carry out the above handling.

How to reproduce?

n/a

kdschlosser commented 2 days ago

most of what you have suggested is going to add a very large amount of additional things that are going to need maintenance. It is fairly simple to keep the ReST filenames in alignment with header file names that are seen in LVGL. doing this is 2 fold. Not only does it allow for adding the API links automatically it also make it easier to locate documentation that may need to be updated should when a change gets made to LVGL.

Maybe it would be easier if the RST files were places with the source and header files like what is done for the examples?

kdschlosser commented 2 days ago

The reliance on the .rst file name stem to match against C-code-element names is brittle: in the re-organization work I am doing with the documentation, I inadvertently renamed event.rst to events.rst to match the title, thinking I was doing a good thing by improving clarity of the URL path to the document whose title is correctly “Events”. (This has since been reverted.) Before I discovered the details of the logic above, I had NO realization that I was unintentionally losing a whole set of valuable API hyperlinks due to the above logic. In the future (unless something intervenes), others will encounter the same situation while also lacking this knowledge: creating new .rst files as well as renaming old ones like I did.

It is easy to keep the path/filename structure the same as what is seen in the src folder. It also makes it easier to locate documentation that might need to be updated if the code changes.

The case-sensitivity of the string comparison mentioned above causes it to miss some opportunities to add valid hyperlinks to API pages. For example, let’s say someone in the future adds a file draw_sw.rst to document software rendering. When “draw_sw” is compared to the macros “LV_DRAWSW…”, the comparison done will compare “draw_sw” to “DRAW_SW” and the match will fail due to the case-sensitivity, thus missing the opportunity to add a valid hyperlink to an appropriate API page.

Easily corrected so these will work properly.

This works well for the “details” section of the manual where there is specific, detailed documentation for things like the Button widget in button.rst which the above logic nicely adds links to the API page for lv_button.h. But introductory sections of the manual (e.g. setting up a new LVGL project) should be able to include API links to things like the page for lv_init.h containing the function prototypes for lv_init() and lv_is_initialized(). They should as well be able to link to additional API pages.

and

The need to link to multiple API pages with varied names makes it impossible for a single .rst file name to detect all the possible appropriate API-page links to add. Additional example: there is already documentation on STM32 DMA2D support that exists in a file ./integration/chip/stm32.rst. But no API-page links exist at the bottom of that page, even though several would be appropriate.

and

A document like project.rst (setting up a new project) may need to link to link to several additional API pages that will not be found by the above logic due to its filename: no code-element names will string match with “project”. The appropriate API links that should be there, also vary widely and so will not (and can not ever) match to just one .rst filename stem, even if the file is renamed.

In those situations then the links get manually added. You can see this being done here

This logic also causes some API-page hyperlinks to be included that are not relevant. Example: the link to the API page for lv_types.h is not really relevant to button.rst where it appears. This happened because the string “button” was found in a typedef in that file. This is a relatively minor problem since users can easily skip what they don’t need – better to err in the direction of including more hyperlinks than needed, than to miss including needed ones. But it is still pertinent to this logic, and could be improved. It also impacts the level of professionalism in the LVGL documentation, which is something that could adversely impact decisions to use LVGL.

This can also be corrected if I know what is being linked to incorrectly.

There is NO documentation anywhere (including in doc_builder.py) about how this works, or the .rst file naming restrictions it implies.

That's why your here, :smile:

The section heading “API” is at the bottom of several .rst files, but there is no guarantee that doc_builder.py will add links there. This section heading is also missing from other .rst files that could benefit by having such links. However, doc_builder.py does not use those headings in any way, and is already capable of adding links to .rst documents that do not have this heading.

easily fixed on this as well.

vwheeler63 commented 2 days ago

That's why your here, 😄

Good one! 😄

I may not have been clear about it, but with the work I am already doing with the documentation, it isn't a big step for me to carry out the above.

kdschlosser commented 2 days ago

The way I wrote it is far from perfect as you know. there are edge cases that cannot dealt with easily and without have to make special rules in the build script that can end up breaking for one reason or another. I tried to make it so that the things that are not able to get handled automatically can get handled manually, Hence the API section being added ahead of time. I can wire in a check to see if hat section is in the file and if it is then don't add it and just append the file. It's not that much work to do.

Quite a few of the issues are simple fixes and it's just things I didn't think about.

The whole file structure things is the best I could come up with where it would be the easiest to maintain, I think we just need some documentation for it that states if the file is supposed to have anything linked to in from the API section of the docs then it needs to match the path/filename of the header file.

vwheeler63 commented 2 days ago

The whole file structure things is the best I could come up with where it would be the easiest to maintain

I think the work was brilliant, and made an excellent improvement in the existing situation! As you know, it's common for something this complex to make iterative improvements over time as more is understood about the situation.

P.S. Sent you an email with doc_builder.py with my comments in it so you weren't trying to go through what I went through yesterday.