vedderb / vesc_pkg

Official VESC Packages
GNU General Public License v3.0
34 stars 42 forks source link

Refloat 1.0 #42

Closed lukash closed 4 months ago

lukash commented 5 months ago

Apologies for the delay, here's Refloat now as a separate repo and added via git subtree.

Note I've also created https://github.com/lukash/vesc_pkg_lib and included it in the refloat repo as a subtree as well. I'd be glad if this setup eventually became standard for 3rd party packages, but if it's not something you want to do, I'll maintain this for myself (and any potential Refloat forks).

The plan is to have one last beta release (or more if something pops up), and then release 1.0 and merge the PR if you like it. As it happens my CI/CD is acting up now due to something timing out, I have an automated release build in place now, so hopefully it'll get back up and I'll be able to use it for the beta6 release.

vedderb commented 5 months ago

Looks good to me! Just need to update res_all.qrc to include the package in VESC Tool.

Regarding vesc_pkg_lib, including it as a subtree in your package works but you can also just include the files directly without a subtree as they are not that large. I was thinking of having your repository as a subtree in this one so that it can be built by itself if needed, but also that is not so important.

lukash commented 5 months ago

Just need to update res_all.qrc to include the package in VESC Tool.

Oh, forgot about that, I'll add it.

Regarding vesc_pkg_lib, including it as a subtree in your package works but you can also just include the files directly without a subtree as they are not that large.

The main point for me is maintainability, I want to be able to pull changes from somewhere without having to manage individual files.

I was thinking of having your repository as a subtree in this one so that it can be built by itself if needed, but also that is not so important.

It's important to me that the Refloat repo can be built by itself :grin: if it couldn't it'd be next to impossible to work with...

lukash commented 4 months ago

@vedderb I just released Refloat 1.0.0 and updated the PR (also adding the package to res_all.qrc).

It's now ready to be merged :tada: Thanks!

vedderb commented 4 months ago

Nice work!

One more thing that came up when I merged the last TnT-PR is this: https://github.com/vedderb/vesc_pkg/pull/43#issuecomment-2157706832

Can you have a look at that and maybe remove VA_OPT (or check for the compiler version and optionally include them)?

lukash commented 4 months ago

Well, I am using them... and I don't want to remove the log messages. I want them there to help troubleshoot potential issues, removing them conditionally in the release build you distribute will kinda defeat the purpose (and introduce an inconsistency between my builds and yours). Instead of "downgrading" my code for a 6 years old compiler, could you try to update it?

Refloat requires GCC 13 because of the enum underlying type specifier it uses for some enums, this could certainly be alleviated, but GCC13 has been working well for me.

I see you mention some size increase in the linked comment, IIRC when I was moving between compiler versions and checking the size, the size decreased somewhat with newer versions, though that was going from GCC 10 or 11 (I don't remember anymore). I also believe a lot of work went into the recent GCC releases and there should be other improvements, namely in LTO which I'm using, etc.

vedderb commented 4 months ago

There are many things I need to test when updating GCC and there are hundreds of new warnings introduced that also need fixing. That is not high up on my priority list and will certainly not happen before 6.05. It is one of those things where I chose to not fix something that isn't broken, especially given the code size increase and performance drop I saw last time (maybe that has been fixed in very latest versions).

Anyway, I'm not going to change my GCC version now and I don't want to have two different versions for building different things, so the path of least resistance is to simply add an ifdef in your code that checks the GCC version and disables the prints for older versions. That should also decrease the code size, so probably LTO is not needed then.

vedderb commented 4 months ago

Another point: using features that are only available in the very latest compiler is not a good strategy if you want people to have an easy time to build your code. It should not be that hard to update the log-messages to not require VA_OPT in a macro. It is even likely that using a regular function for logging rather than a macro decreases the code size as the function only has to be defined once and a function call takes less code space than expanding a macro and running the expanded code in every single place.

lukash commented 4 months ago

There are many things I need to test when updating GCC ...

Which things?

... and there are hundreds of new warnings introduced that also need fixing.

Those warnings are outstanding in the code already, so fixing them would be desirable but ignoring them for now won't change anything.

... the path of least resistance is to simply add an ifdef in your code that checks the GCC version and disables the prints for older versions.

I already said I don't want to remove the diagnostic printfs.

... using features that are only available in the very latest compiler is not a good strategy if you want people to have an easy time to build your code.

The dependency on GCC 13 was unintentional, but I didn't go out of my way to alleviate it after I found out.

It is even likely that using a regular function for logging rather than a macro decreases the code size as the function only has to be defined once and a function call takes less code space than expanding a macro and running the expanded code in every single place.

I can't do that, there's no version of VESC_IF->printf that accepts a va_list.

I tested the resulting binary size of Refloat with both compilers:

(The logging does take up quite a sizable amount of space...)

vedderb commented 4 months ago

Which things?

Stability, no regressions, no stack overflows etc. If more memory is used (both ram and flash) as I saw last time there is potential that things break. I have indeed seen strange behavior when changing compiler versions. Debugging those things is worth doing at some point, but I don't have much time now and it is not high enough on my priorities at the moment.

Those warnings are outstanding in the code already, so fixing them would be desirable but ignoring them for now won't change anything.

Having hundreds of minor/irrelevant warnings make new warnings disappear in the noise, which is a great disadvantage when developing. When moving forwards I have no choice but to spend a non-trivial amount of time fixing warnings, vastly more time than it takes you to write 4 or 5 lines of ifdefs.

I already said I don't want to remove the diagnostic printfs.

I'm not forcing you, I'm just saying that I'm not going to change compiler any time soon for the reasons I explained in great detail. The implication is that it will take some time until refloat appears in the package store. If you prefer waiting over disabling your prints with a handful of lines or making some trivial changes to make them work fine!

The dependency on GCC 13 was unintentional, but I didn't go out of my way to alleviate it after I found out.

It hardly takes going out of your way, unless you are unreasonably fixated on doing it exactly your way. For example, these 5 added lines still give you the prints, but without the refloat, timestamp and error prefix. They also work as before on later GCC during your development.

#if __GNUC_PREREQ(13,0)
#define log_msg(fmt, ...)                                                                          \
    do {                                                                                           \
        if (!VESC_IF->app_is_output_disabled()) {                                                  \
            float t = VESC_IF->system_time();                                                      \
            uint32_t decimals = (uint32_t) ((t - (uint32_t) t) * 1000000);                         \
            VESC_IF->printf(                                                                       \
                "%d.%.6d [refloat] " fmt, (uint32_t) t, decimals __VA_OPT__(, ) __VA_ARGS__        \
            );                                                                                     \
        }                                                                                          \
    } while (0)

#define log_error(fmt, ...) log_msg("Error: " fmt __VA_OPT__(, ) __VA_ARGS__)

#else

#define log_msg VESC_IF->printf
#define log_error VESC_IF->printf

#endif

I tested the resulting binary size of Refloat with both compilers:

I'm talking about the main firmware, not the packages as I want to use the same version for both. If you want to spend some time on testing the main firmware with the latest compiler and confirm that there is at least no regression in code size or performance (even after adding potentially more code to fix the new warnings) I would appreciate that effort.

I can't do that, there's no version of VESC_IF->printf that accepts a va_list.

I'm sure there are many different ways to make useful debug print functions or macros even without that new pre-processor feature. I gave it a few minutes (less time than I have spent arguing with you again...) based on the commands_printf wrappers in the main firmware and got this compiling (although not tested and potentially not working). It is not as compact as the macro-version, but easy to read if you clean it up a bit and more flexible, although with the disadvantage of needing a memory allocation. I'm certain this is not the only way.

#include <stdarg.h>
#include <stdio.h>
#include <string.h>

void *_sbrk(int incr) {
    (void)incr;
    return (void *)-1;
}

static int print_helper(char *buffer, char *msg) {
    float t = VESC_IF->system_time();
        uint32_t decimals = (uint32_t) ((t - (uint32_t) t) * 1000000);
    return sprintf(buffer, "%ld.%.6ld%s", (uint32_t)t, decimals, msg);
}

int log_msg(char *format, ...) {
        va_list arg;
    va_start (arg, format);

    const int buffer_size = 400;

    char *print_buffer = VESC_IF->malloc(buffer_size);

    if (!print_buffer) {
        return 0;
    }

    int offset = print_helper(print_buffer, " [refloat] ");

    vsnprintf(print_buffer + offset, (buffer_size - offset), format, arg);
    va_end (arg);

    int res = VESC_IF->printf(print_buffer);
    VESC_IF->free(print_buffer);

    return res;
}

int log_error(char *format, ...) {
        va_list arg;
    va_start (arg, format);

    const int buffer_size = 400;

    char *print_buffer = VESC_IF->malloc(buffer_size);

    if (!print_buffer) {
                va_end (arg);
        return 0;
    }

    int offset = print_helper(print_buffer, " [refloat] Error: ");

    vsnprintf(print_buffer + offset, (buffer_size - offset), format, arg);
    va_end (arg);

    int res = VESC_IF->printf(print_buffer);
    VESC_IF->free(print_buffer);

    return res;
}

It frustrates me that you absolutely must have it your way even if that requires features from the very latest GCC and causes me days of unavoidable work that is hard for me to justify at the moment. I hope you understand that you only leave me with the options of 1) Spend a lot of work on pleasing you so that you can have things your way, 2) Entering a frustrating argument with you or 3) ignoring your work. I don't like any of these options, so please be a bit more understanding and flexible.

vedderb commented 4 months ago

I had another look and this seems to compile at least on older GCC. Not sure if it works. Can you give it a try?

#if __GNUC_PREREQ(13,0)
#define log_msg(fmt, ...)                                                                          \
    do {                                                                                           \
        if (!VESC_IF->app_is_output_disabled()) {                                                  \
            float t = VESC_IF->system_time();                                                      \
            uint32_t decimals = (uint32_t) ((t - (uint32_t) t) * 1000000);                         \
            VESC_IF->printf(                                                                       \
                "%d.%.6d [refloat] " fmt, (uint32_t) t, decimals __VA_OPT__(, ) __VA_ARGS__        \
            );                                                                                     \
        }                                                                                          \
    } while (0)

#define log_error(fmt, ...) log_msg("Error: " fmt __VA_OPT__(, ) __VA_ARGS__)

#else

#define log_msg(fmt, ...)                                                                          \
    do {                                                                                           \
        if (!VESC_IF->app_is_output_disabled()) {                                                  \
            float t = VESC_IF->system_time();                                                      \
            uint32_t decimals = (uint32_t) ((t - (uint32_t) t) * 1000000);                         \
            VESC_IF->printf(                                                                       \
                "%d.%.6d [refloat] " fmt, (uint32_t) t, decimals ,##__VA_ARGS__       \
            );                                                                                     \
        }                                                                                          \
    } while (0)

#define log_error(fmt, ...) log_msg("Error: " fmt ,##__VA_ARGS__)

#endif
lukash commented 4 months ago

Thanks for the explanation and the code snippets. I understand your standpoint and I respect your time investment greatly. I understand it's work and stuff needs to be tested and you can't do this on a short notice. That said, and not to be taken as as an effort to push more, but just as an explanation of my point of view:

What I don't like is that the work needs to be done at some point anyway, it is an overall improvement. In contrast, coming up with workrounds for the old compiler is wasted work now that even adds some (arguably trivial, but still a thing to address) work for the future (removing workarounds). And most of the workarounds have disadvantages, most remarkably, a difference between my builds and yours - or I have to switch to the old compiler myself, which would be preferable, just so that I am able to catch any weird issue that could actually be occurring on the old compiler and not on the new one, in my package.

And then, it can happen (speaking from experience, although I come from C++ background, where this is more prominent), that one month later I want to implement something and hit a limitation of the old compiler. And it's gonna cost me more time and frustration, while on the new one it'd be easy sailing.

As for the solution, I'll come up with something... It'll just take me more time. And not gonna lie it'll be frustrating switching to the old compiler.

The solution with the function is to me clearly inferior because of the allocation (and with the LEDs, RAM is a precious resource). As for the other solution, I don't need the "refloat" prefix, but the timestamp is quite useful.

Wrt. testing the new compiler, I'm running self-compiled firmwares which were definitely built by GCC 10+, but I forgot which version, for longer than a year (and I've run other builds before that too). I'll recompile it with GCC 13 and run that. I can also compare the GCC 8 build size for that particular controller (Little FOCer v3.1).

vedderb commented 4 months ago

As for the solution, I'll come up with something... It'll just take me more time. And not gonna lie it'll be frustrating switching to the old compiler.

What about the solution I found and posted today that is fully equivalent as far as I can tell if it works?

Yes, C++ has seen huge changes the past 10 years, but not C. Remember, C++ is a gigantic language, vastly bigger than C. C is small, simple and stable over time. C does not try to support every new trend in programming in the way that C++ does. I like that about C; simple, reliable and predictable. Due to the simplicity of C and the decades of developing compilers for it there are very few bugs in the compilers. I haven't run into a single compiler bug in the gcc version I use, bugs have always been my fault!

From the wikipedia-page about C17, which replaces C11 (6 years difference):

C17 fixes numerous minor defects in C11 without introducing new language features.

The standard C23 that requires the latest GCC, the next version after C17, has more changes, but most of them are small and have very few implications on how to write code. You just happened to use one of them.

The things you used from the very latest compiler are really the first time I encounter something that adds functionality to C (and as I discovered today and reported in my last post was already possible with a GCC-extension in the old version). Everything else in your code would work exactly the same 10 - 15 years ago. The main thing I see when changing C compiler version is that I get a bunch of new warnings that I need to fix, newlib grows even bigger than it already was and I need to retest everything. As I already mentioned, the last few times I tried updating the compiler I even got worse performance and code size, so I didn't even bother testing everything then.

Now if you use the one easy-to-work-around feature from the latest compiler you also cause the problem that everyone who has even a slightly older compiler will discover that the code no longer compiles, get frustrated, ask me questions and be forced to update the compiler. Absolutely not worth it, especially given how trivial it is to work around it to keep it compatible with old compilers.

We are different and we change over time. 10 years ago I was always on a linux distro with the latest versions of gnome an all libraries and I was excited about the new features that came with it. Now I only use LTS-versions of Ubuntu and only upgrade when something breaks and forces me to do so. The big difference is that I get much more done today with this approach and use the time I save for other things I'm more excited about.

lukash commented 4 months ago

@vedderb please don't merge this yet, I pushed the branch and realized I can't revert the PR back to draft.

I fixed the build for gcc 8, the solution with ##__VA_ARGS__ indeed works, it's a gcc extension that solved the problem before __VA_OPT__ was introduced.

However, I built the package with gcc 8 and installed it to test it, and it was acting up on me, giving me short motor cutouts. I recorded a short log, attaching in case you wanna have a quick look. The ERPM dips in the graph mark the cutouts. I wasn't able to make much out of it. The issue seemed to disappear once I wrote the package config and right now I can't reproduce it.

I'm not 100% positive it's the gcc 8 that caused it (and it's very weird indeed), but I haven't had anything like this for the whole beta period. I'll need to test it more...

2024-06-21_19-12-16.csv

lukash commented 4 months ago

The main thing I see when changing C compiler version is that I get a bunch of new warnings that I need to fix, newlib grows even bigger than it already was and I need to retest everything. As I already mentioned, the last few times I tried updating the compiler I even got worse performance and code size, so I didn't even bother testing everything then.

Getting worse performance (long-term, I guess short-time regressions are possible) would be weird, as one would think it's a focus of the developers.

I now remembered I use gcc 10 to compile the FW, as latter versions failed to compile. The reason is the binary is indeed ~60kB larger and didn't fit into the memory section. Compiling with gcc 13 gives basically the same size increase. I suspect this has a very distinct reason and I'll try to have a look at it later.

We are different and we change over time. 10 years ago I was always on a linux distro with the latest versions of gnome an all libraries and I was excited about the new features that came with it. Now I only use LTS-versions of Ubuntu and only upgrade when something breaks and forces me to do so. The big difference is that I get much more done today with this approach and use the time I save for other things I'm more excited about.

I totally understand... I am still trying though, some time ago I switched to NixOS, and while the learning curve was very steep (and not something I really wanted to sink time into), it's an incredibly powerful tool for developers. It's excellent at creating custom ad-hoc runtime environments which are 100% reproducible for development and other things. It's in particular very useful when you develop multiple projects for ARM, where each requires different compiler version. I can have different compilers for Refloat, bldc and other projects and switch them in a matter of about 3 seconds. Some of the new stuff coming out is actually useful :grinning:

lukash commented 4 months ago

So I think I got a HW issue with the hall sensors which just coincided quite well with me installing the package built with gcc 8. I'd prefer a couple more days of testing after I fix my board, I'll let you know.

lukash commented 4 months ago

I tested it a bit more and I think all is good. You can go ahead with the merge whenever you want. Thank you for the patience!

vedderb commented 4 months ago

Perfect, nice work!

vedderb commented 4 months ago

I had to do some minor tweaks for it to build and to render the readme correctly in VESC Tool, but that is done and refloat is available from the package store in VESC Tool now!

vedderb commented 4 months ago

There are the changes I did btw:

https://github.com/vedderb/vesc_pkg/commit/c0ebb6d98551a1787d27c4e47c0d527793d16397 https://github.com/vedderb/vesc_pkg/commit/cc8d32de004f4ca9acca3a810743617432863163 https://github.com/vedderb/vesc_pkg/commit/f3148807074122d16479bb83413d3ee5cb73db6e

vedderb commented 4 months ago

Can you try if the version from VESC Tool works correctly? I used the same GCC I use for the firmware, so I had to tweak the ld-script to fit the binary. There is 128k space in flash, so that should be fine. Hope nothing else broke with this version of GCC.

vedderb commented 4 months ago

One tip for checking if the README renders correctly is to use the markdown-editor in VESC Tool. bild

lukash commented 4 months ago

Oh, just noticed these comments here @vedderb. Sorry about the underscore typos in the macro, I tested it multiple times back and forth, no idea how I managed to mess it up in the end.

For the markdown, the fixes you had to do seem weird to me:

  1. I don't think the markdown was invalid before, do you have some special renderer that handles these differently?
  2. I tested the rendering the way you describe, with vesc_tool 6.02 and it looked correct and well.

About the compiler (and increasing the size in the linker script), I meant to tell you, but forgot. Your version of GCC 8 is probably quite old for that major version. The version of GCC 8 I have available actually supports the __VA_OPT__ macro. I assume it also creates somewhat small binaries, because I didn't run into the size issue with it (though it must be terribly close).

For the size I plan to make it configurable in my copy in https://github.com/lukash/vesc_pkg_lib if that'll be possible.

lukash commented 4 months ago

Can you try if the version from VESC Tool works correctly?

Just did a quick test, seems to work fine :+1:

vedderb commented 4 months ago

For the markdown, the fixes you had to do seem weird to me:

  1. I don't think the markdown was invalid before, do you have some special renderer that handles these differently?
  2. I tested the rendering the way you describe, with vesc_tool 6.02 and it looked correct and well.

It was correct, but the rendering in VESC Tool has some issues at the moment. I'm using a customized version of maddy https://github.com/vedderb/vesc_tool/tree/master/maddy

Before (in 6.02) I was using the native markdown-support in Qt, but it had some issues that I wasn't able to work around, especially that there was no way of rendering code blocks. With maddy I can customize the rendering myself and fix issues as I find them.

If you feel like having a go at fixing some of the issues please give it a try. I haven't had much time to work on it, the fixes I made to maddy are not very clean yet.

About the compiler (and increasing the size in the linker script), I meant to tell you, but forgot. Your version of GCC 8 is probably quite old for that major version. The version of GCC 8 I have available actually supports the __VA_OPT__ macro. I assume it also creates somewhat small binaries, because I didn't run into the size issue with it (though it must be terribly close).

I'm using gcc-arm-none-eabi-7-2018-q2, which is the version that the build script downloads when using it. I suspect that the difference in LTO-support causes the size difference.

For the size I plan to make it configurable in my copy in https://github.com/lukash/vesc_pkg_lib if that'll be possible.

I think you can simply update the linker script to a max size of 96k, that is what I did in the firmware repository. No need to limit it to 64k really as the flash-sector is 128k and if you only use lisp for loading the native library you have access to most of those 128k.

lukash commented 4 months ago

If you feel like having a go at fixing some of the issues please give it a try. I haven't had much time to work on it, the fixes I made to maddy are not very clean yet.

I'll have more priority work but point taken, I'll see if I can help out eventually.

I'm using gcc-arm-none-eabi-7-2018-q2, which is the version that the build script downloads when using it. I suspect that the difference in LTO-support causes the size difference.

Ouch, that's my bad once again then, I misread and thought you have GCC 8... :facepalm:

I think you can simply update the linker script to a max size of 96k, that is what I did in the firmware repository. No need to limit it to 64k really as the flash-sector is 128k and if you only use lisp for loading the native library you have access to most of those 128k.

Right. I originally wanted to make it configurable because you had that 64k limit and I wanted to preserve the original configuration as an option. I can just bump it up as well.