avrdudes / avr-libc

The AVR-LibC package provides a subset of the standard C library for AVR 8-bit RISC microcontrollers.
https://avrdudes.github.io/avr-libc/
Other
250 stars 54 forks source link

Improvement(s) for eu_dst #956

Closed 3-14152898 closed 5 months ago

3-14152898 commented 5 months ago

As already reported, function eu_dst() in time.h incorrectly sets DST at 2:00 UTC instead of 1:00 UTC (see https://en.wikipedia.org/wiki/Daylight_saving_time_by_country). Conversely it uses division and modulo operations which are unnecessary besides CPU intensive — the call to gmtime() is intensive enough and provides everything required for the computation.

Attached is my suggestion as a patch — which Github still pretends to refuse even though it ends with .patch so here it is as a ZIP : eu_dst.h.patch.zip . I also implemented a demo sketch in Wokwi for illustration purposes: https://wokwi.com/projects/394687644304051201

Cheers.

dl8dtl commented 5 months ago

Well, the usual method to submit patches to Github is to fork the repository, prepare everything in a branch in your forked copy, and after uploading, follow the steps to create a pull request. Keeping to this standard procedure makes it easier for maintainers to look at your work, optionally suggest modifications, and eventually integrate everything – all without fiddling with individual patch files.

sprintersb commented 5 months ago

Added the proposed patch. However, implementing functions in headers is usually frowned upon; so maybe this warrents a new module?

sprintersb commented 5 months ago

...moved eu_dst into its own module.

However, usa_dst suffers from the same problems I guess? I.e. wrong / bogus computation and implements code in a header.

sprintersb commented 5 months ago

Done. Also moved usa_dst into its own module.

3-14152898 commented 5 months ago

Well, the usual method to submit patches to Github is to fork the repository, prepare everything in a branch in your forked copy, and after uploading, follow the steps to create a pull request. Keeping to this standard procedure makes it easier for maintainers to look at your work, optionally suggest modifications, and eventually integrate everything – all without fiddling with individual patch files.

Noted. While I'm (moderately) used to Git, I have never submitted any pull request for the simple reason I don't know how it works. (FTR I only created this account because at that time Xfce Geany moved their entire code to Github and my bug reports were moved accordingly.) I'll study that procedure then.

Thanks for accepting my patch.

Have a great day.

3-14152898 commented 5 months ago

Added the proposed patch. However, implementing functions in headers is usually frowned upon; so maybe this warrents a new module?

Not necessarily. It is not so unusual, for instance, in C++ to have implementations use only one single header file. The same can be achieved in C and GCC __attribute__((always_inline)) could be used in order to enforce inlining.

sprintersb commented 5 months ago

It is not so unusual, for instance, in C++ to have implementations use only one single header file. The same can be achieved in C and GCC __attribute__((always_inline)) could be used in order to enforce inlining.

C++ works differently. It uses features like comdat to make sure multiple instances of the same template won't result in multiple instances in the binary. We don't have that in C.

(Always) inline is appropriate for (very) small functions, but not for functions that large.

sprintersb commented 5 months ago

Conversely it uses division and modulo operations which are unnecessary besides CPU intensive — the call to gmtime() is intensive enough and provides everything required for the computation.

Question: gmtime_r looks much more CPU intense than the original implementation of eu_dst. So the only purpose of the new implementaton is to save code space, not CPU load?

3-14152898 commented 5 months ago

It is not so unusual, for instance, in C++ to have implementations use only one single header file. The same can be achieved in C and GCC __attribute__((always_inline)) could be used in order to enforce inlining.

C++ works differently. It uses features like comdat to make sure multiple instances of the same template won't result in multiple instances in the binary. We don't have that in C.

(Always) inline is appropriate for (very) small functions, but not for functions that large.

Well, I'm aware there are differences (being a C++ developer myself). I just raised it as an example as to why single-header functionalities are not always frowned upon. (IMHO they shouldn't but that is only my opinion.)

And as for inlining or not, the compiler has the final word, depending on the context of course. All in all, I don't see many reasons to avoid using it. But that's only my position, of course. You are the ones behind the wheel ;-) .

(FWIW it is true that I was a little surprised to see DST functions implemented in a header file but as it is not supposed to be called directly from the application code but from the library internals, it seemed to make sense to me.)

3-14152898 commented 5 months ago

Conversely it uses division and modulo operations which are unnecessary besides CPU intensive — the call to gmtime() is intensive enough and provides everything required for the computation.

Question: gmtime_r looks much more CPU intense than the original implementation of eu_dst. So the only purpose of the new implementation is to save code space, not CPU load?

I haven't check the implementation of gmtime_r() (thoroughly) but I can assure you the goal is to reduce CPU load somehow. Since modulo and division are emulated on 8-bit AVR, therefore relatively expensive, a simpler algorithm is beneficial, regardless of how gmtime_r() is implemented. My point being that enough clock cycles are consumed in that function, so let's not waste more than necessary.

Consequently though the code may save some space and might happen to be more readable — it's true that I had to spend quite some time on the original code to understand what it did and why that way... until I realized gmtime_r() makes the calculation way simpler.

sprintersb commented 5 months ago

I haven't check the implementation of gmtime_r() (thoroughly) but I can assure you the goal is to reduce CPU load somehow. Since modulo and division are emulated on 8-bit AVR, therefore relatively expensive, a simpler algorithm is beneficial, regardless of how gmtime_r() is implemented.

NACK. gmtime_r() is sprincled with division and module, much more than the initial eu_dst implementation used. So CPU load is going up. And code size will only go down when both functions are used in the application. I don't know enough about time applications to judge whether that is a reasonable assumption though.

My point being that enough clock cycles are consumed in that function, so let's not waste more than necessary.

Smaller code does not imply less CPU cyles. In particular when functions like gmtime_r() are called that are very expensive, much more expensive than the original eu_dst implementation.

Consequently though the code may save some space and might happen to be more readable — it's true that I had to spend quite some time on the original code to understand what it did and why that way...

Readable and comprehensible soures are nice, but what's paramount for the end user is functionality and performance.

We have many sources in the code base that are not easy to digest, like assembly code for floating point.

Increasing their understandability could be achieved by providing better comment and documentation, for example. Just because you understand it better is not a reason for introducing expensive code...

I am more and more inclined to revert that stuff.

3-14152898 commented 5 months ago

I don't understand your point, I mean really.

The initial implementation does use gmtime_r()¹. It also does use division and modulo, both of which are costly on 8-bit AVR, don't you agree?

My suggestion completely removes that for a simpler... no, a much less cumbersome decision algorithm with no division, no modulo, only subtraction and compares, obviously. I have compiled it and it causes a significant reduction in code size — because, yeah, I did compare the assembly differences. My implementation is not only faster (by a tiny bit, it's very likely) but also smaller in code size...

Did you actually try it? If you want me to provide evidence, I'm all yours. Maybe an excerpt from the relevant assembly will speak for itself better than a thousand (lame) words from me.

Increasing their understandability could be achieved by providing better comment and documentation, for example.

I concur. With that said, I noticed the last comment line is partly incorrect. I shall change it.

Here you go: eu_dst.h.patch.zip

I am more and more inclined to revert that stuff.

Needless to say I am becoming more and more disappointed about that :( .

I am advocating for my stuff, I think you understand that. You may be reluctant to add my suggestion but please, try it and see for yourself. I'm totally inclined to changing a few things if necessary.

Please try it for yourself and make yourself an opinion about the end result. Otherwise like I said — I don't know why I need to convince you, IMHO the code should speak for itself as to the actual results — if you want me to provide evidence, I will.

¹ The call to gmtime_r() is the only thing I kept from the original implementation, so saying I provide more expensive code just because of that is totally unfair, let alone totally incorrect.

3-14152898 commented 5 months ago

And in case you might wonder here is my compilation settings:

I have not tried older compilers (they are not all available and I'm using Manjaro Linux).

3-14152898 commented 5 months ago

Also note that regardless of what you've said so far, the current implementation of eu_dst() is wrong as it doesn't switch at the right time anyway... (There are two open submissions dated 2022 on that regard BTW.)

3-14152898 commented 5 months ago

Okay, I have benchmarked my suggestion against the original implementation of eu_dst on an Arduino Uno r3 running at 16MHz. Compiler settings are identical to what I've posted earlier. I'm using timer 1 with a /8 prescaler. The call I'm timing is the following:

eu_dst(&timer, nullptr);

with timer being a variable initialized as follows:

time_t timer = time(nullptr);

I consistently get 534 ticks for the original version against 469 with mine, i.e. ~266µs for the former and ~235µs or a 10% to 12% speed gain. You're free to consider that or not, as well as trust me or not...

sprintersb commented 5 months ago

The initial implementation does use `gmtime_r().

WAAAHHH I missed that. Sorry for all that noise.

3-14152898 commented 5 months ago

Well, happens to the best. We're all humans after all ;-) .