scottchiefbaker / ESP-WebOTA

Simple web based Over-the-Air (OTA) updates for ESP based projects
MIT License
283 stars 38 forks source link

Error in math using millis & delay #8

Closed mshoe007 closed 4 years ago

mshoe007 commented 4 years ago

See this post someone put on reddit about using math to compare millis(): https://old.reddit.com/r/esp32/comments/g0jqux/being_stupid/fnbznvj/

This function incorrectly uses a signed number for 'last':

void WebOTA::delay(int ms) { int last = millis();

while ((millis() - last) < ms) {
    OTAServer.handleClient();
    ::delay(5);
}

}

The math involving wraparound as millis() increments to its largest possible value and goes back to zero only works if you do math using the exact same size and type of variables as the type and size returned by the function millis();

This can be challenging because you have to know the type and size of the millis(); function and hope it doesn't change between ESP8266 and ESP32 or whatever future platform you reuse this code on.

The best way to do it is to use the C built in method:

decltype(millis()) last = millis();

This ensures 'last' is the same size and type as the return value of millis, and that ensures this comparison always works, even if millis() rolls over to zero and starts counting up again during the interval: (millis() - last) < ms)

https://en.cppreference.com/w/cpp/language/decltype

[[ Also, while not a bug, it's better to used 'unsigned int' for things that can't be negative. The argument to your delay function is better off as just an 'unsigned int'. ]]

Thanks.

scottchiefbaker commented 4 years ago

This is very informative, I had no idea. The function argument should totally be unsigned, that's an easy change. Would it be better or simpler just to say:

unsigned int last = millis();

Instead of

decltype(millis()) last = millis();

That's pretty gnarly to read.

mshoe007 commented 4 years ago

Actually, you sort of proved my point. Did you know that the function prototype for millis in the current esp8622 core framework is: cores/esp8266/Arduino.h:unsigned long millis(void); It's the same in the current esp32 framework: ./cores/esp32/esp32-hal.h:unsigned long millis();

There's a reason it got added to the language - just this. So, you don't have to know what the function type is, and better, when some other developer changes the type of millis() 'unsigned long long' to deal with the rollover by making the variable larger so that it won't rollover in our lifetime, your code automatically also gets a larger size for 'last'

In other parts of your code, you used an unusual sort of construct to make the large HTML/javascript string. And, you documented it with a http link to the C language source. That's why I did the same in the bug report. The code is gnarly!

I did google search on 'arduino esp8266 millis source' to find the source, and I saw several hits on bug reports involving millis - lots of "you should change the type of the variable", and worse, bug reports on the arduino framework itself where the arduino developers are changing the type of millis() back and forth. ARRRG. This is what decltype() does - an attempt to protect your code from other people's changes.

I see at least one other arduino framework developer got frustrated and did this: ./cores/esp8266/PolledTimeout.h: using timeType = decltype(millis()); ./cores/esp8266/PolledTimeout.h: using timeType = decltype(ESP.getCycleCount());

Here's a really great use of decltype in Arduino's definition of min() and max() macros ./cores/esp8266/Arduino.h:#define _min(a,b) ({ decltype(a) _a = (a); decltype(b) _b = (b); _a < _b? _a : _b; }) ./cores/esp8266/Arduino.h:#define _max(a,b) ({ decltype(a) _a = (a); decltype(b) _b = (b); _a > _b? _a : _b; })

the 'obvious' macro definition of

define min(a,b) ((a < b) ? a : b)

does really unexpected things used like min(lightlevel++, 5) as lightlevel gets incremented twice. Or, possibly worse if a function call is used as an argument:

WorstCallResponseTime = max(WorstCallResponseTime, Call911AndWaitForAmbulance())

Yeow - you probably didn't want to call 911 twice!

The arduino versions of min/max store the macro parameters in local variables so they only get evaluated once. But what type do you use for the local variables? If you use some sort of int type, it fails if a or b are floating point! You'd have to write min/max macro variants for tons of different combinations of variable types. Using decltype() makes it all go away.

Sorry for going on so long.

'unsinged int last' is not good. It turns out to work, though, as sizeof(unsigned int) and sizeof(unsigned long) are both 4 on esp32 and esp8266 using the Arduino framework. That may not be true when there's an ESP64 chip. Or someone takes your code to use on some other chip in the arduino family where the types for that chip are not the same width, or more likely, the folks who ported the arduino framework to that chip made millis() have a different type than 'unsigned long'

Better to use 'unsigned long last;'

Best to use 'decltype(millis()) last;'

scottchiefbaker commented 4 years ago

OK you convinced me... I changed my code to decltype.

Just so I fully understand... we know from reading the code that millis() returns an unsigned long, and we want last to have the same type? If we use decltype on last it will always match whatever millis() returns. Especially in the future if things in the respective ESP32/ESP89266 libraries change.

What would the ultimate risk be if I left it unsigned long? Is the risk that ESP64 comes out and declares millis() as unsigned long long and then last isn't big enough to do the math properly?

mshoe007 commented 4 years ago

Yes to every question in middle paragraph.

Yes to your 2nd question in last paragraph which answers the question right in front of it.

That scenario (64bit millis() and 32-bit unsigned int long) fails when millis() is more than what a 32-bit number can hold, and the upper bits get lost when assigned to 32-bit unsigned int last. Then even before millis() can increment, the first test does a subtraction and instead of getting zero gets a huge number!

It's easier to imagine if millis() currently returns an 8-bit unsigned char and last is also 8-bit bit unsigned char. (Not very useful as milli() rolls over every quarter second!) Somebody realizes that and changes millis() to be an unsigned 16-bit variable so it's more useful. But, your code still has an 8-bit value. You think that's fine because you never need to time things more than 256 milliseconds.

But, things now break the moment millis() returns 256 and you assign that into last

last = millis() // 256 gets truncated to zero since last is only 8-bits wide

The very first test of "while ((millis() - last) < ms)" becomes: while ((256 - 0) < ms)

Oops. The while loop never executes

Thanks for the change.

scottchiefbaker commented 4 years ago

Perfect... I appreciate the explanation. C++ isn't my primary language, I'm still learning.

Clearly you have a much better grasp of the language than I do. I'm open to any other suggestions about my code as well :)

scottchiefbaker commented 4 years ago

@mshoe007 unrelated I ran in to some low level C++ stuff in another library I don't fully understand. Is there a way I can contact you and get your expert opinion on taking my C++ to the next level?