nodemcu / nodemcu-firmware

Lua based interactive firmware for ESP8266, ESP8285 and ESP32
https://nodemcu.readthedocs.io
MIT License
7.67k stars 3.13k forks source link

RFI: Integer vs Float in Lua 5.3 #3220

Open TerryE opened 4 years ago

TerryE commented 4 years ago

This has been discussed in various issues well as documented in the NodeMCU Reference Manual, the Lua 5.3 whitepaper and touched on in Programming in NodeMCU.

The consensus of debate was that the significant memory savings from moving from a 12 byte TValue to an 8 byte TValue were worth the edge issues than might arise from this. In essence this default build gives the RAM density of Lua 5.1 Integer builds but with the runtime convenience of being able to use floating point if needed. So:

To give you some examples:

> =_VERSION
Lua 5.3
> =1234567890
1234567890
> =12345678901
1.234568e+10
> =1234567890/1
1.234568e+09
> =1234567890//1
1234567890
> =12345678901//1
1.234568e+10
> =2<<30  -- Max negative
-2147483648
> =(2<<30)-1 -- underflow
2147483647
> -- 24 bit rounding for large expressions to int
> return '0x%x, 0x%x' % {1234567890, math.floor(1234567890/1)}
0x499602d2, 0x49960300

So I am requesting input from the active committers and any other contributors on the following:

nwf commented 4 years ago

I think this is a perfectly sensible position to take. The Lua manual even says

The option with 32 bits for both integers and floats is particularly attractive for small machines and embedded systems. (See macro LUA_32BITS in file luaconf.h.)

so we can just have the manual say "NodeMCU's Lua is [or defaults to being?] compiled with LUA_32BITS. This comes with the usual caveats about inexact conversions between integer and float values: integers outside (-2^24, 2^24) have no exact floating point representation, and, of course, most floats do not have integer representations."

We might wish to say, in PiN, something like "When programming in C, larger and smaller integer types (e.g., int64_t, uint8_t) are available beyond the Lua numeric system, which uses float and int32_t. When writing mathematical code, feel free to avail yourself of these other types, but remember that results must ultimately be presented back to Lua in types it understands. Of course, the other types may be opaquely encapsulated inside userdata objects, which may be especially attractive for accumulators or session objects."

Are doubles understood in C? I've never had reason to try.

pjsg commented 4 years ago

Yes doubles work fine in C. What came to me as a surprise was that the change on the dev branch was so backwards incompatible with master. Now we do support building dev with lua51, and I think that we support building the lua53 branch with doubles.

For someone who has working code based on master but wants to move forward, what should the advice be? As I see it, the choices are:

1: Stick with the lua51 build -- but that won't be supported for ever 2: Move to lua53 with doubles -- probably the most compatible option 3: Move to lua53 with floats -- will probably require careful attention to the lua application code, but gives you more RAM to play with 4: Move to lua53 with INTEGER only -- only an option if the existing build was INTEGER only.

What is our advice?

nwf commented 4 years ago

We should probably require that all in-tree modules do something sensible in all four build flavors you mention. We might suggest that developers migrating up move in a "1-2-3-4" direction: test with dev in Lua 5.1, if that works, test with dev in Lua 5.3 w/ doubles, &c. For new module development, it's probably best to start with 3 unless there's reason to need doubles in Lua (which seems unlikely?) and test with 4.

TerryE commented 4 years ago

Philip, forget 4. I am not going to do all the extra work to get an int-only version of Lua53 working and supported, when the current default does everything that an app developer would want.

The trouble with 2 is that you haven't been fielding the issues about lack of RAM, and the current default helps address these. A 64 bit configuration is still build option, but not the default.

You still haven't given a really world example of where a 32+32 configuration is going to cause material problems

TerryE commented 4 years ago

What came to me as a surprise was that the change on the dev branch was so backwards incompatible with master

@pjsg Philip, I've been brooding about his one and I really don't understand how dev is so backwards incompatible with release. Lua51 is largely unchanged, though there are some API regressions into the 5.1 API so that modules can compile cleanly with both, but these aren't exposed to Lua programmers.

Lua53 is a new capability and the trade-offs and feature differences have been debated on these issues over this last 12 months. No changes have been introduced that haven't been previously discussed and agreed by those committers engaged with the threads. You seem not to have tracked these discussions and perhaps therefore the outcomes are a surprise to you, but this is outside everyone else's control.

The major "feature breaks" arise from the Lua language and RTL changes 5.1 to 5.3 -- I didn't invent the sub-typing int vs numeric: this was introduced in standard 5.3, and because the standard implementation uses the same size for float and int types there will always be a loss of precision somewhere converting from int to float and back again. These was the same issue that I had to deal with over 50 years ago, when I first learnt FORTRAN IV.

Developers will have a choice: stay with the (frozen) 5.1 or take the new 5.3 with its new features and face some conversion bumps. IMO, there isn't a magic sweet spot where we can cherry pick all that we want to retain from 5.1, but at the same time have all of the nice new 5.3 features.

pjsg commented 4 years ago

I had always used the integer builds because they were faster and used less memory. This lead me to not truly appreciate the difference between integer and number (because they were the same). Then, if you built a float version, all the integers continued to work as you can represent an int in a double exactly.

The current master behavior of (say) rtcmem.write32 is not broken -- if you give it an integer value, it will store that value and will read it back correctly. When I switched to the new lua53 dev branch and started to work on one of my projects (a pendulum clock monitor), then I ran into the issue of the broken modules -- in particular timestamps were not represented correctly. [I had been using a float build I think]. As a specific example, the old code would generate records that look like:

{"at":1594615129.211176,"edge":[0.000000,0.812699]}

and send them via mqtt to a server for recording.

The at field is a timestamp of when the pendulum swung through an optical interruptor sensor. You could argue that the last 2 digits are probably spurious, but this has 14 significant figures. Another approach might be to start time when the platform boots and just record microsesconds since then. Unfortunately that overflows 2^24 well under a minute. Maybe just record milliseconds, but that overflows 2^24 within a day.

You are right -- I have been a lot more involved with this than most users of the lua firmware. If I'm surprised, then I suspect that many others will be surprised too.

It may be that the solution is to be upfront about the differences between the current dev and master and guide people to choosing the right option for their application.

TerryE commented 4 years ago

If you stay in integer then the overflow occurs at 2^31 -- same the old int builds. If you want mSec or uSec resolution, then why not just use the secs + usecs pair that you used with int builds, or even just do your own 64-bit build?

The architecture isn't broken; the reason that rtcmem isn't working is that we've just missed a couple of lua_tonumber calls that should be lua_tointeger. I guess that like the rest of this port, the only way to get this done is for me to do them. I get the message.

pjsg commented 4 years ago

Terry -- you don't need to make all of these changes -- I'm going to start in on some.

TerryE commented 4 years ago

BTW most of these assign the result of a lua_tonumber() to an int (or uint16 in the case of mcp4725.c) or lua_pushnumber() on an int argument. These are a sure sign that the wrong call is being used for both Lua 5.1 and Lua 5.3. It's just that this is bad practice in Lua 5.1 since the conversion is being done implicitly. The "number" calls should only operate on the type lua_Number data, and should not be used for integer values.

umisef commented 4 years ago

Speaking as mostly a "consumer" here --- we have a fair amount of LUA code in currently shipping products which would have to be very carefully audited; And there are several issues I can already think of from the top of my head. These are mostly, but not exclusively, related to representing time, and calculating time differences. But it is the issues I can't think of that worry me...

Given that a lot of this code is running on ESP32 modules, which have (comparatively) lots of RAM, we would probably end up configuring our NodeMCU builds for 64 bit numbers, both to save effort and to not have to worry about subtle problems hiding in rarely exercised code paths. So as long as that configuration remains supported, there would be at least one real-world use of it.

If the 64 bit build config were to disappear at some point, however, we'd have to weigh the benefits of newer NodeMCU releases against the cost of those code audits, and I suspect we'd end up just sticking with the latest version that offers a 64 bit build.

TerryE commented 4 years ago

Rather than have two threads discussing the same issue, and given that @pjsg has opened a second thread #3224 on this, I suggest that we pause discussions here and use Philip's thread.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.