nodemcu / nodemcu-firmware

Lua based interactive firmware for ESP8266, ESP8285 and ESP32
https://nodemcu.readthedocs.io
MIT License
7.68k stars 3.13k forks source link

Wrong {} behavior #1434

Closed limpkin closed 8 years ago

limpkin commented 8 years ago

Expected behavior

{} operator to pack multiple return values into an array

Actual behavior

several values set to nil instead of their correct values

Test code

Provide a minimal and isolated test which will reproduce the problem.

1) string containing 60 uint32_t:
dada = "000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"
2) use struct to unpack them:
data_to_send = {struct.unpack("iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii", dada)}
3) print them individually:
for i=1,60 do print(data_to_send[i]) end
808464432
808464432
808464432
nil
nil
nil
nil
nil
nil
nil
nil
nil
nil
nil
nil
nil
nil
nil
808464432
808464432
...
4) use print for the unpack returns:
print(struct.unpack("iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii", dada))
808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   808464432   241

NodeMCU version

Both dev and main versions have this problem, as of right now.

Hardware

standard esp12-e flashed with nodemcu FW

limpkin commented 8 years ago

Bug present when there are more than 3 values to be returned. As a reminder: "aaaa" is supposed to be cast as 'a'_16777216 + 'a'_65536 + 'a'*256 + 'a' where 'a' is the ascii value of a = 97. The sum therefore is 1633771873

Erroneous behavior:

four_uints = "aaaaaaaaaaaaaaaa"
packed_data = {struct.unpack("iiii", four_uints)}
for i=1,4 do print(packed_data[i]) end
1633771873
1633771873
1633771873
nil

Showing that the bug actually happens:

four_uints = "aaaaaaaaaaaaaaaa"
print(struct.unpack("iiii", four_uints))
1633771873  1633771873  1633771873  1633771873  17
djphoenix commented 8 years ago

Ack this, found same behavior. Will search for corruption place... As a note - index and count of nils are not stable. I found case where fifth element of 14 was nilled, but others are Ok.

limpkin commented 8 years ago

Hello there,

Any update on this issue? I unfortunately can't do any high speed transfer through wifi if it isn't correct.

Thanks!

devsaurus commented 8 years ago

I'd guess that the returned number of pushed elements is wrong. Apparently this doesn't confuse print() but corrupts table construction. But I might be completely wrong...

pjsg commented 8 years ago

The code in the struct module looks good -- I added the struct module, but the code was essentially unaltered.....

devsaurus commented 8 years ago

Agreed, Philip. I experimented with the test case and got the impression that the issue is not specific to struct. Other functions should exhibit the same behavior and I intend to try this next.

devsaurus commented 8 years ago

The following test case should show the same "{} issue" if it was generic, but it doesn't:

spi.setup(1, spi.MASTER, spi.CPOL_LOW, spi.CPHA_LOW, 8, 8, spi.FULLDUPLEX)
packed_data = {spi.send(1, 1, 2, 3, 4, 5)}
for i=1,6 do print(packed_data[i]) end
5
255
255
255
255
255
pjsg commented 8 years ago

I was running a DEVELOP build, and I got

ASSERT@lapi.c(447): L->top < L->ci->top
LUA_API void lua_pushinteger (lua_State *L, lua_Integer n) {
  lua_lock(L);
  setnvalue(L->top, cast_num(n));
  api_incr_top(L);     // this is line 447
  lua_unlock(L);
}
jmattsson commented 8 years ago

There seems to be a missing luaL_checkstack() before the final push in b_unpack(), but that's not what's causing this issue. Are we managing to corrupt the stack and/or heap somehow here? The Non-OS SDK doesn't seem to have a canary between the two, so silent weirdness is a definite possibility that way...

pjsg commented 8 years ago

I only got one copy of the assert (and I don't think that there is any duplicate detection to suppress it).

pjsg commented 8 years ago

Changing the checkstack call to fix the overflow does not affect the outcome -- the nils still appear.

jmattsson commented 8 years ago

Yeah in this case the checkstack() does nothing - there's enough room on the stack already. It seems the actual issue is related to the table-creation and populating. I'm not all that familiar with the Lua VM internals though, that's more your and @TerryE's speciality. At least we have a small, easily reproducible test case to work with :)

pjsg commented 8 years ago

There is something horrible going on. I tried struct.unpack("cccccccccc", dada) and it threw more asserts than I knew what to do with.... As far as I can tell, "cccccc" should unpack one character at a time......

pjsg commented 8 years ago
dada = "000100020003000400050006000700000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"
pat = "ccccccccccccccccccccc"
d = {struct.unpack(pat, dada)}
for k,v in pairs(d) do print("index:"..k.." value:"..v) end
print(struct.unpack(pat, dada))

This a number of "c" unpack characters that really starts to cause problems. Between 3 & 19, the table that gets built only has 3 elements. At 19, it gets four elements (1,2,3 & 19)

Bizarrely, the code

function f()
dada = "000100020003000400050006000700000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"
pat = "ccccccccccccccccccccc"
d = {struct.unpack(pat, dada)}
for k,v in pairs(d) do print("index:"..k.." value:"..v) end
print(struct.unpack(pat, dada))
end

f()

(just wrapping in a function) -- this behaves differently (and still badly).

ASSERT@lgc.c(181): !iscollectable(&h->array[i]) || (ttype(&h->array[i]) == (&h->array[i])->value.gc->gch.tt)
ASSERT@lgc.c(181): !iscollectable(&h->array[i]) || (ttype(&h->array[i]) == (&h->array[i])->value.gc->gch.tt)
index:1 value:0
ASSERT@lgc.c(181): !iscollectable(&h->array[i]) || (ttype(&h->array[i]) == (&h->array[i])->value.gc->gch.tt)
ASSERT@lgc.c(181): !iscollectable(&h->array[i]) || (ttype(&h->array[i]) == (&h->array[i])->value.gc->gch.tt)
ASSERT@lgc.c(181): !iscollectable(&h->array[i]) || (ttype(&h->array[i]) == (&h->array[i])->value.gc->gch.tt)
index:2 value:0
ASSERT@lgc.c(181): !iscollectable(&h->array[i]) || (ttype(&h->array[i]) == (&h->array[i])->value.gc->gch.tt)
ASSERT@lgc.c(181): !iscollectable(&h->array[i]) || (ttype(&h->array[i]) == (&h->array[i])->value.gc->gch.tt)
index:3 value:0
index:4 value:1
index:5 value:0
index:6 value:0
index:7 value:0
index:8 value:2
index:9 value:0
index:18 value:0
index:19 value:0
index:20 value:index:3 value:0
index:21 value:0
index:22 value:22
0   0   0   1   0   0   0   2   0   0   0   3   0   0   0   4   0   0   0   5   0   22

If you run `f() again a few times, then the value of index:20 changes to other strings.....

pjsg commented 8 years ago

By comparing the OP_SETLIST implementation (in lvm.c) with lua5.2, it appears that the fix is just moving a line of code, I've tried it and it seems to work. I'll post a PR later on today......

jmattsson commented 8 years ago

Fix has now been merged into dev.

@limpkin Thanks for finding and reporting this one! Sorry it tooks us a while to get on top of it.

limpkin commented 8 years ago

Hello Johny,

No worries! I had other projects to take care of in the mean time... thanks for your time to all of you!

limpkin commented 8 years ago

Thanks again to everyone, it works perfectly and allows me to send high volumes of data with a few lines of code without setting up (once again) a complete build environment.

If anyone was to come accross this issue, these are the lines:

-- UDP Server Data receive  
function on_udp_packet_receive(srv, payload)  
-- Here we receive a 60 bytes array of int32_t, little endian  
-- We simply unpack the values and send them as is to the FPGA  
-- The SPI driver is set to 24 bits send  
if string.len(payload) == 240 then  
-- complete buffer update, timed at 5.8ms  
data_to_send = {struct.unpack("<!4iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii", payload)}  
-- delete last byte which contains parsing index  
data_to_send[61] = nil  
spi.send(1, data_to_send)     
devsaurus commented 8 years ago

Interesting application. Are there more details available online? Regarding Fpga type, bitstream (post)processing, udp data transfer...

Sounds like an elegant way to configure these over the air.

limpkin commented 8 years ago

Of course! Unfortunately I haven't started a write-up on my website as I'm still waiting for part of the non-electronics hardware. To quickly summarize, it's a custom clock made of 4 7-segment displays and 60 nixie tubes connected to the internet. The FPGA used is a spartan3 with onboard flash and a custom SPI slave controller implemented. It also features a 60 channels 14-bit PWM controller. The ESP8266 running NodeMCU is therefore in charge of sending the PWM values using SPI and update the 7-segment displays using I2C.

Here's the complete code in case you're interested:

--nixie_clock.lua
-- PWM ADDRs for SEGMENTS:
--     8
--    --
-- 6 |  | 9
--    --  5
-- 4 |  | 2
--    --  . 7
--    3
-- minus starting addr (2):
--     6
--    --
-- 4 |  | 7
--    --  3
-- 2 |  | 0
--    --  . 5
--    1

-- Below: for each number (0->9) a boolean array specifying if we should set a non zero value into the pwm register, following the PCA9624 pwm register order
number_to_reg_val = {{1,1,1,0,1,0,1,1},{1,0,0,0,0,0,0,1},{0,1,1,1,0,0,1,1},{1,1,0,1,0,0,1,1},{1,0,0,1,1,0,0,1},{1,1,0,1,1,0,1,0},{1,1,1,1,1,0,1,0},{1,0,0,0,0,0,1,1},{1,1,1,1,1,0,1,1},{1,1,0,1,1,0,1,1}}
digit_addresses = {0x08, 0x09, 0x0a, 0x0b}
displayed_digits = {0, 0, 0, 0}
start_digit_reg_addr = 2
dot_shown = false
sda, scl = 1, 2
led_noe = 4
-- Below: variables related to time getting
timeZone = 1 -- GMT+1 hours
hour = 0
minute = 0
second = 0
day = 0
month = 0
year = 0

-- returns the hour, minute, second, day, month and year from the ESP8266 RTC seconds count (corrected to local time by tz)
-- credit to kieranc in esp8266.com
function getRTCtime(tz)

    -- returns 2 if a given year is a leap year, else returns 1
    local function isleapyear(y)
       if ((y%4)==0) or (((y%100)==0) and ((y%400)==0)) then
            return 2 
       else 
            return 1 
       end
    end

    -- returns 366 is a given year is a leap year else returns 365
    local function daysperyear(y)
       if isleapyear(y)==2 then 
            return 366 
       else 
            return 365 
       end
    end        

    -- determines whether Daylight Saving Time is in effect for a given hour, day, month and year
    local function isDST(hour,day,month,year)

        -- returns the date for Nth day of the month.
        -- for example: nthDate(2016,3,0,2) will return the date of the 2nd Sunday in March 2016 (the 13th) when DST begins
        --              nthDate(2016,11,0,1) will return the date of the 1st Sunday in November 2016 (the 6th) when DST ends
        local function nthDate(year,month,DOW,week)

            -- returns a number corresponding to the day of the week (0-7 is Sunday to Saturday)
            local function dow(year,month,day)
                local t={0,3,2,5,0,3,5,1,4,6,2,4}
                if month<3 then
                    year=year-1
                end
                return (year + year/4 - year/100 + year/400 + t[month] + day) % 7
            end -- function dow

            targetDate=1
            firstDOW = dow(year,month,targetDate);
            while (firstDOW ~= DOW) do
                firstDOW = (firstDOW+1)%7
                targetDate=targetDate+1
            end
            targetDate = targetDate+(week-1)*7
            return targetDate
        end -- function nthDate

        local startDate=nthDate(year,3,0,2) -- day when DST starts this year (2nd Sunday in March)
        local endDate=nthDate(year,11,0,1)  -- day when DST ends this year (1st Sunday in November)
        if ((month>3) and (month<11)) or                        -- is the date between April and October?
            ((month==3) and (day>startDate)) or                 -- is the date in March but after the 2nd Sunday in March?
            ((month==3) and (day==startDate) and (hour>1)) or   -- is it after 2AM on the 2nd Sunday in March? 
            ((month==11) and (day<endDate)) or                  -- is the date in November but before the 1st Sunday?
            ((month==11) and (day==endDate) and (hour<2)) then  -- is it before 2AM on the 1st Sunday in November?
            return true
        else
            return false
        end
    end -- function isDST

    local monthtable = {{31,28,31,30,31,30,31,31,30,31,30,31},{31,29,31,30,31,30,31,31,30,31,30,31}} -- days in each month
    local secs=rtctime.get()
    local d=secs/86400
    local y=1970   
    local m=1
    while (d>=daysperyear(y)) do d=d-daysperyear(y) y=y+1 end   -- subtract the number of seconds in a year
    while (d>=monthtable[isleapyear(y)][m]) do d=d-monthtable[isleapyear(y)][m] m=m+1 end -- subtract the number of days in a month
    secs=secs-1104494400-1104494400+(tz*3600) -- convert from NTP to Unix (01/01/1900 to 01/01/1970) 
    local hr=(secs%86400)/3600   -- hours
    local mn=(secs%3600)/60      -- minutes
    local sc=secs%60             -- seconds
    local mo=m                   -- month
    local dy=d+1                 -- day
    local yr=y                   -- year
    if isDST(hr,dy,mo,yr) then hr=hr+1 end -- Daylight Saving Time is in effect, add one hour
    return hr,mn,sc,mo,dy,yr
end -- function getRTCtime

-- see if all the LED controllers are here
function check_led_controllers()
    result = true

    for i=1,4 do
        i2c.start(0)                                            -- start i2c
        c = i2c.address(0, digit_addresses[i], i2c.TRANSMITTER) -- see if something answers
        i2c.stop(0)                                             -- stop i2c
        if c == false then
            result = false
        end
    end

    return result
end

-- write a register contents in a PCA
function write_reg_PCA(digit_addr, reg, val)
  i2c.start(0)
  i2c.address(0, digit_addr, i2c.TRANSMITTER)
  i2c.write(0, reg)
  i2c.write(0, val)
  i2c.stop(0)
end

-- update all segment registers inside a pca
function update_segment_registers(digit_addr, reg_bool_array, pwm_value)
    i2c.start(0)
    i2c.address(0, digit_addr, i2c.TRANSMITTER)
    i2c.write(0, start_digit_reg_addr + 128)
    for i=1,8 do
        if reg_bool_array[i] == 0 then
            i2c.write(0, 0)
        else
            i2c.write(0, pwm_value)
        end
    end
  i2c.stop(0)
end

-- display the number array on the 7 segments digits
function display_number_array()
    for i=1,4 do
        number = displayed_digits[i]
        -- don't display a 0 first digit
        if i == 1 and number == 0 then
            i = 2
            number = displayed_digits[i]
        end
        -- display dot or not
        if i == 2 and dot_shown == true then
            number_to_reg_val[number+1][6] = 1
        end
        update_segment_registers(digit_addresses[i], number_to_reg_val[number+1], 0x10)
        number_to_reg_val[number+1][6] = 0
    end
end

-- UDP Server Data receive
function on_udp_packet_receive(srv, payload)
    -- Here we receive a 60 bytes array of int32_t, little endian
    -- We simply unpack the values and send them as is to the FPGA
    -- The SPI driver is set to 24 bits send
    -- The FPGA expect one byte MSB for tube addr, then one byte for PWM MSB, then one byte for PWM LSB
    --gpio.write(led_noe, gpio.HIGH)    
    if string.len(payload) == 240 then
        -- complete buffer update, timed at 5.8ms
        data_to_send = {struct.unpack("<!4iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii", payload)}
        -- delete last byte which contains parsing index
        data_to_send[61] = nil
        spi.send(1, data_to_send)   
    elseif string.len(payload) == 3 then
        -- single tube update 
        data_to_send = payload:byte(1)*65536 + payload:byte(2)*256 + payload:byte(3)
        spi.send(1, data_to_send)
    end
    --gpio.write(led_noe, gpio.LOW)
end

-------------------------
-- NIXIE CLOCK PROGRAM --
-------------------------
-- IO INIT
gpio.mode(led_noe, gpio.OUTPUT)                     -- Led output enable
gpio.write(led_noe, gpio.LOW)                       -- Enable output
i2c.setup(0, sda, scl, i2c.SLOW)                    -- init i2c

-- CHECK LED CONTROLLERS
res = check_led_controllers()
if res == true then
    print("LED controllers here!")
else
    print("Device not found !!")
end

-- INIT LED CONTROLLERS SETTINGS
for i=1,4 do
    -- Register auto-increment, normal mode, responds to all call
    write_reg_PCA(digit_addresses[i], 0x00, 0x81)
    -- LED driver 0->7 individual brightness and group dimming/blinking can be controlled through its PWMx register and the GRPPWM registers.
    write_reg_PCA(digit_addresses[i], 0x0C, 0xFF)
    write_reg_PCA(digit_addresses[i], 0x0D, 0xFF)
end

-- INIT SNTP
sntp.sync("2.ch.pool.ntp.org")

-- INIT SPI AT 20MHZ
spi.setup(1, spi.MASTER, spi.CPOL_LOW, spi.CPHA_LOW, 24, 4)

-- UDP SERVER
srv=net.createServer(net.UDP)
srv:on("receive", on_udp_packet_receive)
srv:listen(43210)

-- CLOCK TIMER
tmr.alarm(0,1000,tmr.ALARM_AUTO,
    function()
        -- get the hour, minute, second, day, month and year from the ESP8266 RTC
        hour,minute,second,month,day,year = getRTCtime(timeZone)
        -- display time
        displayed_digits[1] = hour/10
        displayed_digits[2] = hour%10
        displayed_digits[3] = minute/10
        displayed_digits[4] = minute%10 
        display_number_array()
        -- flip dot
        if dot_shown == true then
            dot_shown = false
        else
            dot_shown = true
        end
        --if year ~= 0 then
        --  -- format and print the hour, minute second, month, day and year retrieved from the ESP8266 RTC
        --  print(string.format("%02d:%02d:%02d  %02d/%02d/%04d",hour,minute,second,month,day,year))
        --else
        --  print("Unable to get time and date from the NTP server.")
        --end
    end
)

I'm hoping to make the writeup available on my website (www.limpkin.fr) in a month or so.