esp8266 / Arduino

ESP8266 core for Arduino
GNU Lesser General Public License v2.1
15.99k stars 13.34k forks source link

sscanf() not assigning proper %n arg value and causing exception(9) #7726

Closed tvonderhaar closed 3 years ago

tvonderhaar commented 3 years ago

Basic Infos

Platform

Settings in IDE

Problem Description

sscanf() is both improperly assigning %n format argument value and in a section of fairly complex code it cause a exception (9) defined as unaligned memory read or write.

Here is a small code segment that exhibits the improper consumed character count being assigned:

################################################

#include <stdio.h>
#define CFG_LINE_LEN                 133

void setup() {
  // put your setup code here, to run once:
  bool     done,
           tooLong;
  u_long   sec_msk;
  char     line[CFG_LINE_LEN + 1] = "lamp: name=living",
                                    *linep = line,
                                     val_str[100],
                                     arg_str[22],
                                     sec_str[12],
                                     op[2];
  int      ln_cnt = 0,
           arg_cnt,
           cnt = 0,
           cnt2 = 0,
           slen;

  Serial.begin(115200);

  delay(4000);

  //get section name.
  op[0] = op[1] = '\0'; sec_str[0] = '\0';
  if ((arg_cnt = sscanf(linep, " %10[^ :\t] %1s%n", sec_str, op, &cnt)) != 2 ||
      op[0] != ':') {
    printf("Bad syntax line:'%s',op:'%s',cnt=%d\n", linep, op, cnt);
  } else {
    printf("*linep(%d):%s,sec_str:'%s',op:'%s',cnt=%d\n",
           __LINE__, linep, sec_str, op, cnt);
  }

  //get arg name.
  linep += cnt; //get to arg name.
  op[0] = op[1] = '\0';
  if ((arg_cnt = sscanf(linep, "%*[ \t,]%20[^ =\t] %1s%*[ \t]%n",
                        arg_str, op, &cnt)) != 2 || op[0] != '=') {
    printf("Bad syntax *line(%d):'%s',arg_cnt=%d,op:'%s',cnt=%d\n",
           __LINE__, linep, arg_cnt, op, cnt);
  } else {
    printf("*linep(%d):'%s',arg_cnt=%d,arg_str:'%s',op='%s',cnt=%d\n",
           __LINE__, linep, arg_cnt, arg_str, op, cnt);
  }

  //get arg value.
  linep += cnt; //get to arg value.
  val_str[0] = arg_str[0] = '\0'; op[0] = '\0'; op[1] = '\0'; cnt = -1;
  if ((arg_cnt = sscanf(linep, "%*[ \t,]%20[^ ,\t] %1s%*[ \t]%n",
                        arg_str, op, &cnt)) != 2 || op[0] != '=') {
    printf("*linep(%d):'%.80s',sec:%.10s,arg:'%.10s',\n"
           "val:'%s',cnt=%d,cnt2=%d,arg_cnt=%d\n",
           __LINE__, linep, sec_str, arg_str, val_str, cnt, cnt2, arg_cnt);
  } else {
    printf("*linep(%d):'%.80s',sec:%.10s,arg:'%.10s',\n"
           "val:'%s',cnt=%d,cnt2=%d,arg_cnt=%d\n",
           __LINE__, linep, sec_str, arg_str, val_str, cnt, cnt2, arg_cnt);
  }
}
void loop() {
}

############################################### Here is my serial output:

linep(32):lamp: name=living,sec_str:'lamp',op:':',cnt=5 linep(44):' name=living',arg_cnt=2,arg_str:'name',op='=',cnt=5 *linep(54):'=living',sec:lamp,arg:'', val:'',cnt=-1,cnt2=0,arg_cnt=0


You can see on "line(44)" output cnt was assigned '5' and it should have been assigned '6'. I have other case where it assigns '1' or '0' after consuming 6 characters in the string. It seems code location dependent on how it fails. Here is a line out of my code that causes exception (9) unaligned assignment error:

if((arg_cnt=sscanf(linep,"%[ \t,]%20[^ =\t] %1[=]%n", arg_str,op,&cnt))!=2 || op[0]!='='){ ...} I've tried variations in the format string that cause improper assignment to the %n arg or exception 9s: 1) "%[ \t,]%20[^ =\t] %1s%[ \t]%n" cause exception 9 2) "%[ \t,]%20[^ =\t]%n %1[=]%n" two %n args causes exception 9 Yes I give it 2 integer pointers in the proper order. 3) "%*[ \t,]%20[^ =\t] %1s%n" Bad consumed char count of '1'

Here is the exception decode: ###################################################


Exception 9: LoadStoreAlignmentCause: Load or store to an unaligned address
PC: 0x402130c5: __ssvfscanf_r at /home/earle/src/esp-quick-toolchain/repo/newlib/newlib/libc/stdio/nano-vfscanf.c line 423
EXCVADDR: 0x3ffffef6

Decoding stack results
0x40213150: __ssvfscanf_r at /home/earle/src/esp-quick-toolchain/repo/newlib/newlib/libc/stdio/nano-vfscanf.c line 467
0x40100e35: interrupt_handler(void*) at C:\Users\tvonderhaar\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.7.4\cores\esp8266\core_esp8266_wiring_digital.cpp line 173
0x40100a64: ets_post(uint8, ETSSignal, ETSParam) at C:\Users\tvonderhaar\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.7.4\cores\esp8266\core_esp8266_main.cpp line 177
0x40100d68: interrupt_handler(void*) at C:\Users\tvonderhaar\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.7.4\cores\esp8266\core_esp8266_wiring_digital.cpp line 135
0x40100d60: interrupt_handler(void*) at C:\Users\tvonderhaar\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.7.4\cores\esp8266\core_esp8266_wiring_digital.cpp line 135
0x40100a64: ets_post(uint8, ETSSignal, ETSParam) at C:\Users\tvonderhaar\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.7.4\cores\esp8266\core_esp8266_main.cpp line 177
0x40100c3c: millis() at C:\Users\tvonderhaar\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.7.4\cores\esp8266\core_esp8266_wiring.cpp line 188
0x40100c3c: millis() at C:\Users\tvonderhaar\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.7.4\cores\esp8266\core_esp8266_wiring.cpp line 188
0x40100642: uart_write_tmo(uart_*, char*, u_short, l_time) at C:\Users\TVONDE~1\AppData\Local\Temp\arduino_build_474964\sketch\uart_esp8266.cpp line 1363
0x40212ce0: _sungetc_r at /home/earle/src/esp-quick-toolchain/repo/newlib/newlib/libc/stdio/nano-vfscanf.c line 221
0x40212d70: __ssrefill_r at /home/earle/src/esp-quick-toolchain/repo/newlib/newlib/libc/stdio/nano-vfscanf.c line 269
0x40100c3c: millis() at C:\Users\tvonderhaar\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.7.4\cores\esp8266\core_esp8266_wiring.cpp line 188
0x4020ef9c: sscanf at /home/earle/src/esp-quick-toolchain/repo/newlib/newlib/libc/stdio/sscanf.c line 431
0x40100c3c: millis() at C:\Users\tvonderhaar\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.7.4\cores\esp8266\core_esp8266_wiring.cpp line 188
0x4020f058: __seofread at /home/earle/src/esp-quick-toolchain/repo/newlib/newlib/libc/stdio/stdio.c line 73
0x4020a94a: __yield() at C:\Users\tvonderhaar\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.7.4\cores\esp8266\core_esp8266_main.cpp line 112
0x40208802: uart_read_tmo(uart_*, char*, u_short, l_time) at C:\Users\TVONDE~1\AppData\Local\Temp\arduino_build_474964\sketch\uart_esp8266.cpp line 1483
0x40206540: CONFIG::read_cfg(unsigned char) at C:\Users\TVONDE~1\AppData\Local\Temp\arduino_build_474964\sketch\conf.cpp line 294
0x40201e02: lfs_dir_find at C:\Users\tvonderhaar\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.7.4\libraries\LittleFS\src\../lib/littlefs/lfs.c line 1167
0x4020985f: littlefs_impl::LittleFSFileImpl::LittleFSFileImpl(littlefs_impl::LittleFSImpl*, char const*, std::shared_ptr , int, long) at c:\users\tvonderhaar\appdata\local\arduino15\packages\esp8266\tools\xtensa-lx106-elf-gcc\2.5.0-4-b40a506\xtensa-lx106-elf\include\c++\4.8.2\ext/new_allocator.h line 104
0x40101388: malloc(size_t) at C:\Users\tvonderhaar\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.7.4\cores\esp8266\umm_malloc\umm_malloc.cpp line 552
0x402097fc: std::__shared_ptr ::operator=(std::__shared_ptr &&) at c:\users\tvonderhaar\appdata\local\arduino15\packages\esp8266\tools\xtensa-lx106-elf-gcc\2.5.0-4-b40a506\xtensa-lx106-elf\include\c++\4.8.2\bits/shared_ptr_base.h line 862
0x4020a80c: operator new(unsigned int) at C:\Users\tvonderhaar\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.7.4\cores\esp8266\abi.cpp line 39
0x40209a89: littlefs_impl::LittleFSImpl::open(char const*, fs::OpenMode, fs::AccessMode) at c:\users\tvonderhaar\appdata\local\arduino15\packages\esp8266\tools\xtensa-lx106-elf-gcc\2.5.0-4-b40a506\xtensa-lx106-elf\include\c++\4.8.2\ext/new_allocator.h line 120
0x40101388: malloc(size_t) at C:\Users\tvonderhaar\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.7.4\cores\esp8266\umm_malloc\umm_malloc.cpp line 552
0x4020a80c: operator new(unsigned int) at C:\Users\tvonderhaar\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.7.4\cores\esp8266\abi.cpp line 39
0x40209ac6: littlefs_impl::LittleFSImpl::open(char const*, fs::OpenMode, fs::AccessMode) at c:\users\tvonderhaar\appdata\local\arduino15\packages\esp8266\tools\xtensa-lx106-elf-gcc\2.5.0-4-b40a506\xtensa-lx106-elf\include\c++\4.8.2\bits/shared_ptr_base.h line 781
0x40202d18: lfs_stat at C:\Users\tvonderhaar\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.7.4\libraries\LittleFS\src\../lib/littlefs/lfs.c line 3146
0x40206bfd: fs::File::operator=(fs::File&&) at C:\Users\tvonderhaar\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.7.4\cores\esp8266/FS.h line 52
0x40207163: CONFIG::setup() at C:\Users\TVONDE~1\AppData\Local\Temp\arduino_build_474964\sketch\conf.cpp line 75
0x402075bc: setup() at C:\home\esp8266\Dimmer Project\sketch_esp8266_dimmer/sketch_esp8266_dimmer.ino line 31
0x4020aa20: loop_wrapper() at C:\Users\tvonderhaar\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.7.4\cores\esp8266\core_esp8266_main.cpp line 194

###################################################

I tried to find the source code for nano-vfscanf() but it does not exist in the master board type distribution nor the git distribution. I found some source from a couple of newlib google searchs, but no line 423 in any of them did an assingment that would cause this error.

There really should be documentation on how to find the exact source used by this 2.7.4 distribution. Locally its only obj files in a library. Interestingly the exception decoder somehow knows the source code line number of the exception. Objects files in libc.a libraries might not be stripped and contain that info.

I'm really stuck and will have to rebuild this wheel without some help. Not looking forward to that.

Sincerely, Tom

earlephilhower commented 3 years ago

If I read this right, you seem to have 2 issues, an exception and sscanf format not doing as you expect.

The exception is an unaligned read or write, meaning sscanf was asked to read an int/long into a non-32b aligned address, or a short into an odd address. The address attempted to be read/written is reported in EXCVADDR as 0x3ffffef6, and is deep within the SYS context local variables, so you have something very wrong with the app/state when that happens. If you're doing non-IRQ safe calls in an IRQ (since interrupt* are showing in the stack dump) or have a bad pointer, that would be one. W/o an MCVE, there's nothing we can do on our end, and I wouldn't bother looking at newlib sources because scanf/printf are pretty well exercised by every app out there. If you're still inclined, though, the sources are in https://github.com/earlephilhower/newlib-xtensa/blob/xtensa-2_2_0-lock-arduino/newlib/libc/stdio/nano-vfscanf.c

As for the scanf string, it is possible that newlib doesn't support a feature you're expecting. It's not as comprehensive as glibc (hence the nano prefix) so your best bet there is to find a simpler/different way to parse the free form string if you're finding something amiss. Alternatively, if you have patches for newlib you can send them upstream or to us in the prior repo.

There's a PR w/a pull of newlib 4.0.0 upstream, #7708, if you want to see if it's something added recently.

earlephilhower commented 3 years ago

I just ran your MCVE on chip and on my Linux system. The 2 match, so I think your 2nd issue isn't really an issue.

ESP8266

*linep(32):lamp: name=living,sec_str:'lamp',op:':',cnt=5
*linep(44):' name=living',arg_cnt=2,arg_str:'name',op='=',cnt=5
*linep(54):'=living',sec:lamp,arg:'',
val:'',cnt=-1,cnt2=0,arg_cnt=0

Linux

$ /tmp/a
*linep(32):lamp: name=living,sec_str:'lamp',op:':',cnt=5
*linep(44):' name=living',arg_cnt=2,arg_str:'name',op='=',cnt=5
*linep(54):'=living',sec:lamp,arg:'',
val:'',cnt=-1,cnt2=0,arg_cnt=0
tvonderhaar commented 3 years ago

I found my exception (9) problem. It's real subtle. I peppered the code with printfs printing the addresss of the arg causing the exception. Look at this code snippet and try to guess my error.

void func(void){
int char_cnt;

      for(bool more_args=TRUE,char_cnt=0;more_args;){
            sscanf("%n",&char_cnt); //Exception(9) here.
     }

I found this type of error causing problems last night in another piece of my code. The variable char_cnt is not just being set to '0' in the "for(;;)" introduction, it creates a variable local to the for statement of type bool. That type can and did align to a non 4 byte boundary. My error was thinking the for introduction created a local bool more_args and set the function scope variable char_cnt to 0. The comma operator in this context is not a serial assignment operator but a declaration operator seeing multiple variables of the same type. I don't think you can perform local variable creation and function scope initialization in the for statement introduction clause. This obviously is not a proper way.

On the second problem. I looked at your analysis and am a bit confused by your conclusion. Firstly, the return count from sscan() using the %n format specifier is supposed to be the count of the characters consumed by the sscanf() so far. Iooking at my code example and my output, two things are certain: 1) sscanf() consumes the char '=' in the input string ' name=living' as reflected in the output: *linep(44):' name=living',arg_cnt=2,arg_str:'name',op='=',cnt=5 op being set to '=' verifys sscanf() consumed it. 2) The '=' is the sixth charater in the string and therefor cnt should have been set to 6 and not 5. The fact that the Linux sscanf() function also improperly sets the cnt varaible to 5 instead of 6 does not verify that the function is working as expected but, that the source for both the ESP8266 and the Linux are both faulty and should be fixed. Don't you think?

earlephilhower commented 3 years ago

Changing the 30-yr old POSIX definition of %n is a bit above our pay grades here. Since there's no core issue, closing.

tvonderhaar commented 3 years ago

I guess when you wrote there was no core issue and implied I was trying to get you to change the POSIX specification, that was not very civil of you. I honestly believe I found a bug because sscanf() does not follow the 2001 POSIX specification found on the Linux page googled on the WEB. That page indicates at the bottom it conforms to POSIX.1-2001 and the behavior that both of us found did not meet the description of sscanf() on that page. You could have just reminded me that someone else maintains the source code. Don't worry I won't be asking for your help again.