Closed tvonderhaar closed 3 years ago
If I read this right, you seem to have 2 issues, an exception and sscanf format not doing as you expect.
The exception is an unaligned read or write, meaning sscanf was asked to read an int/long into a non-32b aligned address, or a short into an odd address. The address attempted to be read/written is reported in EXCVADDR as 0x3ffffef6, and is deep within the SYS context local variables, so you have something very wrong with the app/state when that happens. If you're doing non-IRQ safe calls in an IRQ (since interrupt* are showing in the stack dump) or have a bad pointer, that would be one. W/o an MCVE, there's nothing we can do on our end, and I wouldn't bother looking at newlib sources because scanf/printf are pretty well exercised by every app out there. If you're still inclined, though, the sources are in https://github.com/earlephilhower/newlib-xtensa/blob/xtensa-2_2_0-lock-arduino/newlib/libc/stdio/nano-vfscanf.c
As for the scanf string, it is possible that newlib doesn't support a feature you're expecting. It's not as comprehensive as glibc (hence the nano prefix) so your best bet there is to find a simpler/different way to parse the free form string if you're finding something amiss. Alternatively, if you have patches for newlib you can send them upstream or to us in the prior repo.
There's a PR w/a pull of newlib 4.0.0 upstream, #7708, if you want to see if it's something added recently.
I just ran your MCVE on chip and on my Linux system. The 2 match, so I think your 2nd issue isn't really an issue.
ESP8266
*linep(32):lamp: name=living,sec_str:'lamp',op:':',cnt=5
*linep(44):' name=living',arg_cnt=2,arg_str:'name',op='=',cnt=5
*linep(54):'=living',sec:lamp,arg:'',
val:'',cnt=-1,cnt2=0,arg_cnt=0
Linux
$ /tmp/a
*linep(32):lamp: name=living,sec_str:'lamp',op:':',cnt=5
*linep(44):' name=living',arg_cnt=2,arg_str:'name',op='=',cnt=5
*linep(54):'=living',sec:lamp,arg:'',
val:'',cnt=-1,cnt2=0,arg_cnt=0
I found my exception (9) problem. It's real subtle. I peppered the code with printfs printing the addresss of the arg causing the exception. Look at this code snippet and try to guess my error.
void func(void){
int char_cnt;
for(bool more_args=TRUE,char_cnt=0;more_args;){
sscanf("%n",&char_cnt); //Exception(9) here.
}
I found this type of error causing problems last night in another piece of my code. The variable char_cnt is not just being set to '0' in the "for(;;)" introduction, it creates a variable local to the for statement of type bool. That type can and did align to a non 4 byte boundary. My error was thinking the for introduction created a local bool more_args and set the function scope variable char_cnt to 0. The comma operator in this context is not a serial assignment operator but a declaration operator seeing multiple variables of the same type. I don't think you can perform local variable creation and function scope initialization in the for statement introduction clause. This obviously is not a proper way.
On the second problem. I looked at your analysis and am a bit confused by your conclusion. Firstly, the return count from sscan() using the %n format specifier is supposed to be the count of the characters consumed by the sscanf() so far. Iooking at my code example and my output, two things are certain: 1) sscanf() consumes the char '=' in the input string ' name=living' as reflected in the output: *linep(44):' name=living',arg_cnt=2,arg_str:'name',op='=',cnt=5 op being set to '=' verifys sscanf() consumed it. 2) The '=' is the sixth charater in the string and therefor cnt should have been set to 6 and not 5. The fact that the Linux sscanf() function also improperly sets the cnt varaible to 5 instead of 6 does not verify that the function is working as expected but, that the source for both the ESP8266 and the Linux are both faulty and should be fixed. Don't you think?
Changing the 30-yr old POSIX definition of %n
is a bit above our pay grades here. Since there's no core issue, closing.
I guess when you wrote there was no core issue and implied I was trying to get you to change the POSIX specification, that was not very civil of you. I honestly believe I found a bug because sscanf() does not follow the 2001 POSIX specification found on the Linux page googled on the WEB. That page indicates at the bottom it conforms to POSIX.1-2001 and the behavior that both of us found did not meet the description of sscanf() on that page. You could have just reminded me that someone else maintains the source code. Don't worry I won't be asking for your help again.
Basic Infos
Platform
Settings in IDE
Problem Description
sscanf() is both improperly assigning %n format argument value and in a section of fairly complex code it cause a exception (9) defined as unaligned memory read or write.
Here is a small code segment that exhibits the improper consumed character count being assigned:
################################################
############################################### Here is my serial output:
linep(32):lamp: name=living,sec_str:'lamp',op:':',cnt=5 linep(44):' name=living',arg_cnt=2,arg_str:'name',op='=',cnt=5 *linep(54):'=living',sec:lamp,arg:'', val:'',cnt=-1,cnt2=0,arg_cnt=0
You can see on "line(44)" output cnt was assigned '5' and it should have been assigned '6'. I have other case where it assigns '1' or '0' after consuming 6 characters in the string. It seems code location dependent on how it fails. Here is a line out of my code that causes exception (9) unaligned assignment error:
if((arg_cnt=sscanf(linep,"%[ \t,]%20[^ =\t] %1[=]%n", arg_str,op,&cnt))!=2 || op[0]!='='){ ...} I've tried variations in the format string that cause improper assignment to the %n arg or exception 9s: 1) "%[ \t,]%20[^ =\t] %1s%[ \t]%n" cause exception 9 2) "%[ \t,]%20[^ =\t]%n %1[=]%n" two %n args causes exception 9 Yes I give it 2 integer pointers in the proper order. 3) "%*[ \t,]%20[^ =\t] %1s%n" Bad consumed char count of '1'
Here is the exception decode: ###################################################
###################################################
I tried to find the source code for nano-vfscanf() but it does not exist in the master board type distribution nor the git distribution. I found some source from a couple of newlib google searchs, but no line 423 in any of them did an assingment that would cause this error.
There really should be documentation on how to find the exact source used by this 2.7.4 distribution. Locally its only obj files in a library. Interestingly the exception decoder somehow knows the source code line number of the exception. Objects files in libc.a libraries might not be stripped and contain that info.
I'm really stuck and will have to rebuild this wheel without some help. Not looking forward to that.
Sincerely, Tom