GNUAspell / aspell

http://aspell.net
GNU Lesser General Public License v2.1
243 stars 53 forks source link

Patch: strtoi_c() bug fix #290

Open aspell-helper opened 19 years ago

aspell-helper commented 19 years ago

Jose Da Silva josedasilva\@sf created a patch on 2005-05-18 02:40:09 UTC (Orig. from https://sourceforge.net/p/aspell/patches/108)

These modifications should allow strtoi_c() to correctly
return a long value between -2billion and +2billion.

The current routine ignores overflow and will return an
incorrect value if a very large number is present, for
example: 12345678901234567890.

The current routine will not return a correct negative value as written right now.

Another problem is that there may not be a number present, but the current routine modifies endptr if
ascii spaces are found or if a sign -/+ value without a
number.

With the new suggestions, now endptr will only point to
correct endptr if a number is found, otherwise endptr points
remains at beginning of str.
This allows user to perform a test for errors:
value = strtoi_c(x, &y);
if (x == y) then error;

...or to retry test using a more powerful routine:
value = strtoi_c(x, &y);
if (x == y) then extralongvalue = strtoextralongi_c(x, &y);

The attached diff file is based on latest strtonum.cpp in
CVS (may17,2005)

--------suggested updated routine (below)-------
long strtoi_c(const char * npter, const char ** endptr) {

char * str = (char*)npter;
long num = 0;
char sign = 0;

*endptr = str;
while (asc_isspace(*str)) {
str++;
}
if (*str == '-' || *str == '+') {
sign = *(str++) == '-' ? -1 : 0;
}
while (*str >= '0' && *str <= '9') {
num = num * 10 + (long)(*str - '0');
if (num < 0) {
*endptr = (char*)npter;
return -1;
}
*endptr = ++str;
}
if (sign) num = -num;
return num;
}

aspell-helper commented 19 years ago

Jose Da Silva josedasilva\@sf commented on 2005-05-18 03:00:48 UTC

Logged In: YES user_id=1138138

x.c is a mini test routine to try out the various conditions without having to completely recompile aspell, for example: str1 < -2billion, str1 > +2billion, str1 == "\t\r +";

to compile: gcc -o x x.c

NOTE: I noticed that: long strtoi_c(const char * npter, char ** endptr) { appears to agree better with the compiler versus: long strtoi_c(const char * npter, const char ** endptr) {

aspell-helper commented 19 years ago

Jose Da Silva josedasilva\@sf commented on 2005-05-18 17:59:30 UTC

Logged In: YES user_id=1138138

Added "if" test to verify a number truly exists beyond -/+ so that testing for "\t\r +" will test as false.... basically added: + if (*str < '0' || *str > '9') + return -1;

Also submitted as "diff -u" instead of "diff -b -u"

aspell-helper commented 19 years ago

Kevin Atkinson kevina\@sf updated the issue on 2005-05-20 00:23:46 UTC

aspell-helper commented 17 years ago

Jose Da Silva josedasilva\@sf commented on 2007-10-20 05:41:31 UTC

Logged In: YES user_id=1138138 Originator: YES

bug 1204046 is for a function which only expects +ve values and should be closed or deleted if you want this strtoi_c() function to also accept negative values as well. That should close 2 patches (this one and 124046). If you want to keep 124046, this strtoi_c() function needs a method to return an error.

I noticed an error in the patch I sent, therefore I'm attaching a new patch. The new patch simply returns 0 if num overflows past 'long'. The patch is against 1.4 in CVS: http://cvs.savannah.gnu.org/viewvc/aspell/aspell/common/strtonum.cpp?revision=1.4&view=markup

File Added: strtonum.cpp.diff

[Note: #​1204046 = https://sourceforge.net/p/aspell/patches/109/ = #291]

aspell-helper commented 17 years ago

Kevin Atkinson kevina\@sf commented on 2007-10-20 06:05:04 UTC

Logged In: YES user_id=6591 Originator: NO

strtoi_c should follow the sementatics of strtol when the locale is "C" and the base is 10. (Not sure why I named it strtoi_c instead of strtol_c).

Please figure out what they are and send me a patch to that effect.

aspell-helper commented 17 years ago

Jose Da Silva josedasilva\@sf commented on 2007-10-20 08:11:49 UTC

Logged In: YES user_id=1138138 Originator: YES

strtoi_c should follow the sementatics of strtol when the locale is "C" and the base is 10.

May need more thought - eg. suppose you spell-check German while locale is French. New patch includes MIN & MAX, don't know how you want to convey error considering the calling routines don't check for errors yet.

(Not sure why I named it strtoi_c instead of strtol_c).

A lot of developers assumed that 32bit was going to stay - then came 64bit and now you need to check all your references to ints and use char, short, long or longlong. char sign should help for register-starved CPUs like pre-pentiums and embedded-type-CPUs.

http://www.cplusplus.com/reference/clibrary/cstdlib/strtol.html On success, the function returns the converted integral number as a long int value. If no valid conversion could be performed, a zero value is returned. If the correct value is out of the range of representable values, LONG_MAX or LONG_MIN is returned, an the global variable errno is set to ERANGE.

http://linux.about.com/library/cmd/blcmdl3_strtoll.htm appears to mention a EINVAL File Added: strtonum.cpp.diff

aspell-helper commented 17 years ago

Kevin Atkinson kevina\@sf commented on 2007-10-20 08:21:41 UTC

Logged In: YES user_id=6591 Originator: NO

May need more thought - eg. suppose you spell-check German while locale is French.

My version is suppose to be locale independent! Ie the default or "C" locale

When I said follow the same semantics of the strtol I meant it, thus use errno.

aspell-helper commented 17 years ago

Jose Da Silva josedasilva\@sf commented on 2007-10-22 05:06:07 UTC

diff -rub aspell-0.61 for strtoi_c() strtonum.cpp.diff

aspell-helper commented 17 years ago

Jose Da Silva josedasilva\@sf commented on 2007-10-22 05:06:09 UTC

Logged In: YES user_id=1138138 Originator: YES

My version is suppose to be locale independent! Ie the default or "C" locale

Okay, makes more sense. it looked somewhat tempting to simply do a call to strtol

When I said follow the same semantics of the strtol I meant it, thus use errno.

Some documentation definitions: http://www.opengroup.org/onlinepubs/009695399/basedefs/errno.h.html It appears EINVAL and ERANGE should be given.

I know if you pass parameters through the stack, it is thread safe, I'm not too sure to trust passing parameters via some other method, so modified routine to pass values through the stack .... strtoi_c(startptr,endptr,errno)

Looking at glib, strtol has "char** endptr" but not "const char** endptr". I modified the routine to also use char** as done in glib's strtol() function.

Compiles okay, but still needs to be tested and right now, not sure how, or have time to try yet. Let me know if looks okay - if you want to work with it /modify it, sure, go ahead.

You'll notice the test for out of range is divided in 2 parts. If you use char as a value range of 0...255 or -128...+127 then you see: (num < 0) checks positive values (0...127) and negative values (-127..0) but can't test for -128, to test -128, you need to do (num-1 < 0) for the negative range to include -128..-1 concept should work similar for "long int"

File Added: strtonum.cpp.diff

aspell-helper commented 17 years ago

Kevin Atkinson kevina\@sf commented on 2007-10-22 05:11:04 UTC

Logged In: YES user_id=6591 Originator: NO

Once Again: When I said follow the same semantics of the strtol I meant it

Using errno is thread safe, otherwise lots of other things will break.

aspell-helper commented 13 years ago

Kevin Atkinson kevina\@sf updated the issue on 2011-06-26 04:36:14 UTC