wren-lang / wren

The Wren Programming Language. Wren is a small, fast, class-based concurrent scripting language.
http://wren.io
MIT License
6.9k stars 552 forks source link

Num.fromString(" ") returns 0 #1131

Open Jackie365 opened 1 year ago

Jackie365 commented 1 year ago

Hi, I'm using wren v0.4.0 and I find

//whitespace
Num.fromString(" ") == 0

Should it be null ?

minirop commented 1 year ago

there probably should be a check to see if end == string->value (meaning nothing has been consumed) before skipping the whitespaces.

https://github.com/wren-lang/wren/blob/a4ae90538445a4f88dc965e9f11c768ae903ff0d/src/vm/wren_core.c#L619-L622

mhermier commented 1 year ago

I don't know why the string is striped here, but it doesn't seems the sensible way to do. Instead, a strip method should be provided instead and let the user decide to strip or not the string before parsing to number.

ruby0x1 commented 1 year ago

[Trim](https://wren.io/modules/core/string.html#trim()) is user space already fwiw

PureFox48 commented 1 year ago

Considering how often Num.fromString is used, it's surprising this bug hasn't been noticed before. Perhaps people had thought that, since strtod returns 0 for invalid numeric strings, this was the expected behavior.

Anyway, on the face of it, it seems (as @minirop said) that all we need to do is to check whether any part of the string has been consumed by strtod and return NULL if it hasn't.

If we do that, there should be no need to treat an empty string as a special case or to skip past any trailing whitespace.

So that would give us:

DEF_PRIMITIVE(num_fromString)
{
  if (!validateString(vm, args[1], "Argument")) return false;

  ObjString* string = AS_STRING(args[1]);

  errno = 0;
  char* end;
  double number = strtod(string->value, &end);

  if (errno == ERANGE) RETURN_ERROR("Number literal is too large.");

  // Check we have consumed any part of the string. Otherwise, it contains non-number
  // characters and we can't parse it.
  if (end == string->value) RETURN_NULL;

  RETURN_NUM(number);
}
CrazyInfin8 commented 1 year ago

Some time ago, I tried removing our dependency on the standard strtod and strtoll functions with #984.

https://github.com/wren-lang/wren/blob/3f5a16a78e9b82cca93f583b416f2a5a6cecca1d/src/vm/wren_utils.c#L232-L417

It certainly is a very extreme solution to this specific problem but thought it could be more intuitive this way and get rid of a few weird quirks of strtoXX.

With wrenParseNum we have easier access to different error messages than strtoXX, for example: if there were no digits when parsing. fromString currently discards this information but I find it could still be helpful for the compiler or if someone else wanted to use wrenParseNum in their host C code.


I think the biggest issue of #984 is that the results of wrenParseNum may be less accurate to strtoXX as the double's exponents gets too high or too low.

PureFox48 commented 1 year ago

Yes, I remember that PR.

If we introduce octal (0oxxx) and binary (0bxxx) literals, which I think we should, then strtod will always return 0.0 when passed those literals in string form even though it does currently deal properly with hexadecimal strings. For consistency, we'd therefore need something like your solution to deal with all three.