wolfgangj / bone-lisp

The Bone Lisp programming language
Other
321 stars 21 forks source link

initial support for floating-point numbers #12

Closed dubek closed 8 years ago

dubek commented 8 years ago

Support reading floating-point numbers, printing them and num->str support. Internally, the C float is stored in the high 32-bits of an any variable.

Added core functions integer? and float?.

Still missing support of:

NOTE: I wouldn't merge it as-is because floats are quite useless without the arithmetic and comparison. I'm not sure which approach should be taken:

  1. Should + work both for int, float or mixed?
  2. Should (=? 123 123.0) be true?
  3. Should we introduce new functions float=? , float+ , etc?
  4. Should we add int->float and round ceil floor ?
wolfgangj commented 8 years ago

Re approach:

I think everyone will get it wrong all the time if + does not work for both ints and floats, so it should work for both. It should be optimized for using ints since that is the common case. If it cannot be done without losing performance for int-only operations, it might be a good idea to also have int+ etc. (By the way, integer? is a rather long name, Bone generally keeps names short, so int? would be better.)

int->float can then be trivially done in the prelude as | x (+ x 0.0), but I'm not even sure if this is ever needed; I can't remember ever needing it. round ceil floor on the other hand are pretty important. Maybe round should have the alias float->int?

Exponential notation can be done with strtod (from stdlib.h), so you don't have to write the code yourself. If you would like to do it yourself, I would suggest implementing it as a reader macro.

wolfgangj commented 8 years ago

Re code:

bprintf("#{?}"); should be an error instead, I think - if this ever happens, this is an internal error of the interpreter, so it seems unusual if it just prints a value and continues. Even just calling abort() would probably be fine.

Everything else looks good.

wolfgangj commented 8 years ago

How do you plan to implement / (division)? Will it always return a float? Return an int if all arguments are ints? Return an int whenever the result actually is an integer?

dubek commented 8 years ago

Hmmm. I think as follows:

What do you think?

wolfgangj commented 8 years ago

I think that I've been thinking about these kinds of questions for too long already. :)

The behaviour of mod is out of question, it should be as you describe.

Using / and int/ sounds okay in principle (except that int/ should also accept more than 2 arguments, because... why wouldn't it? The code for this even exists already). But I'm not entirely sure. Personally I would expect / to return an int if called with int arguments. This is also the behaviour in C, C++, Ruby, Emacs Lisp and Python2. It would also be consistent with the other subs, e.g. + returns an int if all its arguments are ints.

Python3 and Perl5 behave as you describe, so that seems to be considered acceptable, too.

dubek commented 8 years ago

I added commits adding arithmetic and comparison and round, floor, ceil, trunc (and float->int, int->float). I think this is what we discussed so far. I went with the C/C++/Ruby style for division - if you divide two (or more) ints you get integer division.

Should I refactor all the number-related tests from tests/base.bn to tests/numbers.bn ?

wolfgangj commented 8 years ago

It's fine to keep the tests for numeric subs in tests/base.bn.

The code seems fine for now (I'll look in more detail into it later - I really need to get some sleep first). But it has quite a bit of potential for optimization - the calls to is_list_of_ints make us visit each list twice, which is not necessary. You don't need to change this, we can always optimize these things later, and I guess there are places with larger optimization potential in the evaluator anyway.

wolfgangj commented 8 years ago

by the way, do you think this is ready for merging now?

dubek commented 8 years ago
  1. I think it is ready for merging.
  2. I still don't have parsing for exponent format (1.3E9, -4.5E-5). Can be added in the future. If we want to use strtof(3) we'll need some changes in chars2num.
  3. Re: optimization: as far as I understand, for any argument you'll need to (a) check whether it is int or float, and then (b) add to the total integer value or to the total float value. You can do it in one loop but it'll still be 2N operations. I think I don't understand your optimization suggestion.
wolfgangj commented 8 years ago
  1. Okay, great.
  2. Yes, that can wait.
  3. The expensive part is not checking and adding, but following the cdr-pointers. (That's why working with linked lists is generally considered slow.)
dubek commented 8 years ago

Good point about the linked list traversal. So is this better (only one foreach)?

DEFSUB(fullplus) {
  int64_t ires = 0;
  float fres = 0.0;
  bool all_ints = true;
  foreach(n, args[0])
    switch (get_num_type(x)) {
    case t_num_int:
      if(all_ints)
        ires += any2int(n);
      else
        fres += any2int(n);
      break;
    case t_num_float:
      if(all_ints)
        fres = ires + any2float(n);
      else
        fres += any2float(n);
      all_ints = false;
      break;
    default:
      abort();
    }
  last_value = all_ints ? int2any(ires) : float2any(fres);
}

Basically it sums up ints in ires until a float is encountered. Then it switches over to fres and sums all the remaining as floats.

wolfgangj commented 8 years ago

You can even avoid the conditional branch on the all_ints variable (though branch prediction might safe your day, so it might not make much of a difference). The following is untested code, but should illustrate the idea.

DEFSUB(fullplus) {
  int64_t ires = 0;  // as long as we encounter only ints, we operate in "int mode"
  foreach_cons(c, args[0]) {
    any n = far(c);
    switch(get_num_type(n)) {
    case t_num_int:
      ires += any2int(n);
      break;
    case t_num_float: {  // switching to "float mode"
      float fres = any2float(n)+(float)ires;
      foreach(x, fdr(c))
        switch(get_num_type(x)) {
        case t_num_int:
          fres += (float)any2int(x);
          break;
        case t_num_float:
          fres += any2float(x);
          break;
        default: abort();
        }
      last_value = float2any(fres);
      return;   // end of code for "float mode"
    }
    default: abort();
    }
  }
  last_value = int2any(ires);
}
dubek commented 8 years ago

I like it. The "float mode" can be even shorter with my anynum2float helper. I'll change the code (and solve the merge conflicts).

wolfgangj commented 8 years ago

Great, thanks.

dubek commented 8 years ago

OK I changed the full arithmetic operations (still local).

Things get nastier in the logical operators, this is what I have:

any full_num_gtp_floats(float first, any xs) {
  float f = first;
  foreach(x, xs) {
    float g = anynum2float(x);
    if(f <= g)
      return BFALSE;
    f = g;
  }
  return BTRUE;
}

DEFSUB(full_num_gtp) {
  last_value = BTRUE;
  if(is_nil(args[0]))
    return;
  any first_n = far(args[0]);
  switch (get_num_type(first_n)) {
  case t_num_int: {
    int64_t n = any2int(first_n);
    foreach_cons(c, fdr(args[0])) {
      any x = far(c);
      switch (get_num_type(x)) {
      case t_num_int: {
        int64_t m = any2int(x);
        if(n <= m) {
          last_value = BFALSE;
          return;
        }
        n = m;
        break;
      }
      case t_num_float:
        last_value = full_num_gtp_floats(n, c);
        break;
      default:
        abort();
      }
    }
    break;
  }
  case t_num_float: {
    last_value = full_num_gtp_floats(any2float(first_n), fdr(args[0]));
    break;
  }
  default:
    abort();
  }
}

The same thing should duplicated for >,<,>=,<=. We can create a preprocessor macro for these, but that's ain't a beauty too...

Alternatively, we can choose to ignore the optimization on 0, 1 and >2 arguments and just implement it in prelude.bn:

(defsub (gtp . xs)
  ">? for 0, 1 or >2 arguments"
  (with loop (lambda (xs)
               (if (single? xs)
                 #t
                 (if (>? (car xs) (cadr xs))
                   (loop (cdr xs))))))
    (if (nil? xs)
      #t
      (loop xs))))

(of course the name should be _full>? and it should call _fast>? internally.)

wolfgangj commented 8 years ago

The pragmatic answer is that not all operations have to optimized equally. It's not uncommon to do something like (apply + some-list), so it makes sense to optimize +. I think it's pretty rare to use >? with long lists. Sometimes it's convenient to use it with 3 arguments to check if some value is in a certain range, but 3 elements is not a long list. :) So if it gets too nasty, we don't have to do the fastest possible solution.

I agree that solving it with a preprocessor macro is not the best thing to do, because the uglyness makes people (including me) skip over the code most of the time instead of trying to understand it. Only code that with absolute certainty has no bugs is allowed to make people skip over it. ;)

Now choose whatever you think is most reasonable. :)

The docstring ">? for 0, 1 or >2 arguments" is actually incorrect: It's also for 2 arguments, e.g. I might do (apply >? xs) where xs happens to be (1 2). The 2-argument optimization is only used when >? appears in the first element of an evaluated list because that's the only place the macro expansion is looking at.

(Edit: removed mistaken comment about full_num_gtp_floats)

dubek commented 8 years ago

I chose to implement _full=, _full> etc in bone. Cut a lot of C code, probably lost some performance but these are rarely used as we said before.

wolfgangj commented 8 years ago

Great, I will merge this now. Just two minor things:

Edit: Removed another somment where I simply overlooked something.