svaarala / duktape

Duktape - embeddable Javascript engine with a focus on portability and compact footprint
MIT License
5.89k stars 514 forks source link

Recursive Fibonacci extremely slow on TI-Nspire #183

Closed Legimet closed 7 years ago

Legimet commented 9 years ago

The recursive Fibonacci algorithm is very slow on the Nspire. I am using Duktape 1.2.1, and compiling with -O3.

Here is a minimal working example:

#include "duktape.h"

int main(void) {
  duk_context *ctx = duk_create_heap_default();
  duk_eval_string(ctx, "function f(n){if (n<=1) return n; else return f(n-1)+f(n-2)}print(f(30));");
  duk_destroy_heap(ctx);
  return 0;
}

Computing f(30) takes about 57 seconds. For comparison, here is a graph of computation times for various other languages available on this calculator: https://i.imgur.com/fBPlXr1.png. (We are using recursive Fibonacci as a benchmark.)

svaarala commented 9 years ago

It's a known issue and can be seen here too:

There are several reasons for this. For example, self reference (function fib() { ... } refers to fib) causes an explicit scope object to be created, and returning involves a longjmp. These weren't yet addressed in #138.

Legimet commented 9 years ago

Thanks. In that case, I won't worry about it now.

svaarala commented 8 years ago

Call handling optimization will happen in 1.4 so moving there.

michael-dwan commented 8 years ago

Does this apply to performance of simple polymorphic classes, as well? I've found the performance of entity lists (in the context of games, or game like applications,) to be a bit on the slow side. I have a pretty dumb example below, generated from CoffeeScript (using its inheritance mechanism). Basically, it does some work up front to create a list of entities, and then those entities are looped over having methods and overloaded methods called. My testing is actually on Android, so, for instance, that run() function is called from Java via JNI, a hundred times or so, after the script is compiled and init has been called. Just the run is being timed, though. Apologies for not having a fully working test case, the Android JNI makes it hairy to package that up for viewing. I'll trying get a simple C case if you like, that can run on desktop.

Sorry for the code dump, just super curious if this is related. Definitely want to use DukTape for this project (its been really fantastic, by the way,) but the performance here might be slightly limiting

// Generated by CoffeeScript 1.9.1
var Entities, Entity, Rabbit, Skunk, Spook, _create_main_object, init, update,
  extend = function(child, parent) { for (var key in parent) { if (hasProp.call(parent, key)) child[key] = parent[key]; } function ctor() { this.constructor = child; } ctor.prototype = parent.prototype; child.prototype = new ctor(); child.__super__ = parent.prototype; return child; },
  hasProp = {}.hasOwnProperty;

Entity = (function() {
  function Entity() {
    this.init_object();
    this.init();
  }

  Entity.prototype.act = function() {};

  Entity.prototype.ax = function() {
    return 0.0;
  };

  Entity.prototype.ay = function() {
    return 0.0;
  };

  Entity.prototype.update = function() {
    this.x += this.dx;
    this.y += this.dy;
    this.dx += this.ax();
    return this.dy += this.ay();
  };

  Entity.prototype.init_object = function() {
    this.x = 0.0;
    this.y = 0.0;
    this.dx = 1.0;
    this.dy = 1.0;
    return this;
  };

  Entity.prototype.init = function() {};

  return Entity;

})();

Entities = (function() {
  function Entities() {
    this.init_object();
    this.init();
  }

  Entities.prototype.init = function() {
    var k, results;
    results = [];
    for (k = 1; k <= 10000; k += 1) {
      switch (Math.random()*3 | 0) {
        case 0:
          results.push(this.entities.push(new Rabbit()));
          break;
        case 1:
          results.push(this.entities.push(new Skunk()));
          break;
        default:
          results.push(this.entities.push(new Spook()));
      }
    }
    return results;
  };

  Entities.prototype.update = function() {
    var k, len1, ref;
    ref = this.entities;
    for (k = 0, len1 = ref.length; k < len1; k++) {
      ref[k].update();
    }
  };

  Entities.prototype.init_object = function() {
    this.entities = [];
    return this;
  };

  return Entities;

})();

Rabbit = (function(superClass) {
  extend(Rabbit, superClass);

  function Rabbit() {
    this.init_object();
    this.init();
  }

  Rabbit.prototype.act = function() {
    return ++this.hops;
  };

  Rabbit.prototype.ax = function() {
    return 3.0;
  };

  Rabbit.prototype.ay = function() {
    return 3.0;
  };

  Rabbit.prototype.init_object = function() {
    Rabbit.__super__.init_object.call(this);
    this.hops = 0;
    return this;
  };

  return Rabbit;

})(Entity);

Skunk = (function(superClass) {
  extend(Skunk, superClass);

  function Skunk() {
    this.init_object();
    this.init();
  }

  Skunk.prototype.act = function() {
    this.scent += 0.03;
    if (this.scent > 1.0) {
      return this.scent = 1.0;
    }
  };

  Skunk.prototype.ax = function() {
    return 1.0;
  };

  Skunk.prototype.ay = function() {
    return 1.0;
  };

  Skunk.prototype.init_object = function() {
    Skunk.__super__.init_object.call(this);
    this.scent = 0.0;
    return this;
  };

  return Skunk;

})(Entity);

Spook = (function(superClass) {
  extend(Spook, superClass);

  function Spook() {
    this.init_object();
    this.init();
  }

  Spook.prototype.act = function() {
    if (this.psychokinetic_energy) {
      return --this.psychokinetic_energy;
    }
  };

  Spook.prototype.ax = function() {
    return 2.0;
  };

  Spook.prototype.ay = function() {
    return 2.0;
  };

  Spook.prototype.init_object = function() {
    Spook.__super__.init_object.call(this);
    this.psychokinetic_energy = 100;
    return this;
  };

  return Spook;

})(Entity);

_create_main_object = function() {
  this._main_object = new Entities();
  return this._main_object;
};

init = function() {
  return _create_main_object();
};

run = function() {
  return _main_object.update();
};
svaarala commented 8 years ago

Call handling in general is a bit slow and there hasn't yet been an effort to improve it so it affects pretty much all calls. There should be low hanging fruits there though.

svaarala commented 8 years ago

There are some call handling improvements in 1.4.0, but I didn't have time to implement all planned improvements for 1.4.0 so call performance work continues in 1.5.0.

The Fibonacci example above is bottlenecked by a slow path GETVAR lookup for the function self reference, see #401. Concretely:

// test1.js - takes about 3.0 sec in master (3.8 sec in 1.3.0)
function f(n){if (n<=1) return n; else return f(n-1)+f(n-2)};print(f(33));

By using a function expression the self reference is faster:

// test2.js - takes about 1.9 sec in master (2.3 sec in 1.3.0)
var f = function f(n){if (n<=1) return n; else return f(n-1)+f(n-2)};print(f(33));
fatcerberus commented 8 years ago

Since the bottleneck is the GETVAR, could you improve the performance of the example by wrapping the whole thing in a self-executing function and binding f to a register?

svaarala commented 8 years ago

Yes, performance is better if you avoid the GETVAR. But it's still important for the external function declaration to work well, because users probably expect that :)

Duktape's current scope handling is pretty simplistic: it uses ordinary objects to keep variables, with the internal prototype chain providing the scope nesting reference. Optimizing property performance will improve the scope performance too. Much better gains would be possible by making scopes separate internal objects, with accessors optimized to work with them. The downside of this is a larger code footprint and a lot of trivial changes here and there (e.g. in garbage collection), so I'm not yet sure which path is best. I'd at least want to take a couple of stabs at the property performance before making such a change.

svaarala commented 7 years ago

Closing this issue because there's no specific change to be made - call handling performance is monitored in master and, like other performance stuff, it's being improved slowly over time.