opal / opal

Ruby ♥︎ JavaScript
https://opalrb.com
MIT License
4.84k stars 330 forks source link

`method_missing` Doesn't Work #162

Closed Ajedi32 closed 11 years ago

Ajedi32 commented 11 years ago

Okay, now this is a really tricky one, i'm not even sure if it's possible to implement in Opal but I'll throw it out there anyway: Opal currently has no support for Ruby's method_missing method:

class Foo
  def method_missing(method)
    puts method
  end
end

bar = Foo.new
bar.no_method

(Try in your Browser)

adambeynon commented 11 years ago

method_missing has slowly been taken out of Opal as speed improvements in certain area has meant it was blocking other things. To support it, generated methods would have to check each methods existence on every method dispatch - this extra overhead is just not acceptable (for me at least).

Im more than happy to add method_missing support back as an opt-in feature, but until we can make use of proxies in javascript (a fair number of years away), im afraid we will be stuck with slow method_missing.

meh commented 11 years ago

I'd personally prefer having it even if it slows down method dispatching.

Ajedi32 commented 11 years ago

Agreed. At the very least it should be an optional feature. I'd love for Opal to get to the point where you can use it to compile just about any Ruby application into JavaScript without having to worry too much about Opal's limitations.

adambeynon commented 11 years ago

The other thing to consider would be the ease of debugging. The 'easiest' implementation would be inline method checking. Say compiling this method call:

object.foo 1, 2, 3

That currently compiles into:

object.$foo(1, 2, 3)

Nice, clean and simple... and debugging through code like that is really easy. To support method_missing, it would however compile into something more like:

(object.$foo || object.$method_missing).call(object, 1, 2, 3)

Or something like that anyway. It ends up quite ugly.

An alternative, would be to use method stubs (which is what Opal originally did). When you compile a file, opal's parser maintains a unique list of all methods called in that file. Opal then knows every potential method call, whether it is actually defined, or whether it would go through method missing.

Before your code gets run, opal can add every one of these methods onto the root Object class as a stub, which just passes those arguments onto method_missing. Therefore, if a method foo gets called, if that object has actually defined foo, then the method will get called as expected. If not, this stuf method will get called, which then falls back on method_missing.

This lets us keep the nicer code generation that Opal currently has, and just adds stubs like the following to your file:

RubyObject_prototype.$foo = function() {
  return this.$method_missing.call(this, 'foo', args...);
};

This could also be generated into a separate file "my_lib.method_missing.js", so if you dont want these extra method definitions (which wouldn't add too much overhead), simply just do not include it in your built app. If you want fast method_missing, then include this file.

This approach adds a very small overhead when loading the app, but then when running, method dispatches have no overhead at all.

A proof of concept might help too...

Ajedi32 commented 11 years ago

For the first option, I think a much cleaner syntax would be something like:

object._call_method('foo', 1, 2, 3)

And then _call_method could be defined like this:

_call_method = function() {
(this[('$' + arguments.shift())] || this.$method_missing).call(this, arguments)
}

I do like the potential speed boost you could get from the second option though (even though I'm not sure whether it's significant enough to be a major plus), and doing it that way you wouldn't have to change the method calling syntax at all, which is nice.

elia commented 11 years ago

With the stubs you can't catch dynamic calls (e.g. send("find_by_#{name}")) this alone seems enough to avoid it

What @Ajedi32 suggests is basically using #send all the time, surely it's slower but maybe it can work as an opt-in

adambeynon commented 11 years ago

Im not sure I see the problem with the dynamic calls part. But I'm happy to use the suggested approach above as an opt-in system. My big concern is that even for methods that do exist, your adding 2 extra function calls, a logic operator and a string concatenation. That is a lot of overhead for methods that do exist.

elia commented 11 years ago

About the dynamic calls what I meant was that you can't stub a method of which you don't know the name (for example if it is called with #send) of course you can build in send the method missing logic.

Maybe I'm missing something…

adambeynon commented 11 years ago

Ah I see, sorry. Yeah, #send will need the same logic.

mrbrdo commented 11 years ago

I think you are missing the point with performance. If I want performance I will use coffeescript (or just pure javascript) in any case. If I want real Ruby I will use Opal. But right now it's not -_- I really don't care how slow method dispatching would become, it's not like I am doing fibonacci. Not to mention how easy it would be to make nice proxy wrappers to native JS stuff with method_missing, and we can avoid the whole opal-jquery project all together (or reduce it to a couple of lines at least). For example for me opal-jquery is not very useful, because I already know all the method names jquery uses and the changes in opal-jquery just confuse me, besides it does not implement all the methods and who has time to write wrappers for every plugin anyway. It would be much easier to just have a proxy factory e.g. bridge(jQuery).find("#lala").text.

It's really nice how you were able to implement a very performant implementation, but it means not much if it's not real Ruby, when we already have Coffeescript which supports calling native JS stuff as normal and has probably better performance. So there is no point to worry about performance if you break the language because of it, then there is no point in the project.

adambeynon commented 11 years ago

As for the opal-jquery example, the problem is that some methods would still need wrappers, whether it is to convert passed nil values => null, or ruby hashes => plain javascript objects.

If we went with a full method_missing implementation, it would have to look something like this: https://gist.github.com/4368780.

// opal_send(my_object, 'foo', 1, 2, 3)
function opal_send(recv, method, arg1, arg2, arg3) {
  var method_table, method;

  if (recv == null) {
    // null/undefined cant hold a method table, so dynamically dispatch to it
    method_table = nil_method_table;
  }
  else {
    // opal objects have their method table stored on '$opal'
    method_table = recv.$opal;
  }

  if (!method_table) {
    // if there isnt a method table, then we are sending to a non opal object.
    // We could try and forward this as a normal javascript method call?
    throw new Error("can't send method to native object");
  }

  if (method = method_table[method]) {
    // if our mehtod is defined, call it
    return method(recv, method, arg1, arg2, arg3);
  }
  else {
    // method not defined, so method_missing
    return method_table.method_missing(recv, 'method_missing', method, arg1, arg2, arg3);
  }
}

We could inline it into the generated code, but thats the general idea. By using a dispatch function, there are a couple of if/else operations, a property access, and a function call in overhead.

Of course, opal will gain users if it had extra features like method_missing, but it would also lose some that wanted to develop for mobile apps where the overhead would be obvious. I guess this boils down to which approach Opal should take.

As you say, people who are concerned with performance, and are likely developing for mobile, would probably just use coffeescript anyway.

mrbrdo commented 11 years ago

Well you should think about Ruby, it never would become popular if it was optimized for performance. You are taking this language, that is optimized for developer happiness, and you are trying to optimize it for performance. I think it's a bad idea. Slight modifications, like not allowing overriding operators on Numeric, that can provide a significant performance increase and not really affect anyone, those make sense, and they do this in some interpreters, but I think you went too far. It would kinda make sense if there was no coffeescript, but I think coffeescript has/had a very similar idea like you described. And I think they realized that it doesn't make sense to break ruby so it would be faster on JS VM, that's why they did what they did. Since even the syntax is different (not so much, but enough), when you are going back and forth between ruby and coffeescript it's easier to keep track of the different semantics of both languages. I think in the majority of cases the performance is not that important to justify breaking the language. And when it is, you can always use javascript or coffeescript. You also have to think about how many method calls does a typical script really make, and if the performance deficit would really be noticeable. I can tell you a few things I noticed with opal that are really bothering me, and to be honest I will have to switch back to coffeescript until this is the state. But I really like the idea of using ruby on the client side, just not like this.

I think right now opal is in the niche that coffeescript already dominates (and is more adapted for it), and I think there is a lot more space for this to be useful if you support the full ruby spec, for a scripting language in browsers. For people who write interpreters/compilers it is easy to get lost in performance optimization. I think this was the true genious of Matz, to not care about that and to just write a language he would love to use, no matter how fast or slow it would be. This is my opinion, I understand that not everyone will agree. I definitely support what you are doing though, it's very cool what you were able to achieve.

mrbrdo commented 11 years ago

Btw could you go with something like this for method_missing? Seems like performance wouldn't suffer too much..

// method_missing handler implementation, returns function
window.$method_missing_handler = function (obj, method_name) {
  return function() {
    // use name + the arguments *magic* variable
    // call obj.$method_missing if user provided it
    // call methods something like obj['$' + method_name](args)
    // lookup class hierarchy if necessary
  }
}

// method call
(this.$methodname || $method_missing_handler(this, 'methodname'))(args);

It's only one if for each method call which shouldn't be that bad. Then you can also implement method_missing on native javascript objects to magically proxy calls to native functions (with argument translation from ruby to js), or you can make this functionality as a separate library. And any js object that is not a special type (array, string etc), just translate it to a ruby hash.

PS: for testing how much ruby you really support you could use for example the tests from mruby which are pretty simple https://github.com/mruby/mruby/tree/master/test/t

adambeynon commented 11 years ago

That is something very similar to what opal used before. The only downside of the technique was that when using a function call like that, the method will not get the right this value, so I was having to use call to fix it:

(this.$methodname || $method_missing_handler(this, 'methodname')).call(this, args);

I've dug up an old jsperf test to show what I think are the two best possibilities http://jsperf.com/opal-method-calls-vs-no-method-missing/3.

The gist of it is to use something like:

(recv.$method_name || $mm('method_name'))(recv, arg1, arg2)

The main change would be that the self value is always passed to the method, instead of the method using this. This works nicely until it comes to blocks, which would also need to take the self value as its first argument which makes accessing blocks from JS would require some sort of wrapper. Not really the end of the world, but it is an annoyance.

Another slight possibility is going back to using some sort of method table, where all methods are stored on a single .$m property. This means that making any js object into a ruby object, simply involves setting the .$m property. For example:

// javascript
window.$m = Opal.WindowClass.method_table;

It also means we can properly use subclasses of Array, which isnt currently possible, and making Ruby wrappers around classes can be made easier as we wont need to keep allocating wrappers... just check if its already wrapped with .$m. If not, wrap it. Some Native ruby class might be useful here too.

mrbrdo commented 11 years ago

Hm, are you sure the $method_missing won't get the right this value? The $method_missing_handler in my example calls obj.$method_missing so it seems to me that the $method_missing function itself should get the right this value (=obj)? I'm not an expert in JS so I may be wrong. But even if the value of this would be wrong, you can always call obj.$method_missing.call(obj, args) in the handler so you still don't need to use .call for calling "normal" methods. Maybe you missed the fact that the handler I gave as example returns a closure, so this closure has access to params that were given to the handler (namely the target object) + it can be called normally with same args as a normal function.

But anyway, if the worst-case solution you came up with is only 50% slower I wouldn't call that bad at all, after all you have to consider that real-world programs don't only call methods in a loop, so 50% slower method calling is probably barely noticeable. And also I feel anyway that speed is not that important, as long as it is in some reasonable margins.

For blocks (or more general, procs, block is also a proc closure) you don't need to worry here, because if you parse properly the ruby code, you know when you are calling a method or when you are calling a proc. I can speak from experience with mruby interpreter, that it has a totally separate opcode to call a proc/block. So you can special case for proc and just call it with .call(context...). I don't know if JS closures "save" the value of this in the same way as ruby does, as I am not an expert with JS. If I were to give an analogy from mruby to what you have, you would have a Proc class that has a call method. So when you do some_proc.call(args) in ruby code it would look something like this (or you can do it directly if you don't allow user to redefine Proc#call)

    (some_proc.$call || $mm...)(args)

    Proc.prototype.$call = function () {
      var self = // properly determine... in mruby it's stored in the Proc instance, e.g. this.prev_self (in mruby prev_self = proc->env->stack[0]), which is set when the proc is created from the proc that created it
      this.proc_func.apply(self, arguments)
    }

Like I said if you have interest to support the full ruby spec (which I think would be very smart), you could use tests from mruby (or from rubyspec, but I think it would be much easier with the ones from mruby, they are more simple). The source of mruby is also very readable compared to ruby mri. When looking at OP_LAMBDA you can see how all procs are created. I can tell you that all methods in mruby are procs (not closures), but I think this is not necessary to conform to the specs. And with lambda you have these options:

The return from block is not so trivial to implement if you need any help with that some time let me know, I did it already for my mruby to c parser.

adambeynon commented 11 years ago

The this problem was when the method existed. It comes down to this problem:

a = {};
a.foo = function() { alert(this); };

// this works as expected
(a.foo)()
// => a

// but having an expression as a receiver breaks the call
(a.foo || method_missing('foo'))()
// => window

But, either way, the main benefit of using this was that native methods could easily be aliased. Its not a huge problem either way.

mrbrdo commented 11 years ago

Ah I see, that's quirky. Well I don't see any obvious reason not to use this though, since it maps nicely to ruby (or so it seems). But I guess you would have to use .call then, or use an if (a.foo) .. else missing. I think if/else should be as fast as ||, but without that quirk.

adambeynon commented 11 years ago

So, I have been looking through this again, and the best way I can see to implement method_missing is by using the method stubs. It lets us keep using real this values for self, calling ruby methods from js is really easy and works exactly how you would expect, it makes debugging simple, and the generated code is easy to read through.

Calling methods has no overhead at all (a method call is compiled into 1 function call), so the only overhead is the startup time as it has to define all the method stubs. Initially, this will be done per file (stubs will be generated at the bottom of each file), but in the long run the plan would be that for production builds, you run all your code through the compiler so it can generate method stubs just once for the project, which will remove any duplicate stubs.

Any thoughts/oppositions/problems to this? If there aren't any problems, it should be very quick to implement as Opal did have this technique before.

mrbrdo commented 11 years ago

It sounds like that wouldn't work with send. I'm kind of in favor of supporting the full ruby spec, at least eventually... To me it's more useful to have real ruby and to call JS methods easily as well, than performance. Since if you don't support the ruby spec then you might as well call it something else since it's not really ruby.

adambeynon commented 11 years ago

Im not sure I follow how it wouldn't work with send? The logic in send just falls back to method_missing if the required method doesnt exist:

module Kernel
  def send(name, *args)
    %x{
      if (self[name]) {
        return self[name].apply(self, args);
      }

      return self.method_missing.apply(self, [name].concat(args));
    }
  end
end

Or something like that...

mrbrdo commented 11 years ago

Hm I'm not sure what I meant actually. I was thinking about methods with dynamically generated name. I didn't 100% understand how your stubs would work. Did you mean you would determine at basically compile time which methods will be called "statically" (without send, so just someobj.somemethod) and make stubs for it? The problem is you cannot know which object/class it will be called on during compile-time, and it can also change during runtime. So if you had 100 classes with 10 methods each you would have to make 1000 stubs for each object. E.g. if you have some local var "obj" you cannot know at compile time which class it will be (Array, String, Whatever...). If that's not what you meant please explain the stubs a little.

mrbrdo commented 11 years ago

Btw if that is what you meant, you might get away with (my javascript knowledge is not good enough to know if this is possible to do) defining all the possible methods on some base JS object that all ruby objects would be based on. And in all of them just raise no method error or something. then in actual ruby class objs override those with implementations (for those methods that this class actually does have). I'm really not sure if this would work though. But if that means that each of those objects will "copy" all the stubs from the base object (instead of delegate to it) then I think it can be a problem memory-wise.

adambeynon commented 11 years ago

Yeah, exactly that. The compiler keeps a list of every method call in your app, and adds a method stub onto the root prototype, which all ruby objects inherit from.

adambeynon commented 11 years ago

The only thing left to work out, is a good way to have some "production build" feature, which builds the opal runtime, all gem dependencies and your app code in one go, to stop duplication of these method stubs. But that is a minor detail that can be worked out later on.

mrbrdo commented 11 years ago

Yes, this sounds like a reasonable solution to me. Do you know how it works internally in e.g. V8? Will it have pointers for each object to all of those methods, or is it more similar to how it is in ruby MRI (some "inheritance" tree that is traversed). Even pointers can be considerable if you have e.g. 10000 method names, that would be probably 40 or 80 KB per object. Also it might be good to benchmark first if the number of methods has any effect on speed of javascript method dispatching. E.g. does obj.method call take the same amount of time if method is the only method or if there is 10k methods.

You will have to implement define_method also to add the dynamically defined method to the base object. If you would actually use define_method for normal method definitions also that should make it so you don't have to "pre-seed" the base object at the start of the script. I think adding the method stub to the base object only when it is actually defined (runtime moment when you .prototype.method = ...) actually makes a lot of sense.

mrbrdo commented 11 years ago

PS: I just looked at the number of classes and unique method names (public, protected, or private):

If I just open irb: number of classes: 319 number of unique method names: 376

Small to mid sized Rails project: number of classes: cca. 2500 number of unique method names: cca. 2500

So when it's not a problem having around that many unique method names then I think it's a viable solution. If each class has pointers on 64-bit system: 2500 * 2500 * 8B = 50MB Just to keep in perspective. Or in irb case: 319 * 376 * 8B = 960 KB

And an average class in rails case has 294 methods (kinda crazy hehe, but this is also inherited methods).

adambeynon commented 11 years ago

As an update of sorts, the stub methods become awkward as each file is getting compiled individually. Realistically, the best approach to supporting method_missing will be to have inline checks:

(recv.$foo || recv.$method_missing).call(recv, arg1, arg2, arg3)

Or something like that anyway. Obviously, not all files/libs will need method_missing support, so we can have an opt-in or opt-out trigger, either per file, or per project. A simple setting like the following would be enough:

Opal::Parser.support_method_missing = true/false

While I appreciate that opal isn't really ruby without method_missing, I think this will be an opt-in feature.

mrbrdo commented 11 years ago

I don't know... I've been using a lot of coffeescript since, and I don't see a place for a language that doesn't work like Ruby out of the box (but is on the other hand trying to be Ruby), since then it's easier for me to just use coffeescript instead of worrying about all the differences/incompatibilities. It's fun to play with but I don't think it will ever be widely adopted in that state. That's just my opinion and I still value the effort.

elia commented 11 years ago

:+1: for inline checks!

easy to understand, optional, and ready for source maps :)

mrbrdo commented 11 years ago

By the way the implementation you showed looks good to me otherwise, I would just prefer for Opal to work exactly like Ruby out of the box, and turning method missing on would be an opt-out feature instead of opt-in. Sensible defaults are important (Rails is a good example of this). I think most of the time people won't obsess about performance too much, but you can always do benchmarks with the "optimized" mode. Something like # opal: method_missing=off at the top of the file would be nice imo (same way as the encoding comment in MRI).

adambeynon commented 11 years ago

I like the suggestion of the comment for turning on/off. We had a similar idea before to turn math operators on/off, etc. Well, here is a commit to master which supports method_missing: e143249. With it turned on, all specs pass. There is a little bit of work to do in the runtime to get it working, but the basics are there for dispatching.

mrbrdo commented 11 years ago

Great! By the way, which specs are you running against?

adambeynon commented 11 years ago

The specs in the spec/ folder. They are taken from the rubyspec project. There are about 3000 assertions (should ==), in 1220 example groups. The plan is to stop cherry picking specs from rubyspec, and just run against rubyspec directly. Hopefully...

mrbrdo commented 11 years ago

Wow that's great! How much is missing before you can pass all the rubyspec (I'm assuming rubyspec doesn't include stuff from stdlib like File and Dir)?

adambeynon commented 11 years ago

rubyspec covers everything in corelib and stdlib, which is why opal doesn't try and use it directly. There is a load of stuff missing from stdlib. The most complete section are the Array and Hash classes. The number of missing/incomplete methods is very small.

The other problem with rubyspec is that it runs using mspec, which has 1 or 2 things not currently possible with opal, such as mutable strings and relying on a lot of stdlib stuff which opal doesn't implement yet.

Alternatively, it might be worth seeing what mruby uses as a test suite, as that is probably closer to what opal would be able to implement.

mrbrdo commented 11 years ago

mruby has its own specs in the mruby repo (there are a lot of comments about ISO in them if that tells you anything), they are written in ruby (well, mruby) - makes sense since you know how it is with Matz and Rubyspec... The specs are pretty light-weight at the moment (but basic functionality is tested). There are however some implementation specific specs, although I'm not sure to what extent (I remember there were a few specs labeled as bug #...). There is only around 500 examples though in all the specs combined. So compared to rubyspec (at the moment) it is much less thorough.

Ajedi32 commented 11 years ago

I'd like to point out that enabling method_missing probably isn't a huge performance killer. Unless you're trying to do something that's very JS intensive you probably won't be able to tell the difference. And, if you are doing that's very performance sensitive I think you probably wouldn't want to use Opal in the first place.

Personally, I'd prefer that Opal support the full Ruby spec by default, and then let people disable features like method_missing if they're really concerned about performance. Your call though.

adambeynon commented 11 years ago

As a very unscientific measure, the average of running the specs without method missing was 0.598 seconds on my machine. With method missing on, it went up to 0.623 seconds on average (20 runs in both modes). Of course there are lots of variables that can change the runnning time, but those benchmarks dont look too bad.

elia commented 11 years ago

Agree with @Ajedi32

I'd probably open an x-string if I need serious optimization (e.g. on scroll/resize events). Also I think stuff like File has a chance of being implemented in the future (e.g. on Node.js or via some html5 storage). I prefer rubyspec.

meh commented 11 years ago

As I already said at the beginning of this issue, compliancy > performance.

mrbrdo commented 11 years ago

Yeah I've also been saying all the time, it's more important to fully pass ruby specs as opposed to performance. You can have performance optimizations that break the language but they should be off by default. (sidenote: I personally would never use those optimizations at all)

adambeynon commented 11 years ago

Okay, just pushed two commits: 70d5613ea2 and 2e7d1a1089. These now enable opal to support method missing on all objects, classes, etc etc. method_missing is now turned on by default. The plan will be to allow files to enable/disable method_missing if they require, but, for now there is a global on/off switch on Opal::Processor.method_missing_enabled. This allows the easiest integration with the rails asset pipeline, or any standalone opal application. So, to turn off method_missing, just add this somewhere in an initializer:

Opal::Processor.method_missing_enabled = false

Next step: source maps. Which will make debugging opal code a little easier.

mrbrdo commented 11 years ago

Nice. I really feel that was the right decision. Good work.

meh commented 11 years ago

:+1:

adambeynon commented 11 years ago

Well, now that this has been implemented, time to close. opalrb.org doesn't have it yet (for the try now feature), but I will update it shortly.