Flotype / now

NowJS makes it easy to build real-time web apps using JavaScript
http://www.nowjs.com
MIT License
1.91k stars 175 forks source link

Functions cannot be embedded in objects when sent from the client #71

Closed balupton closed 13 years ago

balupton commented 13 years ago

When a function is sent inside an object as an argument to the client it gets stripped away.

// Client
window.now.hiFromClient({
    a: 1,
    b: 'b',
    c: null,
    d: false,
    e: true,
    f: function(){},
    g: {
        a: 1,
        b: 'b',
        c: null,
        d: false,
        e: true,
        f: function(){},
        g: {
            a: 1,
            b: 'b',
            c: null,
            d: false,
            e: true,
            f: function(){}
        }
    }
});

// Server
everyone.now.hiFromClient = function(options){
    console.log(options);
}

// Log
{ a: 1,
  b: 'b',
  c: null,
  d: false,
  e: true,
  g: 
   { a: 1,
     b: 'b',
     c: null,
     d: false,
     e: true,
     g: { a: 1, b: 'b', c: null, d: false, e: true } } }

A use case is when clientWrite is false, it would be nice to send a object to the server that contains anything we would like to bind.

dvv commented 13 years ago

To transfer an object over the wire, one must first serialize the native object. If JSON is used, functions are not serialized. Recall that functions in JS are closures, so even if you transfer them as source text, they won't be very useful at the remote end.

ericz commented 13 years ago

@dvv For top-level parameters (not functions nested in an object) we serialize the functions such that they can be parsed and called from the other side as a remoteCall, this is how we support callbacks. The issue is that nested functions are not supported.

balupton commented 13 years ago

@ericz so you could do the same thing for nested functions right?

dvv commented 13 years ago

If it is decided to handle transfer of deep function: just FYI, in V8 typeof /iamregexp/ === 'function', so additional precausion should be taken when determining whether a value is truly callable.

ericz commented 13 years ago

@dvv wow, I had no idea. I'll def look out

balupton commented 13 years ago

You can use /a/ instanceof RegExp, though I'd imagine regex's would get stripped out too.... As JSON.stringify({a:/a/}) returns {}

tommedema commented 13 years ago

Sounds like deep functions will have to be handled just like functions passed as parameters (ie. callbacks).

dvv commented 13 years ago

Wonder in this case how would GC behave? What should it do with deep functions converted to callable contexts when the call is finished? Wouldn't it be a easy way to DOS the server? I base my opinion on now.js manual where they strongly emphasize that we should avoid where possible the increase of depth of hierarchy of objects we sync.

tommedema commented 13 years ago

@dvv, the function passed is merely a reference so that now.js knows what function to call on the client side. So, as far as I understand, it is nothing different from eg. a string that was passed and it's size is independent from the logic/size of the actual function on the other side. This would mean that there is no special situation here regarding the GC.

The avoidance of deep objects is mostly because of variable syncing, I believe. When an object is deep, it has to be traversed in order to check whether the variable needs to be synced or not.

Anyway, one should refrain from resending complete states to the other side when only a certain element has changed. A more sophisticated model framework that only broadcasts necessary changes should be used, such as backbone.

dvv commented 13 years ago

"...function passed is merely a reference...": Well, I mean the case of anonymous functions which get recreated upon each call in the example code of the post. Where should go "old" references of which now.js knew during the previous call? IOW, wouldn't these reference just orphaned in the memory with no chance for GC to reap them? Sorry for probably falsy apriori considerations...

tommedema commented 13 years ago

@dvv, I'm not sure how anonymous functions are handled. You may very well be right, guess we have to wait for Eric's or Sridatta's comment.

Slightly related is the issue that clients can cause excessive memory usage on the server side by passing in huge objects as arguments (I created an issue about this two weeks ago or so).

ericz commented 13 years ago

Yep this is a problem we have with callbacks. Since we have to keep a reference to them they never get GC'ed. Unfortunately there is no easy way around this because we can only assume that the other side may or may not want to call the callback at anytime.

If we implement support for deep functions in parameters to remoteCalls, we would simply store a reference to the function in the same hash as the closures on the origin, and then replace the function with a wrapper function that makes a remoteCall on the destination side.

The design wouldn't actually cause the entire object containing the function to be stored, just a reference to the function, so the object can get GC'ed though the same problem with the function not being GC'ed exists

tommedema commented 13 years ago

@ericz, why not limit the life-time of a passed callback to the life-time of the function to which it was passed?

If one has a system where he needs to store client-side callbacks for later use, he should really rethink his strategy. Such behavior can easily be done by making the server call an actual client-side now.js function.

Edit: now that I think of it, the life-time of a function has no meaning since many operations are async.

To be honest, any objective accomplished by passing a callback can also be accomplished by calling a function on the remote end with the required information.

Keeping them always in memory can actually be a huge problem... allowCallbacks boolean anyone?

Looks like another case where simplicity simply cannot outrule security, especially if you are going to target for corporations.

dvv commented 13 years ago

I suspected that. Given closures can hold rather big amount of memory and inaccurate use of them can choke now.js, I personally wouldn't allow functions in arbitrary objects put on wire.

Expecting to be understood right, --Vladimir

dvv commented 13 years ago

@tommedema: "just calling a function on the remote end" would require to synchronize the map of calls to answers. How do you know that the remote end has changed your state as a result of call you just made. Imagine UDP vs. TCP, JSON-RPC with id key and alike.

balupton commented 13 years ago

Forgive me if this is uninformed, but I'm confused about why there are security and memory issues.

In regards to memory issues, why would it have a risk of DOS the server? I don't understand how with now.js and support for nesting makes DOS'ing any easier compared to say a lot of AJAX requests with nested data or simply any other type of DOS'ing attack.

In regards to security issues, why would there be an issue with allowing callbacks specified by the client to be triggered by the server side and run on the client side? The security issue would be if that callback is run on the server side, which is isn't.

On another note, wouldn't it be as simple as something like this:

var secureCallbacks = function(obj) {
    // Check
    if ( typeof item !== 'object' ) {
        return item;
    }

    // Cycle
    for ( var key in obj ) {
        // Check
        if ( obj.hasOwnProperty(key) ) {
            return true;
        }

        // Prepare
        var item = obj[key];

        // Type
        switch ( typeof item ) {
            case 'object':
                secureCallbacks(item);
                break

            case 'function':
                // Function
                if ( !(item instanceof RegExp) ) {
                    obj[key] = function(){
                        // Server Side
                        if ( now.isServerSide() ) {
                            // Security Breach
                            throw Error('Client Side created functions are not allowed to run on the Server Side');
                        }
                        // Client Side
                        else {
                            // Trigger Original Function
                            item.apply(obj,Array.prototype.slice.call(arguments));
                        }
                    };
                }
                break;

            default:
                break;
        }
    }
}

Then run that through your serialiser, then on server-side run it through your deserialiser.

dvv commented 13 years ago

A brief analysis reveals the code trapped into the famous JS gotchas: i) typeof null === 'object', ii) obj.hasOwnProperty can be missing/overloaded, iii) obj can be sealed/frozen

dvv commented 13 years ago

I suppose a bunch of AJAX/vanilla requests is of lower risk, since they are short-living and properly coded server house-keeps after each of them (plus one can throttle the rate by well-known means). WebSocket connection, once forged or blocked, holds the server memory forever, I guess.

tommedema commented 13 years ago

@dvv,

For example, if a client calls registerNewAccount(name) on the server (an idiotic example but hey) and it'd have given a callback which is then called whenever the registration process finished, eg: callback(false, "Invalid name."); in case it was false. Instead of using that callback, the client could create a method: now.onRegistrationComplete(success, err) so that the server can simply call this method and does not need to know of any callbacks.

@balupton, the memory issue is what I mean with security vulnerability. It is different from normal AJAX calls in that now.js actually stores things in memory, which is entirely in control of the client. Thus, a client can cause an excessive amount of data to be stored in memory. This is a vulnerability.

balupton commented 13 years ago

All in all, this should be it, serialise, secure and deserialise: https://gist.github.com/940934

dvv commented 13 years ago

@tommedema: i hear you. Now let's image the client calls registerNewAccount(name) ten times with a high rate. How do you rock-solid bind the calls to client's now.onRegistrationComplete(success, err) to each of the initiated requests?

dvv commented 13 years ago

@balupton: mind to share a gist, so that we can discuss particular line numbers?

tommedema commented 13 years ago

The actual function never has to be serialized since it can only be called on the same side at which it is created. A mere reference has to be serialized.

But since one cannot know when the callback has to be called I do not think this should be a feature at all.

@dvv, I guess you'd have to add an identifier, eg. now.onRegistrationComplete(name, success, err)

balupton commented 13 years ago

Gisted :) https://gist.github.com/940934

dvv commented 13 years ago

L15 -- do you mean item? if you mean obj, obj === null is still allowed

dvv commented 13 years ago

L29 -- I'm sure you are aware and just forgot that dots is pretty valid character to have in hash keys ;)

balupton commented 13 years ago

How about we move to #nowjs on freenode :) It's becoming hard to track all the response :S

balupton commented 13 years ago

Alright.

So I've updated https://gist.github.com/940934

It supports:

So really it solves the nested functions/regexs issue, and the garbage collection issue.

Any objections?

tommedema commented 13 years ago

Apparently socket.io now supports these callback features as well: http://socket.io/index.html#announcement

How does socket.io handle the GC problem?

dvv commented 13 years ago

a hash of callbacks purged after callback has fired. no timeouts for dead callbacks so far

tommedema commented 13 years ago

@dvv, thanks, I brought this up to their ML.

dvv commented 13 years ago

rechecked and saw at server there's no purging so far. we'd like the code to settle down a little

tommedema commented 13 years ago

@dvv, what do you mean exactly? Callbacks are never GCed?

ericz commented 13 years ago

@tommedema @dvv @balupton

In 0.7 callbacks are GC'ed after a user configurable timeout (default is 30 seconds).

We decided not to enable nested functions because of the performance impact an the rare use case.