Closed balupton closed 13 years ago
To transfer an object over the wire, one must first serialize the native object. If JSON is used, functions are not serialized. Recall that functions in JS are closures, so even if you transfer them as source text, they won't be very useful at the remote end.
@dvv For top-level parameters (not functions nested in an object) we serialize the functions such that they can be parsed and called from the other side as a remoteCall, this is how we support callbacks. The issue is that nested functions are not supported.
@ericz so you could do the same thing for nested functions right?
If it is decided to handle transfer of deep function: just FYI, in V8 typeof /iamregexp/ === 'function'
, so additional precausion should be taken when determining whether a value is truly callable.
@dvv wow, I had no idea. I'll def look out
You can use /a/ instanceof RegExp
, though I'd imagine regex's would get stripped out too.... As JSON.stringify({a:/a/})
returns {}
Sounds like deep functions will have to be handled just like functions passed as parameters (ie. callbacks).
Wonder in this case how would GC behave? What should it do with deep functions converted to callable contexts when the call is finished? Wouldn't it be a easy way to DOS the server? I base my opinion on now.js manual where they strongly emphasize that we should avoid where possible the increase of depth of hierarchy of objects we sync.
@dvv, the function passed is merely a reference so that now.js knows what function to call on the client side. So, as far as I understand, it is nothing different from eg. a string that was passed and it's size is independent from the logic/size of the actual function on the other side. This would mean that there is no special situation here regarding the GC.
The avoidance of deep objects is mostly because of variable syncing, I believe. When an object is deep, it has to be traversed in order to check whether the variable needs to be synced or not.
Anyway, one should refrain from resending complete states to the other side when only a certain element has changed. A more sophisticated model framework that only broadcasts necessary changes should be used, such as backbone.
"...function passed is merely a reference...": Well, I mean the case of anonymous functions which get recreated upon each call in the example code of the post. Where should go "old" references of which now.js knew during the previous call? IOW, wouldn't these reference just orphaned in the memory with no chance for GC to reap them? Sorry for probably falsy apriori considerations...
@dvv, I'm not sure how anonymous functions are handled. You may very well be right, guess we have to wait for Eric's or Sridatta's comment.
Slightly related is the issue that clients can cause excessive memory usage on the server side by passing in huge objects as arguments (I created an issue about this two weeks ago or so).
Yep this is a problem we have with callbacks. Since we have to keep a reference to them they never get GC'ed. Unfortunately there is no easy way around this because we can only assume that the other side may or may not want to call the callback at anytime.
If we implement support for deep functions in parameters to remoteCalls, we would simply store a reference to the function in the same hash as the closures on the origin, and then replace the function with a wrapper function that makes a remoteCall on the destination side.
The design wouldn't actually cause the entire object containing the function to be stored, just a reference to the function, so the object can get GC'ed though the same problem with the function not being GC'ed exists
@ericz, why not limit the life-time of a passed callback to the life-time of the function to which it was passed?
If one has a system where he needs to store client-side callbacks for later use, he should really rethink his strategy. Such behavior can easily be done by making the server call an actual client-side now.js function.
Edit: now that I think of it, the life-time of a function has no meaning since many operations are async.
To be honest, any objective accomplished by passing a callback can also be accomplished by calling a function on the remote end with the required information.
Keeping them always in memory can actually be a huge problem... allowCallbacks boolean anyone?
Looks like another case where simplicity simply cannot outrule security, especially if you are going to target for corporations.
I suspected that. Given closures can hold rather big amount of memory and inaccurate use of them can choke now.js, I personally wouldn't allow functions in arbitrary objects put on wire.
Expecting to be understood right, --Vladimir
@tommedema: "just calling a function on the remote end" would require to synchronize the map of calls to answers. How do you know that the remote end has changed your state as a result of call you just made. Imagine UDP vs. TCP, JSON-RPC with id key and alike.
Forgive me if this is uninformed, but I'm confused about why there are security and memory issues.
In regards to memory issues, why would it have a risk of DOS the server? I don't understand how with now.js and support for nesting makes DOS'ing any easier compared to say a lot of AJAX requests with nested data or simply any other type of DOS'ing attack.
In regards to security issues, why would there be an issue with allowing callbacks specified by the client to be triggered by the server side and run on the client side? The security issue would be if that callback is run on the server side, which is isn't.
On another note, wouldn't it be as simple as something like this:
var secureCallbacks = function(obj) {
// Check
if ( typeof item !== 'object' ) {
return item;
}
// Cycle
for ( var key in obj ) {
// Check
if ( obj.hasOwnProperty(key) ) {
return true;
}
// Prepare
var item = obj[key];
// Type
switch ( typeof item ) {
case 'object':
secureCallbacks(item);
break
case 'function':
// Function
if ( !(item instanceof RegExp) ) {
obj[key] = function(){
// Server Side
if ( now.isServerSide() ) {
// Security Breach
throw Error('Client Side created functions are not allowed to run on the Server Side');
}
// Client Side
else {
// Trigger Original Function
item.apply(obj,Array.prototype.slice.call(arguments));
}
};
}
break;
default:
break;
}
}
}
Then run that through your serialiser, then on server-side run it through your deserialiser.
A brief analysis reveals the code trapped into the famous JS gotchas: i) typeof null === 'object'
, ii) obj.hasOwnProperty
can be missing/overloaded, iii) obj
can be sealed/frozen
I suppose a bunch of AJAX/vanilla requests is of lower risk, since they are short-living and properly coded server house-keeps after each of them (plus one can throttle the rate by well-known means). WebSocket connection, once forged or blocked, holds the server memory forever, I guess.
@dvv,
For example, if a client calls registerNewAccount(name) on the server (an idiotic example but hey) and it'd have given a callback which is then called whenever the registration process finished, eg: callback(false, "Invalid name."); in case it was false. Instead of using that callback, the client could create a method: now.onRegistrationComplete(success, err) so that the server can simply call this method and does not need to know of any callbacks.
@balupton, the memory issue is what I mean with security vulnerability. It is different from normal AJAX calls in that now.js actually stores things in memory, which is entirely in control of the client. Thus, a client can cause an excessive amount of data to be stored in memory. This is a vulnerability.
All in all, this should be it, serialise, secure and deserialise: https://gist.github.com/940934
@tommedema: i hear you. Now let's image the client calls registerNewAccount(name)
ten times with a high rate. How do you rock-solid bind the calls to client's now.onRegistrationComplete(success, err)
to each of the initiated requests?
@balupton: mind to share a gist, so that we can discuss particular line numbers?
The actual function never has to be serialized since it can only be called on the same side at which it is created. A mere reference has to be serialized.
But since one cannot know when the callback has to be called I do not think this should be a feature at all.
@dvv, I guess you'd have to add an identifier, eg. now.onRegistrationComplete(name, success, err)
Gisted :) https://gist.github.com/940934
L15 -- do you mean item
? if you mean obj
, obj === null
is still allowed
L29 -- I'm sure you are aware and just forgot that dots is pretty valid character to have in hash keys ;)
How about we move to #nowjs on freenode :) It's becoming hard to track all the response :S
Alright.
So I've updated https://gist.github.com/940934
It supports:
this.complete()
within anonymous functionsSo really it solves the nested functions/regexs issue, and the garbage collection issue.
Any objections?
Apparently socket.io now supports these callback features as well: http://socket.io/index.html#announcement
How does socket.io handle the GC problem?
a hash of callbacks purged after callback has fired. no timeouts for dead callbacks so far
@dvv, thanks, I brought this up to their ML.
rechecked and saw at server there's no purging so far. we'd like the code to settle down a little
@dvv, what do you mean exactly? Callbacks are never GCed?
@tommedema @dvv @balupton
In 0.7 callbacks are GC'ed after a user configurable timeout (default is 30 seconds).
We decided not to enable nested functions because of the performance impact an the rare use case.
When a function is sent inside an object as an argument to the client it gets stripped away.
A use case is when clientWrite is false, it would be nice to send a object to the server that contains anything we would like to bind.