fholm / IronJS

IronJS - A JavaScript implementation for .NET
http://ironjs.wordpress.com
Apache License 2.0

Seamless .NET integration syntax #28

Open fholm opened 13 years ago

fholm commented 13 years ago

The syntax for consuming .NET namespaces and classes inside of JavaScript source code needs to be picked.

Current Suggestion

We support both early (compile-time) and late (runtime) binding for .NET integration. Early binding gives major performance benefits but requires a syntax extension to the JavaScript language. Late binding requires no extension and is more flexible, since .NET types can be treated as JS objects, but it is slower.

The current plan is to implement the late bound method first, because it requires no changes to the compiler.

Early bound

Early bound extensions introduce a new compiler switch, in the same way that ES5 introduced "use strict";. This forces users to make a conscious decision about using early bound CLR integration, since it introduces a new contextual keyword called clr, which is put in front of the new keyword to allow the compiler to do an early bound call to the class constructor.

"use clr-early";

import System
import System.Collections.Generic

var lst = clr new List<String>()
var clr = true; // clr is contextual, just a keyword in front of "new"

Late bound

Late bound makes use of the CommonJS-esque require function, which creates a native JS object that wraps the imported namespace and exposes its types as properties on this object.

var system = require("clr ..."); // whatever syntax we end up using there
var collections = require("clr ...");

var lst = new collections.List([system.String], [])

slide commented 13 years ago

why not just use the require syntax?

var System = require('System');

System.Console.WriteLine('foo');

otac0n commented 13 years ago

We could optionally just include System in the global namespace, and support the new keyword as-is for non-generics.

We could then use a different symbol for type parameters (maybe backticks?) so that we don't have a collision with legitimate ES3 code.

sgoguen commented 13 years ago

Why not use the with keyword in conjunction with an import function? Everybody loves the with keyword! :)

fholm commented 13 years ago

@slide The problem with just the specific require keyword is that it requires different semantics depending on if we're importing a commonjs-module (node.js-style) or a .NET namespace.

The reason I want to use a new keyword is that it would allow a lot of the logic to happen at compile time instead of runtime. Using "clr import" and "clr new" allows us to resolve all the classes and imports at compile time instead of doing it dynamically at runtime.

slide commented 13 years ago

@fholm I agree with the speed benefit of a new syntax, but "clr import" and "clr new" require more typing every time you want to do something with the .NET framework classes. "Seamless" to me means that you don't need to worry about whether you are dealing with a JavaScript class, or a .NET class, they should both work the same way.

sgoguen commented 13 years ago

If I could use the plain old new keyword to create a List after importing System.Collections.Generic with "clr import", I would be in favor of the clr import. Maybe you can call it "clr-import" or "clrimport" so clr doesn't become a keyword.

ChaosPandion commented 13 years ago

What if, when you import a generic object, you generate a proxy object with a special override of member access syntax?

var ls = new List[System.String]();

The compiler, which would know that the name 'List' is currently mapped to a generic object, would resolve the type name specified.

fholm commented 13 years ago

@slide I agree that forcing the "clr" keyword is not seamless, but the benefits would outweigh that drawback. And I also like having a clear distinction of when you're using CLR vs. JavaScript objects/functions.

@sgoguen I would prefer to have a clr new keyword also because it allows a whole bunch of compile time optimizations. Or maybe "clrnew" and "clropen" instead of "clr new" and "clr import/open"

@ChaosPandion That will not work because there is no way to reliably determine the value of a JavaScript object until runtime. Take this example:

import System.Collections.Generic
eval("List = function() { ... }");
var ls = new List[System.String]();

It's impossible to guarantee that List will be a CLR List at compile time because new can refer to both a CLR object and a JS object, while if you were to use clr new or clrnew it would give the CLR functions their own "namespace" at compile time.
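The rebinding problem can be reproduced in plain JavaScript without any CLR involvement. The following is only a sketch: `ClrListStandIn` is a hypothetical placeholder for whatever an import would bind to `List`, not an IronJS API.

```javascript
// Hypothetical stand-in for a CLR-backed constructor bound by an import.
function ClrListStandIn() {
  this.kind = "clr";
}

var List = ClrListStandIn;

// The source text passed to eval only exists at runtime, so a compiler
// cannot see this assignment when it processes `new List` further down.
var source = "List = function () { this.kind = 'js'; }";
eval(source);

var ls = new List();
console.log(ls.kind); // "js"
```

By the time `new List()` runs, the name refers to an ordinary JavaScript function, which is why a plain `new` cannot be early bound safely.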

jhugard commented 13 years ago

Ecma-262 v5 reserves the "import" keyword. I'd suggest sticking close to the language designer's future intentions to avoid compatibility issues down the road. I'm not suggesting Microsoft did a perfect job with JScript.NET, but since they worked and continue to work with the ECMA, and had a staff of people working on this full time, it might be worth leveraging the thought they put into the problems of CLR integration.

http://msdn.microsoft.com/en-us/library/eydzeybh(v=VS.71).aspx

jhugard commented 13 years ago

Sounds like you are looking for an early binding (compile time) optimization for .NET types. How about "clrnew", but also use "import" per the reserved keyword? However, it might be a good idea to also support late binding to CLR types via a vanilla "new" on the same types supported by "clrnew".

slide commented 13 years ago

Would it be possible to also support runtime binding with the normal ways (require, new, etc)?

fholm commented 13 years ago

The problem with late binding and the new keyword is the parsing problem I highlighted with ambiguity on parsing new List

ChaosPandion commented 13 years ago

@fholm - Hmm, I may not have described my idea appropriately. What I meant to say was that when you import a generic type, the identifier List will resolve to a special runtime type that acts as a constructor factory, for lack of a better term.

The following statement:

 var is = new List[System.Int32]()

would be transformed to this:

 var is = new List["System.Int32"]()

The expression:

List["System.Int32"]

would resolve to a cached constructor object.
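A rough sketch of such a constructor factory, starting from the already transformed form with a string key. It assumes a modern engine with `Proxy` and `Map`; `makeGenericFactory` and the `typeName` property are hypothetical names for illustration, not part of IronJS.

```javascript
// Hypothetical factory: property access with a type name returns a
// constructor for that closed generic type, built once and then cached.
function makeGenericFactory(genericName) {
  const cache = new Map();
  return new Proxy({}, {
    get(_target, typeArg) {
      if (typeof typeArg !== "string") return undefined; // ignore symbol keys
      if (!cache.has(typeArg)) {
        const ctor = function (...args) {
          this.typeName = genericName + "<" + typeArg + ">";
          this.args = args;
        };
        cache.set(typeArg, ctor);
      }
      return cache.get(typeArg);
    },
  });
}

const List = makeGenericFactory("List");

const ints = new List["System.Int32"]();
console.log(ints.typeName); // "List<System.Int32>"
console.log(List["System.Int32"] === List["System.Int32"]); // true
```

Because `new List["System.Int32"]()` applies `new` to the member access result, the cached constructor is what actually gets instantiated.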

otac0n commented 13 years ago

@jhugard I don't think that it is a good idea to "pollute" the import keyword with our own interpretation. If future versions of ECMAScript define it differently than we do, we could be in a world of hurt, trying to update IronJS without breaking our users' scripts.

otac0n commented 13 years ago

I guess that my vote is for the contextual version of the "clr" keyword:

valid: var foo = clr new System.StringBuilder()

also valid: var clr = true;

jhugard commented 13 years ago

@otac0n I see your point, but if the official spec introduces a similar feature it would result in either a dual implementation or a breaking change anyway. Personally, I'd rather see a breaking change and deal with it than have divergent languages.

IMHO, import as described in the MS JScript.NET spec is very natural and unlikely to be specified differently than "import namespace". Plus, JScript.NET provides an existing implementation of this feature, giving it some weight.

With regard to clrnew, it is introducing a new language feature: early binding. I'm all in favor of this as a performance enhancement and see the benefits, but if it is the only way to cons clr objects it means my code will need to be aware of what kind of object it is dealing with and I won't be able to replace one with the other transparently. Given the choice, my preference here would still be to treat clr and jscript constructors as the same, even if late bound, and consider clrnew as an (optional) enhancement.

Could the parsing issue be resolved by disallowing less-than while parsing a new statement? I.e., disallow "new Number< 5", but instead require "(new Number) < 5"? Or, by requiring a space for less-than and no space for type, making "new Number < 5" and "new List<System.Int32>" legal, but not "new Number<5" and "new List <System.Int32>"? C++ solved a very similar problem in this way when templates were introduced.
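For reference, this is how an unmodified JavaScript engine reads those expressions today. The snippet is plain JavaScript; the `List` function is just a local placeholder, not a CLR type.

```javascript
function List() {} // ordinary JS constructor used as a placeholder

// `new` may omit its argument list, so this is (new Number) < 5, i.e. 0 < 5.
var comparison = new Number < 5;
console.log(comparison); // true

// Read with the existing grammar, this is ((new List) < String) > (1):
// two comparisons, not a generic instantiation.
var looksGeneric = new List<String>(1);
console.log(typeof looksGeneric); // "boolean"
```

Any whitespace rule or restriction on less-than would therefore change the meaning of code that is currently legal, even if such code is rare in practice.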

fholm commented 13 years ago

My current opinion is this: since there are large use cases for both late and early bound CLR integration, this is how I would like to solve it: support both.

For the early bound case we would introduce a "compiler switch" in the same way that strict mode does with "use strict"; so that you have to intentionally turn it on, since it's an extension to the JavaScript syntax.

"use clr-early";

import System
import System.Collections.Generic

var lst = clr new List<String>();

We will use the contextual clr keyword in front of new, allowing the compiler to early bind these calls by only looking in the imported namespaces.

For late binding I would propose we do this: use the CommonJS require function and prefix the require path with clr:, and then do what @jhugard suggested and have it build a JavaScript constructor function for each imported type, something like this:

var system = require("clr:System");
var collections = require("clr:System.Collections.Generic");

var lst = new collections.List([system.String], [])

The first argument to the constructor function is the list of type parameters, if any are needed, and the second argument is the list of constructor arguments that will get passed in. The reason I like this is that it doesn't implicitly import the classes straight into the global namespace, which would make them show up in every .js file running in that execution context. That could easily lead to serious problems if, for example, you require two different namespaces containing the same class name.
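To make the shape of that proposal concrete, here is a minimal mock in plain JavaScript. Everything in it is an assumption for illustration: `requireClr`, the `fakeNamespaces` table and the `clrName`/`clrType` properties are hypothetical, and no real .NET types are loaded.

```javascript
// Hypothetical registry standing in for namespaces exposed by the host.
const fakeNamespaces = {
  "System": ["String", "Int32"],
  "System.Collections.Generic": ["List", "Dictionary"],
};

// Hypothetical loader: takes "clr:<namespace>" and returns a plain object
// whose properties are constructor functions, one per type.
function requireClr(path) {
  const prefix = "clr:";
  if (!path.startsWith(prefix)) {
    throw new Error("not a CLR path: " + path);
  }
  const ns = path.slice(prefix.length);
  const typeNames = fakeNamespaces[ns];
  if (!typeNames) {
    throw new Error("unknown namespace: " + ns);
  }
  const wrapper = {};
  for (const name of typeNames) {
    // First argument: type parameters; second: constructor arguments.
    wrapper[name] = function (typeArgs = [], ctorArgs = []) {
      this.clrType = ns + "." + name;
      this.typeArgs = typeArgs.map((t) => t.clrName);
      this.ctorArgs = ctorArgs;
    };
    wrapper[name].clrName = ns + "." + name;
  }
  return wrapper;
}

const system = requireClr("clr:System");
const collections = requireClr("clr:System.Collections.Generic");

const lst = new collections.List([system.String], []);
console.log(lst.clrType, lst.typeArgs);
```

Nothing is added to the global object here; each script only sees the namespaces it asked for, under the local names it chose.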

slide commented 13 years ago

I really like this approach!

ChaosPandion commented 13 years ago

@fholm - I like your late binding idea. My idea would of course not support multiple type definitions so the array parameter to the special constructor makes sense.

otac0n commented 13 years ago

If we are going to use a URL prefix, I would suggest we go with the precedent set down by Microsoft: clr-namespace:SDKSample;assembly=SDKSampleLibrary

This is for XAML namespaces, but I think that it would work well for our purposes, as well.

slide commented 13 years ago

@otac0n, please no! that syntax is terrible.

otac0n commented 13 years ago

@slide; it is the same as suggested above, except for the "-namespace" addition...

fholm commented 13 years ago

@otac0n: I'm not a huge fan of the ;assembly=Foo thing. My thought was that we would use a method on the context object called .ExposeAssembly('System.Core'); which enables JavaScript access to its namespaces through import and the require functions.

otac0n commented 13 years ago

@fholm: Well, that doesn't let the scripts stand on their own... The hosting application would have to know which assemblies were desired beforehand. Of course, there may be security concerns with arbitrarily loading assemblies, but I think that setting the trust of the AppDomain to low or medium would be enough to allow this to be done safely.

fholm commented 13 years ago

@otac0n: Hm, very good point.

I think this is splitting into two different topics/issues right now, one concerning the syntax and one concerning the security implications of having access to CLR objects from the hosting environment.

But, just regarding the syntax everyone seems to be ok with the two following ones:

Early bound

"use clr-early";

import System
import System.Collections.Generic

var lst = clr new List<String>()

Late bound

var system = require("clr ..."); // whatever syntax we end up using there
var collections = require("clr ...");

var lst = new collections.List([system.String], [])

fholm commented 13 years ago

Updated the issue with the latest proposal.

jhugard commented 13 years ago

How will this impact objects injected into script global namespace from the host application?

We currently have quite a few COM objects we inject. If we move to IronJS, then we would replace those over time with .NET objects. At the moment, ALL of our COM objects are object factories which are called from inside native JavaScript constructors and either returned from that constructor as the "this" object or assigned as a private member to "this". Over time, I expect we would dispense with the factories and migrate the JavaScript wrappers over to .NET as well.

cstrahan commented 13 years ago

@fholm: I can see one potential problem with the proposal for the late bound syntax: ambiguity.

// This works fine for MyList<T>...
var lst = new myCollections.MyList([system.String], []);

// ... but what about a non generic list?
var size = 10;
var lst = new myCollections.MyList(size);

I can't think of any good way to reliably disambiguate those two cases. Is there something that I'm missing?

Perhaps something like the following alternative would work (inspired by IronRuby):

var system = require("clr ..."); // whatever syntax we end up using there
var collections = require("clr ...");

var genericList = new collections.List.of(system.String)(arg1, arg2, ...)
var nonGenericList = new collections.List(arg1, arg2, ...)

EDIT: Yes, I suppose you could provide an empty type parameter list to specify a non generic type, but that would be cumbersome.
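A small sketch of the `.of` idea in plain JavaScript. The `List` function, its `of` helper and `StringType` are hypothetical placeholders rather than real CLR wrappers.

```javascript
// Hypothetical stand-in for a type with generic and non-generic forms.
function List(...args) {
  this.typeArgs = [];
  this.args = args;
}

// `of` closes the generic type and returns a constructor for it.
List.of = function (...typeArgs) {
  return function ClosedList(...args) {
    this.typeArgs = typeArgs;
    this.args = args;
  };
};

const StringType = { name: "System.String" }; // placeholder type object

// Explicit parentheses make it clear which call `new` applies to.
const genericList = new (List.of(StringType))(10);
const nonGenericList = new List(10);

console.log(genericList.typeArgs.length, nonGenericList.typeArgs.length); // 1 0
```

One detail to keep in mind: without the extra parentheses, `new List.of(StringType)(10)` is parsed as `(new List.of(StringType))(10)`, so `new` applies to `of` and the returned function is then called as a plain function. The unparenthesized form from the comment above would only work if whatever `of` returns can be called without `new`.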

otac0n commented 13 years ago

Ah, before we go too far implementing this, I think we should familiarize ourselves with the way that IronPython does types and generic types: http://ironpython.net/documentation/dotnet/dotnet.html

einaregilsson commented 13 years ago

Hi, first time caller, long time listener :)

Why not use the late bound syntax everywhere so the syntax of Javascript doesn't have to be changed, but make the "use clr-early" statement change the behaviour so that built-in .NET types can't be redefined?

Example:

var system = require("clr:System");
var collections = require("clr:System.Collections.Generic");
var lst = new collections.List([system.String], [])
collections.List = 5; //<--Error, you can't redefine List

Actually, this would just require that the Type objects are readonly properties on the imported namespace object.

Another possibility:

var collections = require("clr:System.Collections.Generic");
collections = 5; // <-- Error, you can't reassign a namespace variable (perhaps use the const keyword? Don't know the support in other JS engines)

Or there are other possibilities, but basically the idea is to use the same syntax but make the "use clr-early" switch control whether certain operations are illegal. So in "use clr-early" mode, if you have var collection = require("clr:..."), you know that it will never change and you can early bind to it.

The benefits as I see them:

  1. The standard Javascript syntax doesn't change
  2. You can write one script and then decide which mode to use. So if you've written a long script, and decide to use clr-early you don't have to go and change all your "new" statements, just add the switch.
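The read-only property part of this idea can be sketched with standard property descriptors. This is an illustration in plain JavaScript; the `collections` object is a hypothetical placeholder for an imported namespace, and nothing here depends on a "use clr-early" switch existing.

```javascript
// Hypothetical namespace object, as a CLR import might produce it.
const collections = {};
Object.defineProperty(collections, "List", {
  value: function List() {},
  writable: false,
  configurable: false,
  enumerable: true,
});

function tryRedefine() {
  "use strict"; // strict mode turns the failed write into a TypeError
  try {
    collections.List = 5;
    return "redefined";
  } catch (err) {
    return err instanceof TypeError ? "TypeError" : "other error";
  }
}

console.log(tryRedefine()); // "TypeError"
console.log(typeof collections.List); // "function"
```

In non-strict code the same assignment fails silently instead of throwing, so an engine that wants a hard error in every mode would have to enforce it itself.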

treenewlyn commented 13 years ago

Why not support both?

asbjornu commented 13 years ago

Instead of implementing import before it's specified how the keyword is going to work in ECMA-262, I'd suggest that using a known and proven way to do import in JavaScript is the way to go. Node.js' require() function is predictable, known and not a part of ECMA-262, so its behavior won't change with new editions of the specification.

If we implement require() now, it will continue to work like it's implemented for the foreseeable future, but if we go with import, we might need to change the behavior down the line when it's defined in ES6 or whatever. I don't quite understand why handling require('System') would need to be deferred until runtime.

The implementation will of course be more complicated by having to investigate what's being required, but as long as the string passed to the function is a constant, this can be done in the compiler, no?
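As a rough illustration of spotting constant require calls ahead of time, the sketch below scans source text with a regular expression. It assumes the "clr:" prefix proposed earlier in this thread; `findConstantClrRequires` is a hypothetical helper, and a real compiler would inspect its syntax tree rather than raw text.

```javascript
// Collect CLR namespaces requested through string-literal require calls.
function findConstantClrRequires(source) {
  const pattern = /require\(\s*(["'])clr:([A-Za-z_][\w.]*)\1\s*\)/g;
  const found = [];
  let match;
  while ((match = pattern.exec(source)) !== null) {
    found.push(match[2]);
  }
  return found;
}

const script = [
  'var system = require("clr:System");',
  "var collections = require('clr:System.Collections.Generic');",
  "var dynamic = require(name); // not a constant, stays late bound",
].join("\n");

console.log(findConstantClrRequires(script));
```

This only finds the namespaces; it does not prove that `require` itself or the resulting variables are left untouched at runtime, which is the concern raised earlier about eval.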

crpietschmann commented 13 years ago

I encourage you guys to look at the CommonJS specifications before implementing either "require" or "import" methods.

http://wiki.commonjs.org/wiki/Modules/1.1

I like how they've outlined the "require" method functionality to eliminate scripts injecting into the Global namespace if you don't want them to. Allowing any script file to modify the Global namespace can be dangerous, unless you are fully aware of how it's being modified and that none of the scripts will interfere with each other.

I believe this is the specification that Node.js implements.