microsoft / TypeScript

TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
https://www.typescriptlang.org
Apache License 2.0
101.05k stars 12.49k forks source link

Partial modules and output optimization #447

Closed Davidhanson90 closed 2 years ago

Davidhanson90 commented 10 years ago

We have a typescript application and a use a file per class. There are lots of benefits to this the major one being source control conflict reduction.

We compile our app using the --out compiler flag but I think there is room for some serious optimization of the compilers output for modules.

Currently if you have two classes defined in separate files the compile generates the following JavaScript.

var Example;
(function (Example) {
    (function (Nested) {
        (function (Issue) {
            var bar = (function () {
                function bar() {
                }
                return bar;
            })();
            Issue.bar = bar;
        })(Nested.Issue || (Nested.Issue = {}));
        var Issue = Nested.Issue;
    })(Example.Nested || (Example.Nested = {}));
    var Nested = Example.Nested;
})(Example || (Example = {}));

var Example;
(function (Example) {
    (function (Nested) {
        (function (Issue) {
            var foo = (function () {
                function foo() {
                }
                return foo;
            })();
            Issue.foo = foo;
        })(Nested.Issue || (Nested.Issue = {}));
        var Issue = Nested.Issue;
    })(Example.Nested || (Example.Nested = {}));
    var Nested = Example.Nested;
})(Example || (Example = {}));

I suggest a compiler flag that composes all classes/ functions etc under a single module declaration? So the output code would be like the example below. This would dramatically reduce the size of the JavaScript the compiler outputs. I estimate that our app which currently compiles to around 15,000 lines of probably be only be 2000 lines if this were available.

var Example;
(function (Example) {
    (function (Nested) {
        (function (Issue) {
            var bar = (function () {
                function bar() {
                }
                return bar;
            })();
            Issue.bar = bar;
            var foo = (function () {
                function foo() {
                }
                return foo;
            })();
            Issue.foo = foo;
        })(Nested.Issue || (Nested.Issue = {}));
        var Issue = Nested.Issue;
    })(Example.Nested || (Example.Nested = {}));
    var Nested = Example.Nested;
})(Example || (Example = {}));
RyanCavanaugh commented 10 years ago

I estimate that our app which currently compiles to around 15,000 lines of probably be only be 2000 lines if this were available.

I want to make sure I'm understanding the magnitude of your scenario based on these numbers -- if you're saving 10 lines of overhead (a three-level deep module) per input .ts file, that means you have (15,000-2,000) / 10 = 1,300 files. If your output would only be 2,000 lines, that means the average input file is contributing 1.5 lines of code to the output? That seems impossible since even an empty class is 5 lines. What's your breakdown like?

Davidhanson90 commented 10 years ago

So we have about 8 projects each represents a module (I'm talking ui module here not js ones etc)

A ui module groups all its dependencies under a TS module. They follow a strict structure.

Module.Name.category.SubCategory so for one of our modules you would find folders which represent a category and subcategories. Some examples

Modules.Foo.Filters Modules.Foo.Data.Entities Modules.Foo.Data.Services Modules.Foo.Data.Mappers

Each of these folders can have alot of files with small classes or functions.

As I said we have 8 Ui modules being built, these are built by different teams and they share some core libraries as well so

Core.Framework.Categories.Subcategories Core.Presentation.Categories.Subcategories

So overall a consistent hierarchically namespaces app.

But if you add up all the bloat for the module pattern being repeated per file it makes for very fat output.

RamIdeas commented 10 years ago

I once wished for this, but after thinking about it, I realised it would create the possibility of leaked private variables. Consider:

module ns1.ns2 {
    var x = 'used by Class A';
    export class A {
        // ...
    }
}
module ns1.ns2 {
    var x = 'used by Class B';
    export class B {
        // ...
    }
}

By your proposal, merging these two module definitions into one closure for output would share (and overwrite) the variable x between the 2 classes.

var ns1;
(function (ns1) {
    (function (ns2) {
        var x = 'used by Class A';
        var A = (function () {
            function A() {
            }
            return A;
        })();
        ns2.A = A;
        var x = 'used by Class B';
        var B = (function () {
            function B() {
            }
            return B;
        })();
        ns2.B = B;
    })(ns1.ns2 || (ns1.ns2 = {}));
    var ns2 = ns1.ns2;
})(ns1 || (ns1 = {}));

A workaround to allow the modules to be merged is to wrap each initial module's output with an IIFE:

var ns1;
(function (ns1) {
    (function (ns2) {
        (function() {
            var x = 'used by Class A';
            var A = (function () {
                function A() {
                }
                return A;
            })();
            ns2.A = A;
        })();
        (function() {
            var x = 'used by Class B';
            var B = (function () {
                function B() {
                }
                return B;
            })();
            ns2.B = B;
        })();
    })(ns1.ns2 || (ns1.ns2 = {}));
    var ns2 = ns1.ns2;
})(ns1 || (ns1 = {}));

However, I proposed a more concise syntax on codeplex which would create one IIFE per module declaration.

var ns1;
(function (ns1, ns2) {
    var x = 'used by Class A';
    var A = (function () {
        function A() {
        }
        return A;
    })();
    ns2.A = A;
})(ns1 || (ns1 = {}), ns1.ns2 || (ns1.ns2 = {}));

var ns1;
(function (ns1, ns2) {
    var x = 'used by Class B';
    var B = (function () {
        function B() {
        }
        return B;
    })();
    ns2.B = B;
})(ns1 || (ns1 = {}), ns1.ns2 || (ns1.ns2 = {}));

Note: There was a bug with my code on codeplex. For a third namespace, the last line would be:

})(ns1 || (ns1 = {}), ns1.ns2 || (ns1.ns2 = {}), ns1.ns2.ns3 || (ns1.ns2.ns3 = {}));

This could be shrunk further by offering a compiler flag stating that all namespaces have been predefined, manually.

So in one file (which needs to precede all files using modules) I could have:

var ns1;
ns1 = {}
ns1.ns2 = {}
ns1.ns2.ns3 = {}

Then the output of each module could be:

(function (ns1, ns2, ns3) {
    // ...
})(ns1, ns1.ns2, ns1.ns2.ns3);

For what it's worth, I wrote a script that takes a JavaScript file output by tsc.exe and transforms it into cleaner JavaScript. It was originally written in JavaScript, but then I ported it into .NET to use as an IItemTransform. IItemTransforms only get applied when bundling is turned on, which meant minification was turned on. If I removed the minification transform, I could see my neatly generated javascript, even though it was all in one big file. Removing the closures, pointed me towards realigning the indentation of each line to something more reasonable, and made debugging much more pleasant (much more so than sourcemaps which reduce the functionality of Chrome's devtools dramatically).

It would be much more useful to have tsc output a nicer js file. Then I wouldn't have to rely on bundling.

I was, like you, also under the impression that it would at least half the size of my production javascript but after minification, it barely made a difference of a couple kilobytes. This is more an issue of cleaner code (closer on par to the one-to-one TypeScript-to-JavaScript ethos) with a slight efficiency boost.

Davidhanson90 commented 10 years ago

Regarding variable conflicts I think this could be picked up easily by the compiler as either an error or a warning.

I do really like the suggestion for cleaner module pattern using IIEF.

RyanCavanaugh commented 10 years ago

It's not totally clear what should qualify as a conflict, even:

module m {
    var x = { n: 3 };
}
module m {
    module x {
        export var n;
    }
    console.log(x.n);
}

Maybe if x were { m: 3 }, that wouldn't be a conflict?

I don't think this would be worthwhile if, in practice, most merged modules ended up creating a conflict. Seeing the effects on a larger codebase would be a good exercise; performing this optimization post-build with a parser wouldn't be too difficult.

Davidhanson90 commented 10 years ago

I'd argue that should raise a duplicate identifier error.

Perhaps the use of the word "partial" on modules would be a good fit for this. That way existing code bases would be unaffected and we apply a new rule set in the case of partial modules.

charlessolar commented 10 years ago

Another application for a change like this is for merging AMD file namespaces.

I am really disappointed with typescript's performance here - my example is I have a namespace call it Demo.Library with 2 classes, Foo and Bar

Foo and Bar are each their own file but they both export module Demo.Library - what I WANT to happen is for Foo.ts and Bar.ts to be merged into 1 file, call it Library.js so in my app I can require('Library') and get both.

As it stands now, we have to do require('Foo') and require('Bar') and THEN the Demo.Library namespace exists under the import name. See http://www.typescriptlang.org/Handbook#modules-pitfalls-of-modules

What I propose is a way to notify typescript to merge Foo and Bar similarly to the sample on that site which puts the definitions in 1 file. For large scale apps this is definitely a need, I can't put an entire module into 1 file.

danquirk commented 10 years ago

@volak you're just asking for multi file external modules yes?

charlessolar commented 10 years ago

@danquirk Not exactly - because I know external modules are meant to be 1 file in a folder structure. I'm asking for a way to define 1 external module over multiple files.

danquirk commented 10 years ago

yes, that's what I meant, just making sure we're on the same page, see #17

Davidhanson90 commented 9 years ago

I have a good example of where partial would be useful.

https://github.com/Microsoft/TypeScript/blob/master/src/compiler/parser.ts

**\ Comment moved from Partial classes issue as not appropriate for that thread.

seshualluvada commented 8 years ago

It all started when I was trying to define multiple classes under a namespace across multiple files (one class in each file) and I learnt that namespace level variables had to be exported in order to gain accessibility to a namespace level variable in a different file than what it has been defined. Please see #6457 . A big problem with having to export is that developers are forced to expose internal variables to code out side the namespace (just because the namespace is defined across multiple files), which they would never want to do. It basically brings us to choose between either exposing internal namespace variables to the general public or struggle to maintain "all" the name space related code in a single file, where neither of them is a desirable development experience.

Here are my thoughts on one approach to handle this.

  1. Add a compiler option --aggregateNamespaceDefinitions. What this would do is that before the actual compilation process, it would pull out namespace related code from all of the ts files and creates either one single ts file for entire project or multiple ts files based on the toplevel name of the namespace. The compiler option aggregateNamespaceDefinitions can take values "single" or "toplevel" to emit this behavior.
  2. Reorganize the type definitions in the namespace so that the types are defined in the same order as one would have coded if they defined the entire name space in one file.
  3. If there are variable name conflicts / type name conflicts, where the same variable / type is defined in multiple files at the same namespace level, error out the compilation process.
  4. If everything is fine, run the core compiler using these re-authored files than the original files.
  5. If the aggregateNamespaceDefinitions flag is not present, the emit behavior would be the same as it is currently.

This approach would take care of both the javascript bloat where the name space is defined several times in the compiled javascript code when a namespace was defined across multiple files as well as having to export internal module variables and making them visible to code outside the namespace when using namespace level variables across files.

In my line of business, it is primarily enterprise mobile devices that will be consuming javascript code, and so I have to take every step to eliminate the javascript bloat to optimize load performance.

Below is an example of what I was describing.

file1.ts

namespace Parent.Level1.Level2{
    var level2InstanceCount: number = 0;
    class Level2Child1{
        constructor(){
            level2InstanceCount += 1;
        }
    }
}

file2.ts

namespace Parent.Level1.Level2{
    class Level2Child2{
        constructor(){
            level2InstanceCount += 1;
        }
    }
}

file3.ts

namespace Parent.Level1{
    var level1InstanceCount: number = 0;
    class Level1Child1{
        constructor(){
            level1InstanceCount += 1;
        }
    }
}

file4.ts

namespace Parent.Level1{
    class Level1Child2{
        constructor(){
            level1InstanceCount += 1;
        }
    }
}

The re-authored TS file would be like below.

Parent.ts

namespace Parent {
    namespace Level1 {
        var level1InstanceCount: number = 0;
        class Level1Child1 {
            constructor() {
                level1InstanceCount += 1;
            }
        }
        class Level1Child2 {
            constructor() {
                level1InstanceCount += 1;
            }
        }
        namespace Level2 {
            var level2InstanceCount: number = 0;
            class Level2Child1 {
                constructor() {
                    level2InstanceCount += 1;
                }
            }
            class Level2Child2 {
                constructor() {
                    level2InstanceCount += 1;
                }
            }
        }
    }
}

While the re-authoring can be done using a pre-build task too, it would be very neat if the typescript compiler can do that.

gulbanana commented 8 years ago

Would a PR adding this feature be considered by the team? (e.g. is it worthwhile for me to write one)

fis-cz commented 7 years ago

So, how does i look wit review of this PR?

fis-cz commented 7 years ago

From my perspective, the typescript compiler should do combine same namespaces to the same IIFE, even if its splitted over multiple files. Otherwise its not the same namespace. And its confusing as in real, the 'namespace' keyword is not TypeScript namespace, but JavaScript namespace. Moreover, same named modules splitted over multiple files should be also combined to single output js file to be possible to split AMD, CommonJS or SystemJS module to mutiple files.

I know you are trying to keep as close as possible to ES spec and there is no such possibility to split anything over multiple files but it will be really helpful.

Consider we have a project with the following folder structure:

program
   a
      Ns_a1
         ClassA.ts
         ClassB.ts
      Ns_a1.Ns_a12
         ClassA.ts
      module_exports.ts
   b
      Ns_b1
         tools.ts
      module_exports.ts
   blah.ts
   tsconfig.json

a and b are modules, Ns are namespaces. Export file (module_exports.ts) should be possible to be used as is should be simply possible to define and find out what we are exporting out of the module without need of looking to all files in the module folder structure. Of course, export/reexport of the class, function, namespace or whatever else should be possible from any file in the same module scope, but will be messy in big projects.

The compilation output should be a.js , b.js and blah.js according to definition in files:

module_exports.ts

module a {
   // reexport interface and class B form namespace Ns_a1
   export Ns_a1.I;
   export Ns_a1.B;

   // export whatever else from the module
   export function something() {
   }

   export const blah: string = "blah";

   // of course, it should be possible to export the whole namespace
   export Ns_a1;
}

ClassA.ts:

module a {
   namespace Ns_a1 {
      /* partial */ class A { 
         // class code here, class is not exported so not visible out of Ns_a1
         // for partial proposal see ClassB.ts
      }
   }
}

ClassB.ts:

module a {
   namespace Ns_a1 {

      /* partial */ class A {
         // class merge or Duplicate identifier error
         // I would recommend considering of adopting the "partial" keyword from C#
         // if it would be specified, merge, if not, Duplicate identifier error
      }

      // under normal circumstances, i would place I to separate file but just to shorten the post
      export interface I {
         // exported interface so visible inside module a
      }

      export class B extends A implements I {
         // class code here, class is exported so visible as Ns_a1 inside module a
      }
   }
}

tools.ts

// tools does not define module so nothing to be merged with and works as in the current TS, so
// compiled to separate file (or considered as separate file in case of --out)
namespace Ns_a1 {
   // can't be merged as tools.js is separate module by nature (as it works now)
}

I will not write the code for the b folder branch as it is hopefully clear from previously mentioned code.

Please note the module keyword is supposed to be a AMD, UMD, CommonJS or SystemJS module, not a module pattern from Javascript.

RyanCavanaugh commented 2 years ago

Large namespace-based projects are increasingly rare; this isn't an area of future investment for us.

gulbanana commented 2 years ago

That’s a shame. Namespaces still have the major advantage of not needing a bundler; they let you write Typescript for the browser without requiring extra tools to turn it into a single file that the browser can (efficiently) consume. I wish this kind of feature was considered significant, because for my team it’s a major draw of the ecosystem - in particular, not having to use npm in our build process is very valuable.