eclipse-archived / ceylon

The Ceylon compiler, language module, and command line tools
http://ceylon-lang.org
Apache License 2.0

ceylon assemble #6712

Closed gavinking closed 7 years ago

gavinking commented 8 years ago

Currently ceylon fat-jar is a great assembly tool that lets you run a Ceylon app using java. It has one limitation: it implies --flat-classpath, and doesn't use JBoss Modules for classloader isolation.

We need a similar ceylon assemble command that similarly produces a fat jar, but with "russian doll" packaging, and which uses JBoss Modules for classloading.

This, together with #5955 satisfies the need for "assemblies".

gavinking commented 8 years ago

@quintesse do you think this is something that you might be able to work on, Tako?

quintesse commented 8 years ago

I can certainly try :)

gavinking commented 8 years ago

That would be awesome. Let me know if you have questions about what precisely this should do.

quintesse commented 7 years ago

Btw @gavinking you mention "russian doll packaging", does that mean you envision assemblies containing other assemblies?

I would have thought the embedding of .car files would only be one level deep? That is, we get all the necessary modules, including system modules, and stuff their .car files in a .jar file with some special class loading.

The fact that an assembly would also include all the necessary system modules would make it strange to allow nesting. At least it seems that way to me.

But...

Are we sure we want assemblies to be completely independent artifacts just like fat jars are? It will make it hard to make any kind of composition of assemblies and modules. If they did not include system modules and would still depend on Ceylon being installed on your system they would perhaps be more flexible? And in that case you could probably more easily make an assembly of assemblies.

quintesse commented 7 years ago

Leaving these links here for research purposes:

gavinking commented 7 years ago

Btw @gavinking you mention "russian doll packaging", does that mean you envision assemblies containing other assemblies?

No, all I mean is that you would have a fat jar that contains .car files, and whatever else is needed to actually execute the module we're packaging using JBoss Modules.

I would have thought the embedding of .car files would only be one level deep?

Right, exactly.

gavinking commented 7 years ago

P.S. I don't think we have any need for "assemblies" to be composable. By their very nature, assemblies are top level artifacts.

quintesse commented 7 years ago

Ok, thanks, all clear.

quintesse commented 7 years ago

Ok, I pushed my implementation of this to a branch so people can comment on what I created before merging it with master.

Introduction

So the most important part of this design is the realization that if we're talking about putting a bunch of .car (and .jar) files in a single file and calling it an Assembly then basically assemblies are zipped/jarred repositories.

From that came the idea that if they are "basically zipped repositories" why not make them actual repositories? Meaning that zipped repositories would be first class citizens of the CMR. So that's the first pillar:

Zipped repositories

So Ceylon now supports a new special kind of Assembly Repository that lets it look up modules in a zip or jar file. The syntax for the repository URI is the following:

assembly:path/to/a/zip/or/jar/containing/modules.zip[!subfolder]

So using the helloworld example code from the samples folder in the Ceylon distribution we can now literally do:

$ ceylon compile com.example.helloworld
$ cd modules
$ zip -r ../modules.zip *
$ cd ..
$ rm -rf modules # just to show the module isn't read from the modules folder
$ ceylon run --rep assembly:modules.zip com.example.helloworld
Hello, World!

By default Ceylon searches for modules in the root of the zip/jar file. You can change this behaviour by appending the name of a subfolder to the assembly path, separated by an exclamation mark (!). So to do the same as above a bit more simply we can do:

$ ceylon compile com.example.helloworld
$ zip -r modules.zip modules
$ ceylon run --rep assembly:modules.zip!modules com.example.helloworld
Hello, World!
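
As a rough illustration of the lookup described above, here is a minimal Python sketch (not the actual CMR code). It assumes that the first ! separates the archive path from the subfolder, and that modules inside the archive follow the usual Ceylon repository layout of name/segments/version/name-version.car. The function names split_assembly_uri, module_entry and find_module are made up for this example.

```python
import zipfile


def split_assembly_uri(uri):
    """Split 'assembly:<archive>[!<subfolder>]' into (archive, subfolder)."""
    prefix = "assembly:"
    if not uri.startswith(prefix):
        raise ValueError("not an assembly URI: " + uri)
    rest = uri[len(prefix):]
    # Assumption: everything after the first '!' names the folder inside
    # the archive; without a '!' the archive root is searched.
    archive, _, subfolder = rest.partition("!")
    return archive, subfolder.strip("/")


def module_entry(name, version, subfolder="", ext="car"):
    """Entry name of a module artifact, e.g. a/b/1.0/a.b-1.0.car."""
    parts = [subfolder] if subfolder else []
    parts += name.split(".") + [version, "%s-%s.%s" % (name, version, ext)]
    return "/".join(parts)


def find_module(archive, subfolder, name, version):
    """Return the zip entry holding the module's .car or .jar, or None."""
    with zipfile.ZipFile(archive) as zf:
        names = set(zf.namelist())
    for ext in ("car", "jar"):
        entry = module_entry(name, version, subfolder, ext)
        if entry in names:
            return entry
    return None
```

With the second example above, split_assembly_uri("assembly:modules.zip!modules") yields the archive name and the modules subfolder, and find_module then checks for com/example/helloworld/1.0/com.example.helloworld-1.0.car under that subfolder.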

Of course we're not supposed to go around manually creating our own zip files, but it's certainly an interesting feature to have and it's the basis for the second pillar:

Assemble an assembly

Instead of creating your own zips and perhaps having to figure out which modules to include and which to leave out you can use the new ceylon assemble command that was specifically created for that purpose. In its most basic form it works just like the ceylon fat-jar command:

ceylon assemble <module>

This creates a module-version.cas file in the local directory.

NB: Yes, a CAS file. Where JAR stands for Java ARchive and CAR stands for Ceylon ARchive, CAS stands for Ceylon ASsembly, logically.

This CAS file by default will only contain the module mentioned in the assemble command together with the dependencies explicitly mentioned in its module descriptor and the explicit dependencies of those dependencies etc.
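
The "explicit dependencies of those dependencies" rule amounts to a transitive closure over the import graph. A small Python sketch of that idea follows. The dependency map is a made-up stand-in for what the real tool reads from module descriptors, and assembly_contents is a name invented here.

```python
from collections import deque


def assembly_contents(main, explicit_deps):
    """Modules reachable from `main` via explicit imports, in discovery order.

    `explicit_deps` maps a module name to the modules it imports explicitly.
    Modules missing from the map are treated as having no imports.
    """
    seen = [main]
    queue = deque([main])
    while queue:
        module = queue.popleft()
        for dep in explicit_deps.get(module, ()):
            if dep not in seen:  # include each module only once
                seen.append(dep)
                queue.append(dep)
    return seen


# Hypothetical graph: the app imports two libraries that share a third.
example_deps = {
    "com.example.app": ["lib.a", "lib.b"],
    "lib.a": ["lib.c"],
    "lib.b": ["lib.c"],
}
contents = assembly_contents("com.example.app", example_deps)
```

Anything that is not reachable through explicit imports stays out of the result, which matches the default behaviour described above.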

So using this new command the above example now becomes:

$ ceylon compile com.example.helloworld
$ ceylon assemble com.example.helloworld
$ ceylon run --rep assembly:com.example.helloworld-1.0.cas!modules com.example.helloworld
Hello, World!

Of course, this isn't much better. Luckily, the assemble command isn't just a simple archiver: it doesn't just zip the modules, it also adds some meta-information that the ceylon run command can use to make executing an assembly a lot easier. That brings us to the third pillar:

Running assemblies

The ceylon run command has been improved and knows about assemblies. There's a new --assembly option that's the equivalent of Java's -jar option and it works like this:

ceylon run --assembly <path-to-assembly>

So again using the example above we can now do the following:

$ ceylon compile com.example.helloworld
$ ceylon assemble com.example.helloworld
$ ceylon run --assembly com.example.helloworld-1.0.cas
Hello, World!

Now this is much simpler! The assemble command has stored some important information in the CAS file's MANIFEST.MF telling the run command where to look for the modules and which module is the "main" module that should be executed.
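
To make the idea concrete, here is a hedged Python sketch of how a runner could read such metadata and turn it into the longer --rep form shown earlier. The attribute names X-Assembly-Main-Module and X-Assembly-Repository are placeholders invented for this example. The names that ceylon assemble really writes are not given in this thread.

```python
import zipfile

# Placeholder attribute names, NOT the ones `ceylon assemble` actually writes.
MAIN_ATTR = "X-Assembly-Main-Module"
REPO_ATTR = "X-Assembly-Repository"


def read_main_attributes(archive):
    """Parse the main section of META-INF/MANIFEST.MF into a dict."""
    with zipfile.ZipFile(archive) as zf:
        text = zf.read("META-INF/MANIFEST.MF").decode("utf-8")
    attrs, key = {}, None
    for line in text.splitlines():
        if not line:
            break  # the main section ends at the first blank line
        if line.startswith(" ") and key is not None:
            attrs[key] += line[1:]  # continuation of the previous value
        else:
            key, _, value = line.partition(": ")
            attrs[key] = value
    return attrs


def equivalent_run_args(assembly_path, attrs):
    """Build the explicit command that the metadata stands in for."""
    repo = "assembly:" + assembly_path
    folder = attrs.get(REPO_ATTR, "")
    if folder:
        repo += "!" + folder
    return ["ceylon", "run", "--rep", repo, attrs[MAIN_ATTR]]
```

Under these assumptions, a manifest naming com.example.helloworld as the main module and modules as the repository folder reproduces the ceylon run --rep assembly:...!modules command from the previous section.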

NB: in all the above cases the assembly is run using JBoss Modules with full module isolation. The only difference from the norm is that we have a new kind of repository.

So these are the basics for creating and using some very light-weight assemblies. They are as small as possible and can easily be distributed and used in places where Ceylon is already installed. They could possibly become Ceylon's WAR files, where an application server provides a single Ceylon environment in which multiple application assemblies can be installed.

But what if you want to use them in places where you do not have Ceylon installed? Well we can do that too.

Stand-alone assemblies

There are actually two ways of creating stand-alone assemblies that don't need a local Ceylon installation to run. The first option is very similar to the situation we have when using fat jars:

ceylon assemble --include-language <module>

This creates an assembly that, besides the main module and its explicit dependencies, also includes the Ceylon language module and the minimal set of Ceylon supporting system modules needed to run with only Java installed on the system. Like fat jars, this means running with a flat classpath and with a dynamic meta model. But it works, and it results in an assembly that's only somewhat bigger (~2MB).

$ ceylon compile com.example.helloworld
$ ceylon assemble --include-language com.example.helloworld
$ java -jar com.example.helloworld-1.0.cas
Hello, World!

But what if you want a stand-alone assembly but still want to use the full module isolation that JBoss Modules provides? Well, for that we have a different option:

ceylon assemble --include-runtime <module>

This creates an assembly that, besides the main module and its explicit dependencies, also includes as many of the Ceylon system modules as are needed to run using JBoss Modules. It does result in an assembly that's quite a bit bigger (~7MB).

$ ceylon compile com.example.helloworld
$ ceylon assemble --include-runtime com.example.helloworld
$ java -jar com.example.helloworld-1.0.cas
Hello, World!

Advanced topics

Caveats

There is no support for Maven dependencies. (Things will work when run using ceylon run, but the dependencies won't be part of the assembly itself. Not sure if that's a realistic future expansion.)

Comments?

So what do you guys think? @gavinking @FroMage @tombentley @chochos @davidfestal @bjansen @lucaswerkmeister @jvasileff I think this does everything that was expected for this issue while perhaps adding some additional interesting possibilities. But perhaps you guys can think of some improvements? Maybe you don't agree with some of the decisions I made?

Edit: changed # to !

gavinking commented 7 years ago

Wow, this looks fantastic @quintesse! I love it, and can't wait to try it out.

I have three minor comments:

  1. I did not find the syntax assembly:com.example.helloworld-1.0.cas#modules especially intuitive. In particular, I thought it was sorta weird that the repository would not be in the root directory of the .zip file by default. Furthermore I thought # was a strange choice of separator. Java uses ! for that. Are there some other tools that use #?
  2. I wondered if it would be better to have a separate command, ceylon run-assembly instead of ceylon run --assembly. Perhaps I'm wrong about that, but it's not clear.
  3. Having no support at all for Maven dependencies seems like a pretty major limitation. How hard would it be to remove that limitation? Could we use the same trick, building a local Maven repo into the .zip?

lucaswerkmeister commented 7 years ago

If an assembly is essentially a repo-in-a-file, what about other backends? Could we also use this for JS or Dart programs?

gavinking commented 7 years ago

@lucaswerkmeister Our notion of an "assembly" is now somewhat open ended. We have several assembler tools already, listed here:

https://ceylon-lang.org/documentation/1.3/tour/modules/#assembler_tools

That is, we already consider fat jars, wars, and even Jigsaw mlib folders as "assemblies". But because each of these scenarios has quite different requirements in terms of the actual produced artifact, they're all distinct tools.

Could we also use this for JS or Dart programs?

We already have a ceylon assemble-dart tool, and AFAIK that already does everything that's needed. So I guess it's not useful for Dart.

On the other hand, what would be the use of a zipped up archive full of .js files? Well, run it directly with ceylon run-js, I suppose. I guess that's useful, for the same reason that a zipped up archive full of .cars is useful. I imagine it would not be that difficult for @quintesse to adapt his work to also handle JavaScript assemblies. But I'll let him comment on that.

quintesse commented 7 years ago

@lucaswerkmeister that should definitely be possible, the repository itself is completely backend-neutral (just that right now it's hard-coded to only look at .jar and .car files)

@gavinking:

  1. No special reason whatsoever, it could easily be changed to !
  2. First, because the tool would basically be 99% the same code; second, because I think at some point we want fewer tools, not more (each backend already adds several with compile, compile-js, compile-dart, run, run-js, run-dart, test, test-js, test-dart etc., why add even more?); and finally, because it looked more like java -jar.
  3. I'm not sure if it would be hard. If we could have a way of "auto-importing" them as Ceylon modules it would perhaps not be that hard, but adding a Maven repo to the assembly seems pretty difficult. But perhaps @FroMage might have some ideas, he has a better idea of what the whole Aether stuff supports.

quintesse commented 7 years ago

Oh @gavinking btw, the whole idea of using # (or !) is to be able to specify where the modules get stored. So by default it will look in the root, but if you decided to put them in a folder called foo/bar then you can. This is because by default it's much easier to type zip -r modules.zip modules which means all your modules will be in a modules subfolder. It's just to make it possible and not have to force people into a strict layout.

In the end the whole thing gets hidden behind the tools that will know how to deal with assemblies directly anyway (like the run tool). So in general people will never see nor have to deal with those repository URLs themselves.

jvasileff commented 7 years ago

If an assembly is essentially a repo-in-a-file, what about other backends? Could we also use this for JS or Dart programs?

That sounds nice, although an advantage of assemble-dart is that it creates a standalone Dart program–one that doesn't require Java. It basically packages all module dependencies into a Dart repo.

gavinking commented 7 years ago

Well the thing is that the tool would basically be the same code for 99%,

Do all the options that make sense for ceylon run also make sense for ceylon run --assembly?

I mean, to begin with, ceylon run takes a module name, whereas ceylon run --assembly accepts a file name. So they don't really have the same syntax it seems to me...

adding a Maven repo to the assembly seems pretty difficult

Why exactly? Because Aether can't look inside a zip file? Or something else?

gavinking commented 7 years ago

by default it will look in the root

OK, good, fine.

lucaswerkmeister commented 7 years ago

although an advantage of assemble-dart is that it creates a standalone Dart program–one that doesn't require Java.

That touches another thing I wanted to ask. ceylon fat-jar results in a file that I can just stick into java -jar, without worrying about whether or where Ceylon is installed on the target system. Is the same possible for an assembly – can it also bootstrap itself? Or does it need ceylon run?

quintesse commented 7 years ago

Is the same possible for an assembly

@lucaswerkmeister you didn't read what I wrote above entirely, did you? ;)

lucaswerkmeister commented 7 years ago

Hey, what did you expect when you posted that comment at 3AM and CCed me into it? I diligently read it, and by the time I got to the end, obviously I was so sleepy that I completely forgot that last part :D

Next question, then: What’s the difference between ceylon assemble --include-language and ceylon fat-jar, if they both result in a flat classpath at runtime? Just the layout of the files (matryoshka vs. flat)?

gavinking commented 7 years ago

Next question, then: What’s the difference between ceylon assemble --include-language and ceylon fat-jar, if they both result in a flat classpath at runtime? Just the layout of the files (matryoshka vs. flat)?

fat-jar does not run anything on JBoss Modules.

lucaswerkmeister commented 7 years ago

Neither does ceylon assemble --include-language, according to @quintesse’s post – the JBoss version is --include-runtime.

quintesse commented 7 years ago

Just the layout of the files (matryoshka vs. flat)?

Indeed (well that and the fact that it contains a special class-loader to make it work)

FroMage commented 7 years ago

Really cool. But yeah we should support Maven modules in there. Since it's meant to be entirely self-sufficient with no downloads after the assembly, the only choice we have is packaging a Maven repo in there, and tweak the Aether resolver so that it looks things up in there. If it can deal with non-File repos.

quintesse commented 7 years ago

the only choice we have is packaging a Maven repo in there

Well, or automatically turn them into Ceylon modules, generating a module.properties from the dependency info, right? Or do you think that's not possible?

If it can deal with non-File repos.

It's not really non-file. All "several JARs into one JAR" examples I could find really unpack everything into temp space and then delete that again on JVM exit, so that's what this does too.
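
That unpack-then-clean-up approach can be sketched in a few lines. This is a Python analogue for illustration only, since the real implementation runs on the JVM. The name unpack_nested_archives is invented here, and the choice of suffixes mirrors the .car and .jar files mentioned earlier in the thread.

```python
import atexit
import shutil
import tempfile
import zipfile


def unpack_nested_archives(archive, suffixes=(".car", ".jar")):
    """Extract nested module archives to a temp dir removed at exit.

    Returns (temp_dir, list_of_extracted_paths).
    """
    tmp = tempfile.mkdtemp(prefix="assembly-")
    # Clean up the temp space when the interpreter exits, which plays the
    # role that "delete on JVM exit" has in the description above.
    atexit.register(shutil.rmtree, tmp, ignore_errors=True)
    extracted = []
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.filename.endswith(suffixes):
                extracted.append(zf.extract(info, tmp))
    return tmp, extracted
```

Once the nested archives are ordinary files on disk, any resolver that expects a file-based repository can be pointed at the temporary directory.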

FroMage commented 7 years ago

Well or automatically turn them into Ceylon modules

The semantics under JBoss Modules would change, especially with --auto-export-maven-dependencies and --fully-export-maven-dependencies, which only apply to Maven modules. Also, Maven modules have exclusion rules and auto-selection of latest versions, but I suspect this is something you do when you do the assembly already, no?

It's not really non-file

OK so it surely can do that then.

quintesse commented 7 years ago

but I suspect this is something you do when you do the assembly already, no?

It uses the same Module Graph as the fat-jar command so I assume it does :)

[all the other Maven stuff]

Would it be possible to use the simpler MavenRepository for this? (The one you wrote a long time ago that simply works with local folders) or do we specifically need the AetherRepository for everything to work correctly?

For the rest I imagine that it means adding a mvnassembly: repository type if we want to keep the idea of assemblies being a type of zipped repository. It won't be as elegantly simple as it is right now though.

FroMage commented 7 years ago

It uses the same Module Graph as the fat-jar command so I assume it does :)

OK, so it should.

Would it be possible to use the simpler MavenRepository for this?

I guess…

For the rest I imagine that it means adding a mvnassembly: repository type if we want to keep the idea of assemblies being a type of zipped repository

No, what I'd do is to have two folders in the zip: ceylon and maven, no?

Or you turn them into Ceylon modules during assembly, and respect --auto-export-maven-dependencies and --fully-... during assembly time?

quintesse commented 7 years ago

No, what I'd do is to have two folders in the zip: ceylon and maven, no?

Sure, that's possible, but the assembly: repository is supposed to be a Ceylon repository, not a Maven one, that's why I had hoped that we could turn them into Ceylon modules (and add overrides for them). Although perhaps it could be made to work now that I think of it. I'll have to investigate a bit more.

jvasileff commented 7 years ago

Since it's meant to be entirely self-sufficient with no downloads after the assembly...

It would be nice to also have an option to include as little as possible, to minimize the size of standalone assemblies. (Whatever the user doesn't already have cached in ~/.ceylon/repo would of course have to be downloaded on first run.)

Leveraging the download-deps-at-runtime aspect of Ceylon's module system has really helped with the (WIP) VSCode plugin since Microsoft imposes a strict plugin distribution size limit. I am currently requiring that the user install Ceylon separately, but with this, perhaps even that requirement could be removed.

chochos commented 7 years ago

Sounds great!

For JS backend, I guess the format of the cas file includes all JS modules under their respective paths, so for web apps or whatever it's just a matter of unzipping it and it all falls into place. For web apps, the repo endpoint that serves modules should just know about assemblies and treat them like another repo, which would make things a lot easier for web deployment.

quintesse commented 7 years ago

It would be nice to also have an option to include as little as possible

Well, like I explained you can have assemblies that only contain application code, no Ceylon system modules at all. But that of course means the user does need Ceylon installed. But you could of course combine it with ceylonb to install Ceylon automatically. (I've even thought about adding the option to include this bootstrap into the assembly).

The smallest option (so far) that doesn't need the user to have anything installed (besides Java) is the --include-language one, but that's basically the same as running a fat jar.

jvasileff commented 7 years ago

Well, like I explained you can have assemblies that only contain application code

That's not really what I meant. In fact, this wouldn't necessarily even contain application binaries if they were available on Herd. Although I suppose I might be able to just take the output of an --include-language assembly and delete the stuff I want to be downloaded at runtime.

Taking this to the extreme, and a bit off topic, would be a Ceylon runner jar that would let you run Ceylon programs without installing Ceylon. Like:

java -jar ceylonRunner.jar ceylon.formatter/1.3.1
# same as 'ceylon run ceylon.formatter/1.3.1'

The difference between this and a "bare bones" assembly would be that the assembly would 1) know what to run, and 2) possibly embed some dependencies. Edit: and 3) some support for overrides.xml.

Regardless, great work. ceylon assemble is going to be really nice.

quintesse commented 7 years ago

a Ceylon runner jar that would let you run Ceylon programs without installing Ceylon

But isn't this what ceylonb basically does? Except that instead of java -jar ceylonRunner.jar you type ceylonb run.

jvasileff commented 7 years ago

Not totally the same–there are some other "except that"s.

xkr47 commented 7 years ago

Do assemblies supersede ceylonb?

luolong commented 7 years ago

@xkr47 I do not believe they do.

ceylonb is a development-time tool for building, testing and generally interacting with the Ceylon compiler toolset while developing Ceylon projects.

assemble on the other hand, creates a self-sufficient executable binary artifact...

quintesse commented 7 years ago

@xkr47 They're two different things.

ceylonb is an auto-installer for Ceylon, making it very easy to be up-and-running with a project without having Ceylon installed.

Assemblies are basically just zipped repositories. An easy way to distribute applications. You'd still need Ceylon to use them unless you use one of the stand-alone options to make them usable without having Ceylon installed.

The basic (non stand-alone) assemblies could work very well together with ceylonb, one making it easy to distribute the application while the other makes it easy to distribute Ceylon itself.

quintesse commented 7 years ago

Merged this with master now. Will continue working on Maven on the branch.

gavinking commented 7 years ago

Excellent!

quintesse commented 7 years ago

Closing this now and opening a new issue for Maven support.