tindzk / seed

Build tool for Scala projects
https://tindzk.github.io/seed/
Apache License 2.0

Support scalajs-bundler (+ sbt-web-scalajs) workflow #72

Open nafg opened 4 years ago

nafg commented 4 years ago
  1. Allow generating a separate file for libraries vs. application
  2. Add an npmDeps setting
  3. Support setting bloop JS module kind to commonjs so @JSImport works
  4. Generate package.json and webpack config and run webpack (this could be a custom target if there were a way to get all transitive dependencies' npmDeps)
  5. For libraries, create a file like scalajs-bundler does so that npm dependencies can be propagated
  6. JVM app should be able to "depend" on output of linking JS app so seed does it, and get the result in its classpath
nafg commented 4 years ago

Here is my WIP workaround for (3) and (6), as well as https://github.com/tindzk/seed/pull/71:

[project]
# ...
[module.appCommon.jvm]
# ...
resources = ["src/main/resources", "src/main/webapp", "target/resource_managed/main"]

[module.appCommon.target.copyJS]
command = """
cd ..
jq '.project.platform.config.version="0.6.31"' .bloop/appJs.json | sponge .bloop/appJs.json
jq '.project.platform.config.kind="commonjs"' .bloop/appJs.json | sponge .bloop/appJs.json
bloop link appJs
cd -

dir=target/resource_managed/main/public
mkdir -p $dir
cp $BUILD_PATH/appJs.js $dir/
"""

Note this is in a subdirectory due to the issue mentioned here: https://github.com/tindzk/seed/issues/48#issuecomment-564793868
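
For readers without jq and sponge, the two patching steps in the copyJS target can be sketched in Python. This is only an illustration of the workaround above, not something Seed or Bloop provides; the function name patch_bloop_config is made up, and the JSON path project.platform.config is taken from the jq expressions in the snippet.

```python
import json
from pathlib import Path


def patch_bloop_config(path, version="0.6.31", kind="commonjs"):
    """Set the Scala.js version and module kind in a Bloop project file.

    Mirrors the two jq calls in the workaround. Unlike jq, this raises
    KeyError if project.platform.config does not exist yet.
    """
    config_file = Path(path)
    data = json.loads(config_file.read_text())
    platform_config = data["project"]["platform"]["config"]
    platform_config["version"] = version
    platform_config["kind"] = kind
    # Write the patched document back in place, like `sponge` does.
    config_file.write_text(json.dumps(data, indent=4))
    return data
```

It would be called as patch_bloop_config(".bloop/appJs.json") before running bloop link appJs.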

tindzk commented 4 years ago

Thanks for your suggestions! I am not familiar with scalajs-bundler, so I will have to better understand the workflow first.

  1. Is the goal to create separate JavaScript artefacts for your modules and emit them in the CommonJS format?
  2. Interesting idea. Will it have to do anything beyond downloading npm dependencies to node_modules/? Should the npm dependencies be defined on the module or on the project level? Looking at scalajs-bundler, they appear to be using npm/yarn under the hood. I'd rather avoid requiring another software installed on the host machine.
  3. This will be straightforward on the Seed side, but as far as I know CommonJS support is still incomplete in Bloop (see https://github.com/scalacenter/bloop/pull/714).
  4. This sounds like we will need a separate command (seed js). I am not sure yet how this fits into the overall design of Seed since it positions itself as a Scala build tool. Is it an option to run npm as part of a shell script prior to seed run?
  5. What should the file look like?
  6. You can already change the path of the generated JavaScript file using the output setting. In your example, you could set it to target/resource_managed/main/public/appJs.js. It appears that we would also need a setting linkDeps (in analogy to moduleDeps), such that running app:jvm will link app:js. Another solution I was considering is simple compilation pipelines that are passed to the CLI: seed sequential 'link app:js' 'run app:jvm'. Similarly, we could support parallel command execution: seed parallel 'link app:js' 'run app:jvm'.
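
To make the linkDeps idea concrete, here is a minimal sketch of how a tool might decide what to link before running a module. linkDeps is only a proposed setting at this point, so the dictionary shape and the function name link_order are assumptions made for illustration; the sketch is a plain depth-first ordering with cycle detection.

```python
def link_order(module, link_deps):
    """Return the modules to link before running `module`, dependencies first.

    `link_deps` maps a module name to the modules it lists under the
    (hypothetical) linkDeps setting.
    """
    order = []
    visiting = set()
    done = set()

    def visit(name):
        if name in done:
            return
        if name in visiting:
            raise ValueError(f"cycle in linkDeps at {name}")
        visiting.add(name)
        for dep in link_deps.get(name, []):
            visit(dep)
        visiting.discard(name)
        done.add(name)
        order.append(name)

    for dep in link_deps.get(module, []):
        visit(dep)
    return order


# Running app:jvm would first link app:js.
print(link_order("app:jvm", {"app:jvm": ["app:js"]}))
```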

At the moment, there is no distinction between library and application modules in Seed. If I understood you correctly, it would only be of significance for Scala.js modules.

Could you research how other build tools (Mill, Gradle, Fury etc.) implemented the scalajs-bundler workflow?

nafg commented 4 years ago

Thanks for your suggestions! I am not familiar with scalajs-bundler, so I will have to better understand the workflow first.

Sorry about that

  1. Is the goal to create separate JavaScript artefacts for your modules and emit them in the CommonJS format?

The ultimate goal is to be able to consume javascript libraries that are published to npm, write facades that use them, and consume facades from libraries or other modules without having to repeat the npm packages' coordinates.

This means that:

  1. Something has to generate a package.json file and invoke npm (or yarn).
  2. Scala.js has to emit the CommonJS module format so that @JSImport can compile to JavaScript import statements.
  3. Something has to generate a webpack config file and run webpack (or rollup or any module bundler) so that everything can be turned into something the browser can run. Modern browsers can consume modules directly, although I'm not sure how wide support is.
  4. For bonus points, copy scalajs-bundler's optional modes, which make things much faster. In LibraryOnly mode it won't run webpack on the project's code, only the npm modules, so it doesn't have to run every time or on as much code. Shimming the application to run in the browser is easy enough for it to do without using webpack. (There's also LibraryAndApplication, which does the above but then concatenates everything into one file so you don't need 3 script tags.)

  2. Interesting idea. Will it have to do anything beyond downloading npm dependencies to node_modules/? Should the npm dependencies be defined on the module or on the project level? Looking at scalajs-bundler, they appear to be using npm/yarn under the hood. I'd rather avoid requiring another software installed on the host machine.

There isn't really a choice for modern frontend development. I mean I suppose it can't be that hard to scrape packages from the npm registry directly (there's a URL that gives you a tarball), so if you'd rather do it that way I won't complain. ;)

  3. This will be straightforward on the Seed side, but as far as I know CommonJS support is still incomplete in Bloop (see scalacenter/bloop#714).
  4. This sounds like we will need a separate command (seed js). I am not sure yet how this fits into the overall design of Seed since it positions itself as a Scala build tool. Is it an option to run npm as part of a shell script prior to seed run?

Maybe for now we should figure out a way to pass all the build info to a custom target so that this can be implemented outside the tool?

  5. How should the file look like?
unzip -p ~/.coursier/cache/v1/https/jcenter.bintray.com/io/github/nafg/scalajs-facades/react-select_2-4-2_sjs0.6_2.12/0.6.2/react-select_2-4-2_sjs0.6_2.12-0.6.2.jar NPM_DEPENDENCIES 
{"compile-dependencies":[{"react-select":"2.4.2"}],"test-dependencies":[{"react-select":"2.4.2"}],"compile-devDependencies":[],"test-devDependencies":[]}
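
A tool that wants to propagate npm dependencies could read this file from each jar on the classpath and merge the entries into the dependency fields of a package.json. The following is a rough sketch under that assumption: the manifest layout is copied from the output above, while the function names and the choice to fail on conflicting versions are illustrative and not how scalajs-bundler necessarily behaves.

```python
import json
import zipfile


def read_manifest(jar_path):
    """Read NPM_DEPENDENCIES from a jar, like `unzip -p jar NPM_DEPENDENCIES`."""
    with zipfile.ZipFile(jar_path) as jar:
        return json.loads(jar.read("NPM_DEPENDENCIES"))


def merge_npm_dependencies(manifests, scope="compile"):
    """Merge several manifests into package.json-style dependency fields."""
    package = {"dependencies": {}, "devDependencies": {}}
    for manifest in manifests:
        for field in ("dependencies", "devDependencies"):
            # Each entry is a one-element object such as {"react-select": "2.4.2"}.
            for entry in manifest.get(f"{scope}-{field}", []):
                for name, version in entry.items():
                    previous = package[field].setdefault(name, version)
                    if previous != version:
                        raise ValueError(
                            f"conflicting versions for {name}: {previous} vs {version}"
                        )
    return package


manifest = json.loads(
    '{"compile-dependencies":[{"react-select":"2.4.2"}],'
    '"test-dependencies":[{"react-select":"2.4.2"}],'
    '"compile-devDependencies":[],"test-devDependencies":[]}'
)
print(json.dumps(merge_npm_dependencies([manifest]), indent=2))
```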
  6. You can already change the path of the generated JavaScript file using the output setting. In your example, you could set it to target/resource_managed/main/public/appJs.js.

Would it have to start with '..'? Because the default is to put it in build/, no?

It appears that we would also need a setting linkDeps (in analogy to moduleDeps), such that running app:jvm will link app:js. Another solution I was considering is simple compilation pipelines that are passed to the CLI: seed sequential 'link app:js' 'run app:jvm'

That doesn't seem much better than seed link app:js && seed run app:jvm since seed starts so fast (and I imagine will get faster)

Similarly, we could support parallel command execution: seed parallel 'link app:js' 'run app:jvm'

How about making parallel the default, assuming you have a way to know where one subcommand ends and the next starts? If you want sequential execution (sync points), there could be a delimiter.

Or if you don't have a generalized way to know where a subcommand ends, use delimiters for both. For example, . to delimit tasks within the same "stage" and .. to delimit "stages" (gitlab CI uses the term stage to group jobs that run in parallel and separate them into sequential steps).

So you could do e.g. seed build module1 . test module2 .. run module3 and it will do the first two tasks in parallel, wait for both to finish, then run module3
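
As a sketch of how this delimiter syntax could be parsed, the function below groups already-split command-line arguments into stages of tasks. The name parse_stages and the nested-list result are assumptions for illustration; nothing like this exists in Seed at the time of this thread.

```python
def parse_stages(args):
    """Group CLI arguments into stages of tasks.

    "." separates tasks inside one stage (to be run in parallel) and ".."
    separates stages (to be run one after another).
    """
    stages, stage, task = [], [], []
    for arg in list(args) + [".."]:  # a trailing ".." flushes the last stage
        if arg in (".", ".."):
            if task:
                stage.append(task)
                task = []
            if arg == ".." and stage:
                stages.append(stage)
                stage = []
        else:
            task.append(arg)
    return stages


print(parse_stages("build module1 . test module2 .. run module3".split()))
```

Because the delimiters are ordinary arguments, the shell passes them through unchanged and no quoting is needed.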

At the moment, there is no distinction between library and application modules in Seed. If I understood you correctly, it would only be of significance for Scala.js modules.

Could you research how other build tools (Mill, Gradle, Fury etc.) implemented the scalajs-bundler workflow?

Mill I've worked with. There's an old PR for support which is outdated and I had to finish myself (A version of that is here: https://gist.github.com/nafg/6ecce298a0a20f1e4a259cdae5634060). At least it exposes enough to implement it.

Gradle I would guess does not support it and you have to manage your own package.json and webpack config files and run the commands manually. I can't recall reading anything about using gradle with scala.js at all, in fact, although I haven't looked into it.

Fury I would also guess does not support it; first of all, it's barely installable. I have not seen @propensive mention anything about it in any talks I watched; in fact he's said it would not do lots of things or allow plugins. He's a fan of good old Makefiles for things outside the problem of compiling and sharing Scala code. Maybe he can chime in.

nafg commented 4 years ago

I mean I suppose it can't be that hard to scrape packages from the npm registry directly (there's a URL that gives you a tarball), so if you'd rather do it that way I won't complain. ;)

You would have to get transitive dependencies so you'd be writing a dependency manager of course. Someone wanted to do this: https://stackoverflow.com/questions/36002732/java-plugin-for-installing-npm-modules
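
To give a feel for what such a dependency manager involves, here is a deliberately simplified sketch that walks the dependency graph through the public npm registry, which serves per-version metadata (including a dist.tarball URL and a dependencies map) at https://registry.npmjs.org/<name>/<version>. The big assumption is that every dependency is pinned to an exact version; real packages mostly declare semver ranges, which this sketch does not resolve, and scoped package names are not handled either. The function names are made up for illustration.

```python
import json
from urllib.request import urlopen

REGISTRY = "https://registry.npmjs.org"


def fetch_manifest(name, version):
    """Fetch one version's metadata from the registry (exact versions only)."""
    with urlopen(f"{REGISTRY}/{name}/{version}") as response:
        return json.load(response)


def resolve(roots, fetch=fetch_manifest):
    """Collect tarball URLs for `roots` and everything they depend on.

    `roots` maps package names to exact versions. `fetch` can be replaced,
    for example by an offline cache or a test double.
    """
    tarballs = {}
    pending = list(roots.items())
    while pending:
        name, version = pending.pop()
        if (name, version) in tarballs:
            continue  # already visited, also guards against cycles
        manifest = fetch(name, version)
        tarballs[(name, version)] = manifest["dist"]["tarball"]
        pending.extend(manifest.get("dependencies", {}).items())
    return tarballs
```

A call such as resolve({"react-select": "2.4.2"}) would only work as-is if all transitive dependencies were exact versions, which is why range resolution is the hard part of this approach.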

tindzk commented 4 years ago

Thanks for the detailed answer!

  1. Something has to generate a package.json file and invoke npm (or yarn)

Is the package.json file only needed by npm/yarn in your workflow? If so, we could avoid this step if we fetched the packages ourselves.

Generating package.json is problematic since this file is typically created manually and may already exist in the project folder. Furthermore, it is likely we will have to provide ways to set its other fields besides the dependencies.

  2. Scala.js has to emit Common JS module format so that @JSImport can compile to javascript import statements

Does CommonJS work with Bloop already?

  3. Something has to generate a webpack config file and run webpack (or rollup or any module bundler) so that everything can be turned into something the browser can run. Although, modern browsers can consume modules directly. I'm not sure how wide support is though.

  4. For bonus points, copy scalajs-bundler's optional modes which makes things much faster. In LibraryOnly mode it won't run webpack on the project's code, only the npm modules, so it doesn't have to run every time or on as much code. Shimming the application to run in the browser is easy enough for it to do without using webpack. (There's also LibraryAndApplication, which does the above but then concatenates everything into one file so you don't need 3 script tags.)

We should implement this outside of Seed given that all major browsers have been supporting modules for 1-2 years already.

There isn't really a choice for modern frontend development. I mean I suppose it can't be that hard to scrape packages from the npm registry directly (there's a URL that gives you a tarball), so if you'd rather do it that way I won't complain. ;)

If it is only about fetching the packages, we can reimplement the logic. Otherwise, we would have to add npm/yarn to the Docker image which increases the image size and the attack surface.

npmDeps could be a setting made available on all JavaScript modules. When the developer runs seed bloop or seed idea, we would fetch the npm dependencies too.

Generating the NPM_DEPENDENCIES file as in your example should be straightforward to implement. Will this file only be needed during publishing?

Can we get the transitive dependencies of Scala.js façades from the registry API or will we have to read them from the NPM_DEPENDENCIES file?

Maybe for now we should figure out a way to pass all the build info to a custom target so that this can be implemented outside the tool?

Agreed. Generally, we are better off improving abstractions that would allow developers to express custom workflows.

Would it have to start with '..'? Because the default is to put it in build/, no?

No. The default path is build/, but you can specify an arbitrary path in your module configuration. The logic is defined as follows.

That doesn't seem much better than seed link app:js && seed run app:jvm since seed starts so fast (and I imagine will get faster)

As for sequential execution there is little benefit indeed, unless combined with the parallel operator.

How about making parallel the default, assuming you have a way to know where one subcommand ends and the next starts? If you want sequential execution (sync points), there could be a delimiter.

Using operators is a good idea. We could use ; for sequential execution as developers will already be familiar with it from sbt:

seed 'build module1; run module2'

For parallel execution, | and || are often used in process calculi:

seed 'build module1 | test module2; run module3'

| would have a higher precedence than ;.
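
The precedence rule can be illustrated with a small sketch, reading ; as the sequential separator (as in sbt) and | as the parallel one. Splitting on ; first and on | second is what makes | bind tighter. The function names and the run_task callback are assumptions for illustration, and long-running tasks such as --watch would need more than a thread pool that waits for completion.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_pipeline(line):
    """Parse a command line into sequential stages of parallel tasks."""
    stages = []
    for stage in line.split(";"):
        tasks = [task.split() for task in stage.split("|") if task.strip()]
        if tasks:
            stages.append(tasks)
    return stages


def run_pipeline(line, run_task):
    """Run each stage in order; tasks within a stage run concurrently."""
    results = []
    for stage in parse_pipeline(line):
        with ThreadPoolExecutor(max_workers=len(stage)) as pool:
            # Leaving the `with` block waits for all tasks of this stage.
            results.append(list(pool.map(run_task, stage)))
    return results


print(parse_pipeline("build module1 | test module2; run module3"))
```

Here build module1 and test module2 form the first stage, and run module3 starts only after both have finished.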

What I am ultimately hoping to achieve is something along these lines:

seed 'link --watch app:js | run --watch app:jvm'

This is a fairly common use case. Currently, we need two separate Seed instances running in two terminals. A single instance would have a significantly lower memory footprint and reuse resources better, as only one BSP connection is needed.

nafg commented 4 years ago

Is the package.json file only needed by npm/yarn in your workflow? If so, we could avoid this step if we fetched the packages ourselves.

And all its transitive dependencies, yes.

Generating the NPM_DEPENDENCIES file as in your example should be straightforward to implement. Will this file only be needed during publishing?

I think so

Can we get the transitive dependencies of Scala.js façades from the registry API or will we have to read them from the NPM_DEPENDENCIES file?

Scalajs facades are not on npm, so I guess the latter

Using operators is a good idea. We could use ; for sequential execution as developers will already be familiar with it from sbt:

seed 'build module1; run module2'

For parallel execution, | and || are often used in process calculi:

seed 'build module1 | test module2; run module3'

| would have a higher precedence than ;.

I think an operator that doesn't require quoting is better for usability