Support code generators

kubukoz commented 2 years ago

Is your feature request related to a problem? Please describe.

Build tools generally provide ways to generate source files before compilation. scala-cli doesn't, although it was previously mentioned that it potentially could.

Describe the solution you'd like

Some way to run an arbitrary script before every compilation, probably configured in directives or by convention (e.g. putting a script in a ./scala-scripts directory).

The interface would be basically Unit => Unit, maybe with extra context passed as environment variables.

Some possible things I'd like to see passed:

a way to gather all the sources used in the build (e.g. :-separated list of source paths)
the path to a directory where I can output files that'll be included in the compilation. Alternatively I could write to anywhere I want and these paths would be added manually in a directive. Not sure how this would work with cross-compilation.
pwd (or something like a workspace root, in case of bsp? I don't know how exactly that protocol works though)
build metadata, e.g. deps, resolved scala version, target platform (in theory I could read the directives myself, but having this passed would be much better)
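
To make the Unit => Unit idea a bit more concrete, here is a minimal sketch of what such a hook could look like if the context arrived as environment variables. The variable names SCALA_CLI_SOURCES and SCALA_CLI_GENERATED_DIR, and the object name, are made up for illustration; scala-cli does not define them.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path, Paths}

object PreCompileHook {
  // Hypothetical variable names: nothing in scala-cli defines these today.
  val SourcesVar = "SCALA_CLI_SOURCES"
  val OutputDirVar = "SCALA_CLI_GENERATED_DIR"

  /** Splits a ':'-separated list of source paths, ignoring empty entries. */
  def parseSources(raw: String): List[Path] =
    raw.split(':').toList.filter(_.nonEmpty).map(Paths.get(_))

  /** Writes one generated file listing the sources the hook was given. */
  def run(env: Map[String, String]): Path = {
    val sources = parseSources(env.getOrElse(SourcesVar, ""))
    val outDir = Paths.get(env.getOrElse(OutputDirVar, "generated"))
    Files.createDirectories(outDir)
    val body = sources.map(p => "\"" + p.toString + "\"").mkString(", ")
    val code =
      "object GeneratedSources {\n  val all: List[String] = List(" + body + ")\n}\n"
    Files.write(
      outDir.resolve("GeneratedSources.scala"),
      code.getBytes(StandardCharsets.UTF_8)
    )
  }

  // The hook itself: no arguments, no result, everything comes from the environment.
  def main(args: Array[String]): Unit = {
    run(sys.env)
    ()
  }
}
```

Taking the environment as a plain Map keeps the logic testable without touching the real process environment. A ':' separator would clash with Windows drive letters, so a real design would probably need something else there.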

Note that this would be triggered on IDE compilations as well.

There should be a way to customize the script beyond the runnable name. Ideas:

Passing args within the directive, e.g. //> build script echo foo bar
Wrapping the script with another, and using the other script in the directive. E.g. script echow runs echo foo bar, and I do //> build script echow
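
For the first idea, a rough sketch of how a line like the one above could be split into a command and its arguments. The //> build script syntax is only the one floated in this issue, not an existing directive, and BuildScriptDirective is a hypothetical name.

```scala
object BuildScriptDirective {
  // Hypothetical directive prefix taken from the example in this issue.
  private val Prefix = "//> build script"

  /** Returns the command followed by its arguments, or None if the line is not this directive. */
  def parse(line: String): Option[List[String]] = {
    val trimmed = line.trim
    if (!trimmed.startsWith(Prefix)) None
    else {
      val rest = trimmed.drop(Prefix.length)
      // Require whitespace after the prefix so "//> build scripts" does not match.
      if (rest.nonEmpty && !rest.head.isWhitespace) None
      else {
        val parts = rest.trim.split("\\s+").toList.filter(_.nonEmpty)
        if (parts.isEmpty) None else Some(parts)
      }
    }
  }
}
```

This splits on whitespace only, so quoted arguments with spaces would need more work; the wrapper-script idea sidesteps that problem entirely.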

Unfortunately, any kind of scripts would mean that builds can't be shared via e.g. gists. Personally I'm fine with this.

Describe alternatives you've considered

Making my own build tool wrapping scala-cli :)

Additional context

Not much to write here.

ckipp01 commented 1 year ago

Just to tie these together: I just had a use case where something like this would have been super useful. I wrote about it here, but to reiterate, I essentially needed a BuildInfo that held the version of my app so that it could be displayed to users. Since I use a Makefile for the project, I was able to make sure that any command ran a script before running the actual command to compile. This works fine until you get out of the context of the Makefile. For example, I just realized that Scala Steward can no longer run on my project because of this. Having something like the approach outlined by @kubukoz would really help in situations like this.
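
For illustration, the kind of script described above might boil down to something like the following sketch. It only renders the source text of a BuildInfo object; the package name myapp and the object name BuildInfoGen are placeholders, not the actual project's code.

```scala
object BuildInfoGen {
  /** Renders the source of a BuildInfo object exposing the given version. */
  def render(version: String, pkg: String = "myapp"): String = {
    // Escape backslashes and quotes so the version is a valid string literal.
    val escaped = version.replace("\\", "\\\\").replace("\"", "\\\"")
    s"""package $pkg
       |
       |object BuildInfo {
       |  val version: String = "$escaped"
       |}
       |""".stripMargin
  }
}
```

A pre-compilation hook would write the result of render into a source directory that scala-cli compiles, which is the step the Makefile has to perform by hand today.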

slabuz commented 1 year ago

Hi @kubukoz

After a little cooperation with the scala-cli team, I've come up with a proposal on how to define code generators. You can find it here https://gist.github.com/slabuz/b66432d9c71dd100d193617754c79911. In my proposal, I focused on providing a code generator for protobuf. Let me extend it with a few words of commentary.

Where do generators come from? The idea is to include some of the popular generators in scala-cli and keep extending the library. In the final version, users will be able to provide new generators locally or from gist.

How are they written? They can be written using any version of scala, not necessarily the same as the main code version. They can use external dependencies, just like normal scala-cli code.

When do they run? The code generation step will be an obligatory part of the compilation process, run before it to make sure that all generated sources are in place and up to date. In addition, code generation can be triggered automatically when using code editors such as IntelliJ. For those writing code without such tools, a new step is added, scala-cli generate, as shown in the example.

How is the code structured? At this point, we have identified 2 main aspects of the generator API. The first is a way for the generator to describe itself, giving the most useful data. In my example it's just a JSON, but in the end there will be a case class definition that every generator will need to instantiate and return. The second part of the API is an actual method to generate source code, given the source file and output location.
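
As a rough sketch of those two parts (not the actual API from the gist), the shape could be something like the following. GeneratorInfo, SourceGenerator, and the toy ConstantsGenerator are hypothetical names, and the fields of the case class are guesses at what "the most useful data" might include.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}
import scala.jdk.CollectionConverters._

// Part 1: how a generator describes itself (fields are assumptions).
final case class GeneratorInfo(name: String, inputExtensions: Set[String])

// Part 2: the method that turns one source file into generated sources.
trait SourceGenerator {
  def info: GeneratorInfo
  def generate(input: Path, outputDir: Path): List[Path]
}

// Toy generator: turns "key=value" lines into vals on a Constants object.
object ConstantsGenerator extends SourceGenerator {
  val info: GeneratorInfo = GeneratorInfo("constants", Set("properties"))

  def generate(input: Path, outputDir: Path): List[Path] = {
    val lines = Files.readAllLines(input, StandardCharsets.UTF_8).asScala.toList
    val fields = lines.map(_.trim).filter(_.contains("=")).map { line =>
      val idx = line.indexOf('=')
      val key = line.take(idx).trim
      val value = line.drop(idx + 1).trim
      "  val " + key + ": String = \"" + value + "\""
    }
    val code = (List("object Constants {") ++ fields ++ List("}")).mkString("\n") + "\n"
    Files.createDirectories(outputDir)
    val out = outputDir.resolve("Constants.scala")
    Files.write(out, code.getBytes(StandardCharsets.UTF_8))
    List(out)
  }
}
```

Returning the list of written paths is one possible way for the tool to learn what was generated; the proposal itself may settle on something different.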

We are open to any constructive criticism and suggestions on how to make this solution even better ;)

bishabosha commented 1 year ago

Note that bloop also has integrated support for execution of source generators - and it is aware of project dependencies and is cached: https://github.com/scalacenter/bloop/pull/1774, https://github.com/scalacenter/bloop/pull/1819, https://github.com/scalacenter/bloop/pull/1784

tgodzik commented 1 year ago

Note that bloop also has integrated support for execution of source generators - and it is aware of project dependencies and is cached: scalacenter/bloop#1774, scalacenter/bloop#1819, scalacenter/bloop#1784

Yes, the plan would be to use that.

kubukoz commented 1 year ago

The plan looks great, would love to see it :) let me know once I can try integrating https://github.com/disneystreaming/smithy4s/

przemek-pokrywka commented 1 year ago

I think that code generation is clearly beyond the ideal scope of scala-cli, because it makes it very difficult to define a clear feature set of the tool. Clear definitions are essential for anyone who comes to learn about stuff. An ocean of idiosyncrasies is the worst thing to confront.

Maybe if the tool allows for a hook, like in Cargo (https://web.mit.edu/rust-lang_v1.25/arch/amd64_ubuntu1404/share/doc/rust/html/cargo/reference/build-scripts.html#build-scripts - the script would need to be Scala to exclude OS differences etc) then the damage could be contained. But, again, what is the new clear definition of the scala-cli? How do you explain it briefly to newcomers / other-lang-refugees?

przemek-pokrywka commented 1 year ago

To add some constructiveness to the criticism above, in my opinion it would be good to make scala-cli a well-behaving component of arbitrary systems larger than itself. I'm thinking of things like Bazel or Nix.

Luka-J9 commented 1 year ago

I'd love to see this feature. I would look at how Rust/Bleep designed their solution. Bleep is especially interesting because it has some interop with sbt plugins. From a design perspective I also like how the configuration is handled, as having a one-liner with lots of configuration (what I understand the current proposal to be) can end up being cumbersome. However, I also understand that file formats like yaml/toml would be a larger departure from how scala-cli currently functions (although it might be worth revisiting in this light?)

I disagree with the notion that this is outside the ideal scope of scala-cli, to me it seems like a natural progression. And wedging it into a larger system like Nix or Bazel raises the barrier to entry for newcomers in an unnecessary way in my opinion.

Currently scala-cli has the concept of exporting to Mill/Sbt for when build requirements become sufficiently complex that scala-cli is no longer the appropriate tool. Adding code generation would allow users to stay on scala-cli for longer before needing to resort to such an option. While ejecting into a different build tool is a fine option for those of us who are familiar with sbt or mill, from the perspective of a newcomer I think it would be frustrating to have to learn a completely different tool to achieve functionality like "I want to generate code from my protobuf" or "I want to access BuildInfo."

He-Pin commented 10 months ago

Will it support protobuf code generation?

tgodzik commented 9 months ago

Yes, this is the intention and a basic feature we want to support.

bishabosha commented 5 months ago

I would like to propose this as a GSOC project under Scala org, if anyone wants to object

Edit: It is now being worked on by Rizky Maulana @Perklone

przemek-pokrywka commented 1 month ago

Hi, seeing the "manual code-gen directive" in action changed my mind as it's much better to support the popular use cases in a standard way rather than forcing users to hack their workarounds.

It would be indeed very helpful to have the ability to depend on code that would be generated in the process of building the script/app.

The main question would be how to implement it in a sound, pragmatic, and ergonomic way. If we tried to formalize @WojciechMazur's proto-directive (naming mine), it might look like this:

//> generate --channel https://disneystreaming.github.io/coursier.json smithy4s generate --dependencies com.disneystreaming.smithy:aws-dynamodb-spec:2023.02.10 -o ./handlers/wildrides

so (provided the code generator exists somewhere as a binary) the main Scala-CLI script could even stay stand-alone / as a gist, easily copy-and-paste-able wherever necessary.

If we wanted the code generation to be sound, I'd propose to treat the generator's output directory as a dependency (writable by the generator only).

There are multiple questions about the interface exposed by the generator to Scala-CLI. For example, how would Scala-CLI know what the output directory is?

VirtusLab / scala-cli

Support code generators #610